Getting ready

If the packages are not already installed on your computer, install them first with install.packages(). A minimal call (assuming the packages are available on CRAN under these names) is:
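
install.packages(c("eRm", "iarm", "scales", "DT"))

Then we load the libraries: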

library(eRm)
library(iarm)
library(scales)
library(DT)

Item and person parameters

If the data fit the Rasch model, the item and person parameters can be exported for further use (e.g. item difficulties and item thresholds for computer-adaptive testing; person parameters for any further person-based calculations).

The present script shows the R code for producing various useful tables, graphs, and export functions for the item and person parameters.

Setting the path for output

If we want to save all produced output (e.g. a table of the item or person parameters) in a folder, we can indicate this once. This spares us the need to type the whole path each time we want to save output. We create the folder and save the path to the path_output object below. On a Windows computer, we have to keep in mind to replace the "\" by "/". In this example the copied path was "C:\Users\User\Documents\IARM\CPH".

path_output <- "C:/Users/User/Documents/IARM/CPH/"
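
If the folder does not exist yet, it can also be created from within R (a sketch using base R's dir.create(); showWarnings = FALSE suppresses the warning if the folder already exists):

dir.create(path_output, recursive = TRUE, showWarnings = FALSE)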

Dichotomous Items

We start with the dichotomous items. Before the item parameters are estimated using the Rasch model, we might want to have a look at some useful tables, graphs and functions regarding the sum score of the assessment instrument.

Example 1: The AMTS data

For demonstration the AMTS data are used, which are included in the iarm package. The AMTS (Abbreviated Mental Test Score) consists of ten dichotomous items scored 0 and 1 and is used to identify patients with dementia. One point is given for each correct answer. The included dataset contains the data of 197 respondents and 13 variables: first the three person variables (patient id, agegroup and gender), then the ten AMTS items.

Before estimating the item parameters, it might be useful to have a short look at the data.

We load the amts data from the iarm package using the data() function:

data("amts")

We get a list of the names of the variables using the function names():

names(amts)
 [1] "id"       "agegrp"   "sex"      "age"      "time"     "address" 
 [7] "name"     "year"     "dob"      "month"    "firstww"  "monarch" 
[13] "countbac"

We create an object containing only the items.

items <- amts[, -c(1, 2, 3)]

This way we remove columns 1–3 (the person variables); alternatively we could write

items <- amts[, c(4:13)]

in order to select the items (which are in columns 4 to 13).

To see the number of rows (persons) and number of columns (variables [in this case: items]) we use the dim() function.

dim(items)
[1] 197  10

To show the first six rows of the object items we use the head() function.

head(items)
  age time address name year dob month firstww monarch countbac
1   1    1       1    1    1   1     1       1       1        1
2   0    0       0    0    0   0     0       0       0        1
3   0    1       0    1    0   1     1       1       0        0
4   1    1       1    1    1   1     1       1       1        1
5   1    1       0    1    1   1     1       1       1        0
6   1    1       1    1    1   1     1       1       1        0

If we are interested in having a look at the distribution of the AMTS total score across the 197 respondents, we can, for example, produce a histogram. First, we calculate the sum score across the ten items using the rowSums() function.

score <- rowSums(items)

Then we create a histogram of the sum score using the hist() function.

hist(score)
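
If we want a more polished figure, we can add a title and an axis label (a sketch; the label texts are arbitrary choices):

hist(score, main = "Distribution of the AMTS sum score", xlab = "AMTS sum score")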

Exercise: Select a subsample of items

If we don’t want to analyse all items of a dataset, we can select a subsample of items. For example, we can select items 5, 7, 8, and 9 from the items object.

itemsNEW <- items[, c(5, 7, 8, 9)]
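
Equivalently, we could select the same items by their column names (taken from the names shown above):

itemsNEW <- items[, c("year", "month", "firstww", "monarch")]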

Using the head() function we show the first six rows of the newly created object itemsNEW.

head(itemsNEW)
  year month firstww monarch
1    1     1       1       1
2    0     0       0       0
3    0     1       1       0
4    1     1       1       1
5    1     1       1       1
6    1     1       1       1

Estimation of item parameters (dichotomous)

The RM() function estimates the parameters of a Rasch model for dichotomous item responses using conditional maximum likelihood (CML) estimation. For the items, item easiness parameters are estimated; further down we demonstrate how these can be converted to item difficulty parameters.

We estimate the parameters of a Rasch model for dichotomous item responses using the RM() function from the eRm package.

rmAMTS <- RM(items)
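
To get a quick overview of the fitted model, we can also print a summary (a sketch; summary() on eRm objects reports the parameter estimates together with confidence intervals):

summary(rmAMTS)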

The RM() function returns an object which stores different values such as the log-likelihood value, the parameter estimates and their standard errors. We can get the names of the rmAMTS object using the names() function.

names(rmAMTS)
 [1] "X"           "X01"         "model"       "loglik"      "npar"       
 [6] "iter"        "convergence" "etapar"      "se.eta"      "hessian"    
[11] "betapar"     "se.beta"     "W"           "call"       

In the help page (?RM) we find that the estimated item easiness parameters are stored in betapar. Therefore, to extract and show them for all selected items, we run:

rmAMTS$betapar
     beta age     beta time  beta address     beta name     beta year 
   0.60226398   -0.05320746   -2.00187682    0.60226356   -0.14105077 
     beta dob    beta month  beta firstww  beta monarch beta countbac 
   1.77796324   -0.37712813    0.14901048   -0.18110994   -0.37712813 

Static tables (entry order, difficulty order)

To show the item parameters in a well-arranged table together with the standard error of estimation (SEM) of the items, the cbind() function can be used. As most people are more used to item difficulty than to item easiness parameters, the item parameters are multiplied by -1. If we want, we can round the item parameters using the round() function to obtain a clearer table.

IDiffTable <- cbind(ItemDifficulty = -1 * rmAMTS$betapar, ItemDifficultySE = rmAMTS$se.beta)
round(IDiffTable, 2)
              ItemDifficulty ItemDifficultySE
beta age               -0.60             0.21
beta time               0.05             0.19
beta address            2.00             0.19
beta name              -0.60             0.21
beta year               0.14             0.19
beta dob               -1.78             0.26
beta month              0.38             0.19
beta firstww           -0.15             0.20
beta monarch            0.18             0.19
beta countbac           0.38             0.19

We now have a static table with items in entry order, item difficulties, and SEMs. We can sort this table by item difficulty.

IDiffTableorder <- IDiffTable[order(IDiffTable[, "ItemDifficulty"]), ]
round(IDiffTableorder, 2)
              ItemDifficulty ItemDifficultySE
beta dob               -1.78             0.26
beta age               -0.60             0.21
beta name              -0.60             0.21
beta firstww           -0.15             0.20
beta time               0.05             0.19
beta year               0.14             0.19
beta monarch            0.18             0.19
beta month              0.38             0.19
beta countbac           0.38             0.19
beta address            2.00             0.19

Dynamic table

In the case of an HTML document it might be nicer to have a dynamic table in which the items can be sorted interactively in different ways (e.g. by entry order or item difficulty). If a “static” document, e.g. a Word document, is to be produced, we have to deactivate the following code by putting a ‘#’ in front of the functions.

In order to be able to also sort the data with regard to item entry order, we first have to add a column with entry number to the data frame.

IDiffTableWithEntry <- cbind(entry = 1:10, IDiffTable)

Then, we use the datatable() function from the DT package to insert a dynamic table with rounded values.

datatable(as.data.frame(round(IDiffTableWithEntry, 3)))
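
If we like, we can also add a caption to the dynamic table (a sketch using the caption argument of datatable(); the caption text is an arbitrary choice):

datatable(as.data.frame(round(IDiffTableWithEntry, 3)), caption = "AMTS item difficulties and standard errors")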

Item-person map

A figure commonly presented in Rasch-papers is the person-item map. It shows the distributions of item and person parameters in the same figure and is often used to discuss the targeting of an assessment instrument. We create a person-item map using the plotPImap() function.

plotPImap(rmAMTS, sorted = TRUE)

In the top panel the grey bars form a bar chart showing the distribution of the person locations. Each item has its own line and the item location is shown as a dot. Note that this is different for polytomous items. In the AMTS example above we see that the item dob is quite easy, while the item address is quite difficult.
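
The title and axis labels of the map can be adjusted via additional arguments (a sketch; main and latdim are arguments documented for plotPImap(), and the label texts here are arbitrary):

plotPImap(rmAMTS, sorted = TRUE, main = "Person-Item Map (AMTS)", latdim = "Cognitive ability (logits)")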

Export of item parameters

Sometimes we might want to export the item parameters to another software. If, for example, we want to write the object IDiffTable of item parameters to a CSV file, we use the write.csv() function.

write.csv(IDiffTable, file = paste(path_output, "IDiffTable.csv", sep = ""))
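
If the spreadsheet software expects semicolon-separated values (common with European locale settings), write.csv2() from base R can be used instead (a sketch):

write.csv2(IDiffTable, file = paste(path_output, "IDiffTable.csv", sep = ""))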

Polytomous items

Example 2: The DESC data

Up to now we have used the AMTS data as an exemplary dataset for dichotomous items. Now we want to have a look at polytomous items. For demonstration purposes we use the DESC dataset included in the iarm package. The DESC (Rasch-based depression screening) assesses depression severity with ten items, with higher scores indicating higher depression severity. The DESC items have five response categories (0 = never to 4 = always). The dataset contains data from 799 patients and 14 variables. The first four variables are person factors (ID, indication group, gender, agegroup), followed by the ten DESC items.

Estimation of item parameters

In the case of polytomous items we have to choose between the partial credit model (PCM() function) and the rating scale model (RSM() function). For further explanations regarding both models, please see the chapter on item thresholds (-> xxx).

In the following we use the PCM.

We load the desc2 data from the iarm package (analogously to the AMTS data above) and define an object itemsDESC containing only the items, i.e. without the first four columns (person factors).

data("desc2")
itemsDESC <- desc2[, -c(1:4)]

We estimate a partial credit model using the PCM() function from the eRm package.

pcmDESC <- PCM(itemsDESC)

We compute item-category threshold parameters using the thresholds() function and show item difficulty and thresholds for each item.

itemthresholds <- thresholds(pcmDESC)
itemthresholds

Design Matrix Block 1:
          Location Threshold 1 Threshold 2 Threshold 3 Threshold 4
DESC_2_1   0.65292    -0.40914    -0.24291     1.20347     2.06026
DESC_2_2   0.98857    -0.05231    -0.00417     1.51598     2.49478
DESC_2_3  -0.35516    -2.87772    -1.11056     0.63262     1.93504
DESC_2_4  -0.02759    -2.08191    -0.53246     0.60855     1.89547
DESC_2_5   0.88304     0.22493     0.14526     0.92916     2.23283
DESC_2_6   0.68453    -1.07368     0.10748     1.01861     2.68572
DESC_2_7   0.47966    -0.64090    -0.28742     0.95992     1.88703
DESC_2_8   0.31585    -1.58430    -0.47006     0.90555     2.41223
DESC_2_9  -0.01579    -1.85411    -0.90131     0.45180     2.24046
DESC_2_10  1.75650     1.30477     0.92155     2.20644     2.59322

The function thresholds() returns a list containing three objects: a vector with threshold parameters (threshpar), a vector with standard errors (se.thresh), and a list with location and threshold parameters (threshtable).

Static tables (entry order, difficulty order)

As we cannot export the list with location and threshold parameters directly to Excel, we create an object itemdiff containing the matrix of location and threshold parameters from within the list.

itemdiff <- itemthresholds$threshtable$`1`
itemdiff
             Location Threshold 1  Threshold 2 Threshold 3 Threshold 4
DESC_2_1   0.65291965 -0.40914137 -0.242912433   1.2034742    2.060258
DESC_2_2   0.98857121 -0.05230655 -0.004165513   1.5159751    2.494782
DESC_2_3  -0.35515509 -2.87772262 -1.110562803   0.6326213    1.935044
DESC_2_4  -0.02758824 -2.08190657 -0.532464539   0.6085510    1.895467
DESC_2_5   0.88304446  0.22492597  0.145256684   0.9291611    2.232834
DESC_2_6   0.68453445 -1.07367698  0.107477966   1.0186123    2.685724
DESC_2_7   0.47965903 -0.64089670 -0.287416511   0.9599232    1.887026
DESC_2_8   0.31585465 -1.58430130 -0.470058953   0.9055468    2.412232
DESC_2_9  -0.01578898 -1.85410701 -0.901313981   0.4518011    2.240464
DESC_2_10  1.75649700  1.30477346  0.921550471   2.2064394    2.593225

We can order the items in relation to the item difficulty.

itemdifforder <- itemdiff[order(itemdiff[, "Location"]), ]
round(itemdifforder, 2)
          Location Threshold 1 Threshold 2 Threshold 3 Threshold 4
DESC_2_3     -0.36       -2.88       -1.11        0.63        1.94
DESC_2_4     -0.03       -2.08       -0.53        0.61        1.90
DESC_2_9     -0.02       -1.85       -0.90        0.45        2.24
DESC_2_8      0.32       -1.58       -0.47        0.91        2.41
DESC_2_7      0.48       -0.64       -0.29        0.96        1.89
DESC_2_1      0.65       -0.41       -0.24        1.20        2.06
DESC_2_6      0.68       -1.07        0.11        1.02        2.69
DESC_2_5      0.88        0.22        0.15        0.93        2.23
DESC_2_2      0.99       -0.05        0.00        1.52        2.49
DESC_2_10     1.76        1.30        0.92        2.21        2.59

Dynamic table

As for the dichotomous data, we can also produce a dynamic table for an HTML document, with interactive ordering of the items by item difficulty or entry order.

itemdiffWithEntry <- cbind(Entry = 1:10, itemdiff)

datatable(round(itemdiffWithEntry, 2))

Item-person map

We now want to have a look at the item-person map of the DESC data set. As the DESC has polytomous items, not only the item difficulty (black filled circle) is shown for each item, but also the item thresholds (empty circles). We can specify sorted = TRUE for the items to be shown in order of item difficulties.

plotPImap(pcmDESC, sorted = TRUE)

Export of item parameters to Excel

To export the above-shown data (itemdifforder) as a .csv file to Excel, we again use the write.csv() function.

write.csv(itemdifforder, file = paste(path_output, "ItemDifficultyTable_PCM.csv",
    sep = ""))

Standard error of estimation (SEM) for thresholds

Static table

We create a data frame of the item thresholds and their corresponding SEMs.

SEthreshTable <- data.frame(Entry = 1:40, Beta = itemthresholds$threshpar, BetaSE = itemthresholds$se.thresh)
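
We can have a quick look at the first rows of this table with rounded values:

round(head(SEthreshTable), 2)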

Person parameters

Most of the time, we will also be interested in the person parameters. They are used for conversion tables that show which interval-scale person estimate corresponds to which raw sum score of the scale. This enables everybody to base further evaluations and calculations on the person location estimates. These can be rescaled using any linear transformation if we think that users (e.g. clinicians) would rather work with person estimates ranging from, say, 0 to 100. If we want to conduct any further calculations with the individual person scores, it is advisable to use the individual person estimates. This also enables us to use parametric statistics (e.g. analysis of variance).

Person location estimates can also be used to obtain information regarding individual person fit. We can then investigate the percentage of people with misfitting responses and see at the individual level whether we are allowed to interpret an individual person parameter (since the estimate only makes sense if the person fit is good).

For demonstration purposes we use the AMTS data set which was already used for estimating the item parameters.

Providing conversion tables

We use the person_estimates() function from the iarm package to compute person estimates with maximum likelihood estimation (MLE) and weighted likelihood estimation (WLE) for raw scores and provide a conversion table.

conversionTab <- iarm::person_estimates(rmAMTS)

If we specify properties = TRUE, the function again provides the information on how to convert the sum scores into person estimates, but this time with additional information regarding person SEM and bias; the output is presented separately for MLE and WLE estimation.

ppest <- person_estimates(rmAMTS, properties = TRUE)

We can also use the person.parameter() function from the eRm package to estimate person parameters.

personpar <- eRm::person.parameter(rmAMTS)
personpar

Person Parameters:

Person NA Group: 1 
NA pattern: x x x x x x x x x x 
 Raw Score    Estimate Std.Error
         0 -3.45975738        NA
         1 -2.49406106 1.0972872
         2 -1.59775965 0.8386474
         3 -0.98584021 0.7378874
         4 -0.47873949 0.6925663
         5 -0.01108283 0.6798199
         6  0.45810811 0.6948887
         7  0.97071966 0.7435132
         8  1.59514226 0.8491420
         9  2.51613719 1.1121728
        10  3.50935614        NA

Person NA Group: 2 
NA pattern: x NA x x x x x x x x 
 Raw Score Estimate Std.Error
         2 -1.47619 0.8569951

Providing a table with individual person parameter estimates

Often we need the individual person parameters for further calculations. Most of the person output only contains the data of those respondents who do not have extreme values (the highest or lowest possible score). If we want the person parameters for all persons, we have to use the theta.table element of the person parameter object. Persons with extreme scores get interpolated person estimates. This table contains the person parameter of each person in entry order. Moreover, it is indicated for each person whether the estimate had to be interpolated because of an extreme score, and the NAgroup is indicated, which we already know from above: all patients who gave a valid response to all items belong to NAgroup 1. If patients have missing data, they are assigned to another NAgroup. In the AMTS example there is only one additional NAgroup, with the response pattern of a missing response on item 2 (x NA x x x x x x x x).

personpar <- person.parameter(rmAMTS)
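
We can inspect the first rows of theta.table, which we will work with below:

head(personpar$theta.table)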

Rescaling of person estimates

Sometimes we might want to provide the person estimates in another format. We may apply any linear transformation to the person estimates. In the present example we want to rescale the person parameters such that the minimum value is 0 and the maximum value 100. We use the rescale() function from the scales package to do this. Note that since several rescale() functions exist in R, we specify scales::rescale to make sure that the intended function is used.

pptrans <- personpar$theta.table
pptrans[, "person parameter recoded"] <- scales::rescale(pptrans[, 1], c(0, 100))

Rounding of estimates: same number of decimal places

We now want to round the person parameter and the rescaled person parameter, both to two decimal places.

pptrans[, c(1, 4)] <- round(pptrans[, c(1, 4)], 2)

In order to be able to also sort the data by person entry order, we add a column with the entry order to the data frame. In the case of the AMTS data, this runs from 1 to 197.

pptransWithEntry <- cbind(Entry = 1:nrow(pptrans), pptrans)

Rounding of estimates: different numbers of decimal places

Sometimes we might want to round different variables in different ways, e.g. the “person parameter” with two decimal places and the “person parameter recoded” without any decimal places. We can do this as follows:

pptrans2 <- pptrans
pptrans2[, 1] <- round(pptrans2[, 1], 2)
pptrans2[, 4] <- round(pptrans2[, 4], 0)

We check that the rounding was successful.
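
For example, we can inspect the first rows of the two rounded columns (a quick sketch):

head(pptrans2[, c(1, 4)])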

Person Fit

In the chapter on item fit we already learnt about the different fit statistics with respect to the items. We learnt that the question of fit concerns the difference between the observed responses and those expected by the model. Applying this rationale to the persons, we can investigate whether there are any persons responding in an unexpected way to the items. The percentage of people with unexpected response patterns should not be too high.

Using the personfit() function we produce a list with person fit statistics. The fit is only reported for persons with non-extreme scores.

pfit <- personfit(personpar)

The list pfit contains the following values:

names(pfit)
[1] "p.fit"        "p.df"         "st.res"       "p.outfitMSQ"  "p.infitMSQ"  
[6] "p.outfitZ"    "p.infitZ"     "excl_obs_num" "excl_obs_chr"

We would like to show a table of some of the statistics and create a matrix with the relevant values.

pfitmatrix <- as.data.frame(matrix(unlist(lapply(c(1, 2, 4:7), FUN = function(x) pfit[[x]])),
    ncol = 6))
colnames(pfitmatrix) <- names(pfit)[c(1, 2, 4:7)]
rownames(pfitmatrix) <- rownames(pfit[[3]])
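
An equivalent, perhaps more readable construction extracts the same elements by name (a sketch; it assumes the element names shown by names(pfit) above):

pfitmatrix <- data.frame(p.fit = pfit$p.fit, p.df = pfit$p.df,
    p.outfitMSQ = pfit$p.outfitMSQ, p.infitMSQ = pfit$p.infitMSQ,
    p.outfitZ = pfit$p.outfitZ, p.infitZ = pfit$p.infitZ)
rownames(pfitmatrix) <- rownames(pfit$st.res)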

Identify persons with misfit (here based on the p-value of the Chi-square person fit statistic)

pfitmatrix$pvalue <- round(1 - pchisq(pfit$p.fit, pfit$p.df - 1), 3)

pfitmatrix[(1 - pchisq(pfit$p.fit, pfit$p.df - 1)) <= 0.05, ]
        p.fit p.df p.outfitMSQ p.infitMSQ p.outfitZ  p.infitZ pvalue
P2   18.83079   10    1.883079   1.279163 0.9740987 0.5876548  0.027
P18  20.24315   10    2.024315   1.436061 2.2813724 1.6777429  0.016
P51  25.75598   10    2.575598   1.814584 3.0188438 2.8747528  0.002
P63  34.93891    9    3.882101   1.049447 2.4459857 0.2599416  0.000
P92  36.39721   10    3.639721   1.563075 2.3970109 1.1163724  0.000
P115 30.87311   10    3.087311   1.663731 2.7608539 1.7769907  0.000
P131 20.37240   10    2.037240   1.041892 1.7640518 0.2347356  0.016
P170 23.84614   10    2.384614   1.368692 1.2122402 0.6802473  0.005
pfitmatrix$sig <- ""
pfitmatrix$sig[pfitmatrix$pvalue <= 0.05 & pfitmatrix$pvalue > 0.01] <- "*"
pfitmatrix$sig[pfitmatrix$pvalue <= 0.01] <- "**"

Order persons (e.g. according to Chisq)

In a static table we can order data according to, e.g., the Chi-squared statistics, as follows:

pfitmatrix[order(pfitmatrix[, "p.fit"]), ]
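
To list the most misfitting persons first, we can instead sort in decreasing order:

pfitmatrix[order(pfitmatrix[, "p.fit"], decreasing = TRUE), ]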

Person misfit

We can count the number of persons who do not fit the Rasch model using the PersonMisfit() function.

personmisfitSum <- PersonMisfit(personpar)
summary(personmisfitSum)

 Percentage of Misfitting Persons: 
 0.6849 % 
 (i.e., 1 out of 146 persons have Chi-square based Z-values > 1.96)