autoRasch: The Semi-automated Rasch Analysis

# install.packages("remotes")
remotes::install_github("fwijayanto/autoRasch", build_manual = TRUE, build_vignettes = TRUE)
library(autoRasch)
library(doParallel)
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: parallel

Computing the criterion score (IPOQ-LL/IPOQ-LL-DIF)

Using the generalized partial credit model (GPCM) and the generalized partial credit model with differential item functioning (GPCM-DIF), we define two scores that serve as criteria for judging the quality of an itemset within an original survey: the in-plus-out-of-questionnaire log-likelihood (IPOQ-LL) and the in-plus-out-of-questionnaire log-likelihood with DIF (IPOQ-LL-DIF), respectively.

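For example, given a 9-item original survey, we may want to examine how well persons’ abilities can be estimated using only item7, item8, and item9. The call below illustrates the computation on the shortDIF data: it computes the IPOQ-LL-DIF score for items 1 to 4, with grMap assigning the 100 respondents to two groups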

grMap <- matrix(c(rep(0,50),rep(1,50)),ncol = 1, dimnames = list(c(1:100),c("cov")))
ipoqlldif_score <- compute_score(shortDIF, incl_set = c(1:4), type = "ipoqlldif", groups_map = grMap)
#> [1] "Estimation starts..."
#> [1] "...done!"
summary(ipoqlldif_score)
#> 
#> Score of the itemsets: 
#> 
#> IQ-LL:  -199.6182
#> OQ-LL:  NA
#> IPOQ-LL:  -199.6182
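The groups_map argument in the call above is built as a plain matrix. The following base R sketch rebuilds it and inspects its shape, assuming (as the call suggests, but the source does not state explicitly) that each row corresponds to one respondent and that the single column cov is a binary group indicator.

```r
# Rebuild the group map used above: 100 rows, one column named "cov".
grMap <- matrix(c(rep(0, 50), rep(1, 50)), ncol = 1,
                dimnames = list(c(1:100), c("cov")))

# Shape of the map: rows (assumed to be respondents) by covariates.
print(dim(grMap))

# Number of rows assigned to each value of the covariate.
group_sizes <- table(grMap[, "cov"])
print(group_sizes)
```

With this construction, the first 50 rows are coded 0 and the last 50 rows are coded 1.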

Furthermore, to compute the IPOQ-LL scores of several itemsets simultaneously, we use

ipoqll_scores <- compute_scores(shortDIF, incl_sets = rbind(c(1:3),c(2:4)), type = "ipoqll", cores = 2)
ipoqll_scores[,1:8]
#>              IQ-LL     OQ-LL   IPOQ-LL item no. item no. item no. item no.
#> result.1 -109.8750 -102.6400 -212.5150        1        2        3       NA
#> result.2 -174.5305  -49.6028 -224.1333        2        3        4       NA
#>          iq-ll par.
#> result.1  -2.524330
#> result.2  -2.063306
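The IPOQ-LL column in the output above is the sum of the IQ-LL and OQ-LL columns, and a higher value indicates a better itemset. As a small check of that arithmetic, the base R sketch below copies the printed values into a data frame (the column names iq_ll, oq_ll, and ipoq_ll are chosen here for illustration and are not the package's names), recomputes the sum, and picks the itemset with the larger score.

```r
# Values copied from the compute_scores() output shown above.
scores <- data.frame(
  itemset = c("1,2,3", "2,3,4"),
  iq_ll   = c(-109.8750, -174.5305),
  oq_ll   = c(-102.6400, -49.6028),
  stringsAsFactors = FALSE
)

# IPOQ-LL is the sum of the in-questionnaire and out-of-questionnaire parts.
scores$ipoq_ll <- scores$iq_ll + scores$oq_ll
print(scores)

# The preferred itemset is the one with the highest IPOQ-LL.
best_itemset <- scores$itemset[which.max(scores$ipoq_ll)]
print(best_itemset)
```

Among these two candidates, the itemset 1,2,3 has the higher IPOQ-LL.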

Semi-automated Rasch analysis by searching for the maximum of the (IPOQ-LL/IPOQ-LL-DIF) score

The IPOQ-LL is obtained by summing the IQ-LL and the OQ-LL. Setting type = "ipoqlldif" computes the IPOQ-LL-DIF score, which takes the DIF effects into account, instead of the IPOQ-LL. Because this log-likelihood is a score for model comparison, finding its maximum requires comparing many combinations of items. Hence, we conduct the semi-automated Rasch analysis, here with the IPOQ-LL-DIF criterion, by running

setting <- autoRaschOptions()
setting$isHessian <- FALSE
stepwise_res <- stepwise_search(shortDIF, incl_set = c(1:4), cores = 2, 
                                groups_map = grMap, method = "fast", 
                                criterion = "ipoqlldif", isTracked = TRUE)
#> do full items estimation...
#> [1] "Estimation starts..."
#> [1] "...done!"
#> 4 : 1,2,3,4
#> do backward...
#> 3 : 2,3,4
#> do backward...
#> 2 : 2,4
#> do forward...
#> do backward...
#> 1 : 2
#> do forward...
#> ::: End of search :::

stepwise_search() searches for the maximum IPOQ-LL score over the possible combinations of items. This maximum score corresponds to the “best” itemset according to the semi-automated Rasch analysis. To speed up the search, every step of the stepwise selection is parallelized. If isTracked = TRUE, the function prints the combination of items that yields the highest IPOQ-LL score at every step.

Having obtained the search result in stepwise_res, we can plot it with

plot_search(stepwise_res, type="l")

The plot shows the highest IPOQ-LL score for every possible number of items in the itemset. The numbers in the plot indicate the item(s) that were removed (or added), relative to the previous step, to obtain the plotted score. For instance, in a 9-item survey, starting with all items, the highest IPOQ-LL score for an itemset consisting of 8 items is obtained by removing item1. Subsequently, the highest IPOQ-LL score for an itemset consisting of 7 items is obtained by removing item2.
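To make the idea of a stepwise search and its per-size best scores more concrete, here is a minimal base R sketch. It is not the implementation used by stepwise_search(): it performs backward elimination only (no forward steps), runs serially, and uses a made-up score, toy_score, in place of the IPOQ-LL. The names item_gain, toy_score, and backward_search are hypothetical, and the weights are invented purely for illustration.

```r
# Hypothetical per-item contributions, invented purely for illustration.
item_gain <- c(-0.4, 0.9, -0.2, 0.4)

# Toy score (higher is better); a stand-in for an itemset score, not the IPOQ-LL.
toy_score <- function(items) sum(item_gain[items])

# Backward elimination: at each step drop the item whose removal gives the
# highest score, and record the best itemset found for every itemset size.
backward_search <- function(n_items, score_fn) {
  current <- seq_len(n_items)
  path <- data.frame(n_items = length(current), removed = NA_integer_,
                     items = paste(current, collapse = ","),
                     score = score_fn(current), stringsAsFactors = FALSE)
  while (length(current) > 1) {
    # Candidate itemsets: remove each remaining item in turn.
    candidates <- lapply(current, function(i) setdiff(current, i))
    # The candidate evaluations are independent of each other, so this is
    # the kind of step that can be distributed over several cores.
    scores <- vapply(candidates, score_fn, numeric(1))
    k <- which.max(scores)
    path <- rbind(path, data.frame(n_items = length(current) - 1L,
                                   removed = current[k],
                                   items = paste(candidates[[k]], collapse = ","),
                                   score = scores[k], stringsAsFactors = FALSE))
    current <- candidates[[k]]
  }
  path
}

search_path <- backward_search(4, toy_score)
print(search_path)

# The overall "best" itemset is the row with the highest score.
best <- search_path[which.max(search_path$score), ]
print(best)
```

Each row of search_path plays the role of one point in the plot: the best score for a given number of items, together with the item that was removed to reach it.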