An Investigation of Item Fit Statistics for Mixed IRT Models.

Chon, Kyong Hee

The purpose of this study was to investigate procedures for assessing the fit of IRT models to mixed format data. Various IRT model combinations were fitted to data containing both dichotomous and polytomous item responses, and the suitability of the chosen model mixtures was evaluated using a number of model fit procedures. Five item fit indices were considered: PARSCALE's G², the generalized forms of Orlando and Thissen's (2000) S-X² and S-G², and Stone's (2000) pseudo-observed score based indices χ²* and G²*. To investigate the relative performance of the five item fit statistics, two simulation studies were conducted: a Type I error study and a power study. Using the model hierarchy for a set of data generation models and calibration models, model-fitting and model-misfitting conditions were manipulated to obtain Type I error rates and power. Under the model-fitting conditions, Type I error rates were computed when identical pairs of mixed IRT models were used for data generation and calibration. Under the model-misfitting conditions, empirical power was calculated when data were calibrated using a simpler model than the generating model. Among the competing measures, the number-correct score based indices S-X² and S-G² were found to be the most efficient and safest choices for assessing model fit for mixed format data across the study conditions. These indices performed particularly well with short tests. The pseudo-observed score indices χ²* and G²*, however, showed inflated Type I error rates in some simulation conditions. Their application therefore appeared to be limited to specific measurement conditions, and they need further study before more general use. 
Consistent with findings in the current literature, PARSCALE's G² index was rarely useful, although it provided reasonable results for long tests. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
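
As an illustration of how the two simulation outcomes described in the abstract are typically summarized, the following minimal sketch computes an empirical rejection rate from the p-values of an item fit statistic across replications. Under model-fitting conditions this proportion estimates the Type I error rate; under model-misfitting conditions it estimates power. The function name `rejection_rate`, the nominal level of 0.05, and the p-values are all hypothetical and chosen only for illustration; none of them is taken from the dissertation.

```python
from typing import Sequence


def rejection_rate(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Proportion of replications in which the item fit test rejects at level alpha.

    alpha = 0.05 is an assumed nominal level, not one stated in the abstract.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical p-values for one item across ten replications (illustrative only).
# Model-fitting condition: generating and calibration models are identical.
fitting_p = [0.62, 0.03, 0.48, 0.91, 0.27, 0.74, 0.15, 0.55, 0.36, 0.81]
# Model-misfitting condition: calibration model is simpler than the generating model.
misfitting_p = [0.01, 0.20, 0.004, 0.03, 0.049, 0.33, 0.002, 0.01, 0.07, 0.02]

type_i_error = rejection_rate(fitting_p)   # empirical Type I error rate
power = rejection_rate(misfitting_p)       # empirical power

print(f"Type I error: {type_i_error:.2f}, power: {power:.2f}")
```

A rejection rate well above the nominal level in the model-fitting condition corresponds to the inflated Type I error rates the abstract reports for some indices.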