ERIC Number: EJ1037437
Record Type: Journal
Publication Date: 2014
Abstractor: As Provided
Reference Count: 57
Interrater Reliability Estimators Commonly Used in Scoring Language Assessments: A Monte Carlo Investigation of Estimator Accuracy
Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J.
Language Assessment Quarterly, v11 n3 p304-324 2014
Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of essays rated. This research used Monte Carlo methods to draw samples from known population models to examine the accuracy of selected estimators of interrater reliability between two raters. In addition to the estimators listed above, we included the polychoric correlation coefficient based on its alignment with the context in which student language assessments are rated. Although each estimator produced an estimate close to the population parameter, polychoric correlations provided the closest estimates, with mean and median bias equal to 0.00 (SD = 0.05) across all conditions. The use of Pearson product-moment and Spearman rank-order correlation coefficients might result in the underestimation of interrater reliability by as much as a third.
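The Monte Carlo design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes two raters' latent scores follow a standard bivariate normal distribution whose correlation is the true reliability, and that ratings arise by cutting those scores at equally spaced thresholds; the abstract states neither assumption. The names average_ranks and simulate_bias, the threshold placement, and the default replication count are hypothetical. Only the Pearson and Spearman estimators are shown; the polychoric correlation and the generalizability coefficient are omitted because they need additional estimation machinery.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)       # last rank held by each distinct value
    first = last - counts + 1      # first rank held by each distinct value
    return ((first + last) / 2.0)[inverse]


def simulate_bias(true_r, n_categories, n_essays, n_reps=200, seed=0):
    """Mean bias (estimate minus true_r) of Pearson and Spearman estimates.

    Assumption: ratings come from a latent bivariate normal variable that is
    cut into n_categories ordered scores at equally spaced thresholds.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_r], [true_r, 1.0]])
    # Hypothetical threshold placement: interior points of an even grid on [-2, 2].
    thresholds = np.linspace(-2.0, 2.0, n_categories + 1)[1:-1]

    pearson, spearman = [], []
    for _ in range(n_reps):
        latent = rng.multivariate_normal([0.0, 0.0], cov, size=n_essays)
        rater_a = np.digitize(latent[:, 0], thresholds)
        rater_b = np.digitize(latent[:, 1], thresholds)
        # Skip degenerate samples where a rater used only one category.
        if rater_a.std() == 0 or rater_b.std() == 0:
            continue
        pearson.append(np.corrcoef(rater_a, rater_b)[0, 1])
        # Spearman is the Pearson correlation of the tie-averaged ranks.
        spearman.append(
            np.corrcoef(average_ranks(rater_a), average_ranks(rater_b))[0, 1]
        )

    return {
        "pearson_bias": float(np.mean(pearson) - true_r),
        "spearman_bias": float(np.mean(spearman) - true_r),
    }


if __name__ == "__main__":
    print(simulate_bias(true_r=0.8, n_categories=3, n_essays=100))
```

Under these assumptions, a call such as simulate_bias(true_r=0.8, n_categories=3, n_essays=100) would be expected to return negative bias for both estimators, since coarse categorization attenuates correlations computed on the observed scores. The size of that bias depends on the assumed thresholds and should not be read as a reproduction of the study's results.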
Descriptors: Interrater Reliability, Correlation, Generalization, Scoring, Language Tests, Essays, Monte Carlo Methods, Second Language Learning
Routledge. Available from: Taylor & Francis, Ltd. 325 Chestnut Street Suite 800, Philadelphia, PA 19106. Tel: 800-354-1420; Fax: 215-625-2940; Web site: http://www.tandf.co.uk/journals
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Authoring Institution: N/A