ERIC Number: EJ1094275
Record Type: Journal
Publication Date: 2016
Pages: 15
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-0729-4360
EISSN: N/A
Managing Rater Effects through the Use of FACETS Analysis: The Case of a University Placement Test
Wu, Siew Mei; Tan, Susan
Higher Education Research and Development, v35 n2 p380-394 2016
Rating essays is a complex task in which students' grades can be adversely affected by test-irrelevant factors such as rater characteristics and rating scales. Understanding these factors and controlling their effects are crucial for test validity. Rater behaviour has been studied extensively through qualitative methods such as questionnaires and think-aloud protocols, and quantitatively through the multi-faceted Rasch model (MFRM) [Congdon, P.J., & McQueen, J. (2000). "The stability of rater severity in large-scale assessment programs." "Journal of Educational Measurement," 37(2), 163-178; Engelhard, G. (1992). "The measurement of writing ability with a multi-faceted Rasch model." "Applied Measurement in Education," 5(3), 171-191; Lumley, T., & McNamara, T.F. (1995). "Rater characteristics and rater bias: Implications for training." "Language Testing," 12(1), 54-71; Weigle, S.C. (1998). "Using FACETS to model rater training effects." "Language Testing," 15(2), 263-287]. While these studies have yielded a rich understanding of rater characteristics and rating, less is known about how quantitative analysis can help manage and adjust for differences in students' scores. This study uses the MFRM [Linacre, J.M. (1989). "Multi-faceted Rasch measurement." Chicago: MESA Press] to investigate raters' scoring behaviour and to ascertain how it affects students' scores in a large-scale placement test. It proposes using the anchoring method within the MFRM to manage the placement of students where it is not possible to have all raters score all scripts. The analysis shows that raters, while mostly internally consistent, differ in severity despite training. These differences would significantly affect a student's placement if no measures were taken to manage the problem. The MFRM also shows that a few raters may be scoring the essays more holistically over time, probably due to the halo effect [Engelhard, G. (1998). "Evaluating the quality of ratings obtained from standard-setting judges." "Educational and Psychological Measurement," 58(2), 179-196]. The study demonstrates how the MFRM can reveal patterns in raters' scoring and, most importantly, how the analysis yields data that allow targeted strategies for moderating scores to manage rater differences.
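For readers unfamiliar with the model, the MFRM cited above (Linacre, 1989) decomposes the log-odds of a rating into additive facet terms. In a typical formulation with examinees, items, and raters, the probability of examinee n receiving category k rather than k-1 from rater j on item i is

    \log\left( \frac{P_{nijk}}{P_{nij(k-1)}} \right) = B_n - D_i - C_j - F_k

where B_n is the examinee's ability, D_i the item difficulty, C_j the rater's severity, and F_k the step difficulty of reaching category k. Anchoring fixes selected parameters, here the severities of raters already calibrated on a common scale, so that the remaining facets are estimated on that same scale even when no rater scores every script.

The sketch below is a deliberately minimal illustration of that anchoring idea under a two-facet (examinee by rater) rating-scale parameterization; it is not the FACETS program or the authors' procedure. It fits the model by crude gradient ascent on the joint log-likelihood, holding any anchored severities fixed; the function names (rating_probs, fit_mfrm) and the fixed thresholds are assumptions for illustration.

    import numpy as np

    def rating_probs(theta, severity, thresholds):
        # Rating-scale MFRM: log-odds of category k over k-1 is
        # theta - severity - thresholds[k-1] (two facets; items omitted).
        steps = theta - severity - np.asarray(thresholds)
        psis = np.concatenate([[0.0], np.cumsum(steps)])
        psis -= psis.max()                      # numerical stability
        ev = np.exp(psis)
        return ev / ev.sum()                    # probabilities for categories 0..K

    def fit_mfrm(obs, n_persons, n_raters, thresholds, anchors=None,
                 iters=500, lr=0.05):
        # obs: list of (person, rater, score) triples; scores in 0..K.
        # anchors: dict {rater: fixed severity in logits}. Anchored raters
        # stay fixed, so every other estimate lands on the anchors' scale.
        theta = np.zeros(n_persons)
        sev = np.zeros(n_raters)
        if anchors:
            for j, v in anchors.items():
                sev[j] = v
        ks = np.arange(len(thresholds) + 1)
        for _ in range(iters):
            g_t = np.zeros(n_persons)
            g_s = np.zeros(n_raters)
            for n, j, x in obs:
                p = rating_probs(theta[n], sev[j], thresholds)
                resid = x - ks @ p              # observed minus expected score
                g_t[n] += resid                 # severity enters negatively,
                g_s[j] -= resid                 # so its gradient flips sign
            theta += lr * g_t
            for j in range(n_raters):
                if not anchors or j not in anchors:
                    sev[j] += lr * g_s[j]
            if not anchors:
                sev -= sev.mean()               # centre severities if nothing is anchored
        return theta, sev

With, say, anchors={0: 0.42} taken from an earlier calibration in which rater 0 scored alongside others, the remaining raters' severities and all examinee abilities are expressed in that earlier frame of reference, which is what allows placements to be compared across scripts that share no common rater. FACETS itself uses joint maximum likelihood with Newton-style updates rather than this fixed-step gradient ascent.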
Descriptors: Scoring, Item Response Theory, Student Placement, College Students, Scores, Interrater Reliability, Essay Tests, Holistic Approach, Writing Ability
Routledge. Available from: Taylor & Francis, Ltd. 325 Chestnut Street Suite 800, Philadelphia, PA 19106. Tel: 800-354-1420; Fax: 215-625-2940; Web site: http://www.tandf.co.uk/journals
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A