Peer reviewed
ERIC Number: EJ876753
Record Type: Journal
Publication Date: 2005
Pages: 30
Abstractor: As Provided
Reference Count: 60
ISSN: ISSN-1543-4303
Resolving Score Differences in the Rating of Writing Samples: Does Discussion Improve the Accuracy of Scores?
Johnson, Robert L.; Penny, James; Gordon, Belita; Shumate, Steven R.; Fisher, Steven P.
Language Assessment Quarterly, v2 n2 p117-146 2005
Many studies have indicated that at least 2 raters should score writing assessments to improve interrater reliability. However, even for assessments that characteristically demonstrate high levels of rater agreement, 2 raters of the same essay can occasionally report different, or discrepant, scores. If a single score, typically referred to as an operational score, is to be reported to the examinee, then a method of resolving those differences must be applied to the ratings. Many score resolution methods are available to assessment practitioners, and the choice of resolution method might affect the reliability and the validity of the resulting operational scores. This study investigates the accuracy of scores based on either (a) averaging the 2 discrepant ratings or (b) using discussion to obtain a consensus score. Two questions guided our investigation. First, when 2 raters disagree, does discussion improve the accuracy of the reported scores compared with averaging the original scores? Second, is there evidence that raters are equally engaged in the resolution process, or does the use of discussion as a form of resolution allow one rater to dominate, or defer to, the other? (Contains 6 tables.)
Routledge. Available from: Taylor & Francis, Ltd., 325 Chestnut Street, Suite 800, Philadelphia, PA 19106. Tel: 800-354-1420; Fax: 215-625-2940; Web site:
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A