ERIC Number: ED395028
Record Type: RIE
Publication Date: 1989-Oct
Reference Count: N/A
Making Essay Test Scores Fairer with Statistics. ETS Program Statistics Research Technical Report No. 89-90.
Braun, Henry I.; Wainer, Howard
A desirable goal would be to develop a methodology for scoring essays so that the final grades are less affected by when or by whom each essay was read. It seems sensible to derive such grades by somehow adjusting the ratings originally given by each reader. This report describes a solution that relies on statistical adjustment, using the context of the College Board's Advanced Placement program. Nonstatistical provisions, such as rater training, are in place to minimize the potential impact of rater differences on grades, but there is no simple way of obtaining a true score for an essay. The basic idea is to use statistical adjustment to reduce the effect of several sources of variability on scoring reliability by calibrating both the readers and the days on which essays are read. An experimental design developed by statisticians makes it possible to estimate the relative stringency of raters and scoring trends across time. An example illustrates the approach. Calibration experiments on five different Advanced Placement examinations showed that, in general, calibrated scores enhance reliability, but there are obstacles to overcome before the approach can be operationalized with actual essays. (Contains three tables and three references.) (SLD)
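The calibration idea described above can be sketched in code. The following is a minimal illustration, not the report's actual model: it assumes each observed rating decomposes additively into essay quality, a rater stringency effect, and a day effect, simulates ratings under a balanced design, estimates the rater and day effects by least squares, and subtracts them to produce calibrated scores. All names and the specific design are hypothetical.

```python
import numpy as np

# Hypothetical additive model (an assumption, not the report's model):
#   observed score = essay quality + rater effect + day effect + noise.
rng = np.random.default_rng(0)
n_essays, n_raters, n_days = 8, 4, 3

true_quality = rng.normal(3.0, 1.0, n_essays)    # latent essay quality
rater_effect = np.array([0.5, -0.5, 0.2, -0.2])  # stringency / leniency
day_effect = np.array([0.3, 0.0, -0.3])          # drift across reading days

# Every rater reads every essay; the reading day is assigned cyclically,
# mimicking a balanced calibration design.
rows = []
for e in range(n_essays):
    for r in range(n_raters):
        d = (e + r) % n_days
        score = true_quality[e] + rater_effect[r] + day_effect[d] + rng.normal(0, 0.1)
        rows.append((e, r, d, score))

# Dummy-coded design matrix: one column per essay, plus rater and day
# indicators with the first rater and first day as baselines (full rank).
X = np.zeros((len(rows), n_essays + (n_raters - 1) + (n_days - 1)))
y = np.zeros(len(rows))
for i, (e, r, d, score) in enumerate(rows):
    X[i, e] = 1.0
    if r > 0:
        X[i, n_essays + r - 1] = 1.0
    if d > 0:
        X[i, n_essays + n_raters - 1 + d - 1] = 1.0
    y[i] = score

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
rater_hat = np.concatenate([[0.0], coef[n_essays:n_essays + n_raters - 1]])
day_hat = np.concatenate([[0.0], coef[n_essays + n_raters - 1:]])

# Calibrated score: subtract the estimated rater and day effects.
calibrated = np.array([score - rater_hat[r] - day_hat[d]
                       for (e, r, d, score) in rows])
```

In this toy setup the estimated rater effects (relative to rater 0) recover the simulated stringency differences, so the calibrated scores for a given essay cluster around its latent quality rather than varying with who read it or when.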
Publication Type: Reports - Evaluative
Education Level: N/A
Authoring Institution: Educational Testing Service, Princeton, NJ.
Identifiers: Advanced Placement Examinations (CEEB); Calibration; Fairness; Variability