ERIC Number: ED492918
Record Type: Non-Journal
Publication Date: 2004-Feb
Pages: 48
Abstractor: Author
Beyond Essay Length: Evaluating e-rater[R]'s Performance on TOEFL[R] Essays. Research Reports. Report 73. RR-04-04
Chodorow, Martin; Burstein, Jill
Educational Testing Service
This study examines the relation between essay length and holistic scores assigned to Test of English as a Foreign Language[TM] (TOEFL[R]) essays by e-rater[R], the automated essay scoring system developed by ETS. Results show that an early version of the system, e-rater99, accounted for little variance in human reader scores beyond that which could be predicted by essay length. A later version of the system, e-rater01, performs significantly better than its predecessor and is less dependent on length due to its greater reliance on measures of topical content and of complexity and diversity of vocabulary. Essay length was also examined as a possible explanation for differences in scores among examinees with native languages of Spanish, Arabic, and Japanese. Human readers and e-rater01 show the same pattern of differences for these groups, even when effects of length are controlled. Appended are: (1) TOEFL Writing Scoring Guide; and (2) Confusion Matrices for Essay Scores Combined across Mixed Cross-validation Sets for Seven Prompts. (Contains 18 tables, 3 figures, and 5 endnotes.)
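Illustrative note (not part of the report): the question of how much variance in human reader scores an automated score explains "beyond that which could be predicted by essay length" can be framed as an incremental R-squared comparison. The sketch below is one common way to compute it from pairwise correlations; it is not confirmed to be the analysis the authors ran. The function names and the small data set are hypothetical and invented for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def incremental_r2(human, length, machine):
    """Return (R2 from length alone, R2 from length + machine score, gain).

    Uses the closed form for the squared multiple correlation with two
    predictors. Assumes length and machine score are not perfectly
    correlated (otherwise the denominator is zero).
    """
    r_hl = pearson(human, length)    # human score vs. essay length
    r_hm = pearson(human, machine)   # human score vs. machine score
    r_lm = pearson(length, machine)  # essay length vs. machine score
    r2_length = r_hl ** 2
    r2_both = (r_hl ** 2 + r_hm ** 2 - 2 * r_hl * r_hm * r_lm) / (1 - r_lm ** 2)
    return r2_length, r2_both, r2_both - r2_length


# Made-up scores and word counts, for illustration only.
human = [2, 3, 3, 4, 5, 6]
length = [120, 180, 210, 260, 300, 340]
machine = [2, 3, 4, 4, 5, 5]

r2_length, r2_both, gain = incremental_r2(human, length, machine)
print(f"R2 (length only):      {r2_length:.3f}")
print(f"R2 (length + machine): {r2_both:.3f}")
print(f"Incremental R2:        {gain:.3f}")
```

A gain near zero would mean the automated score adds little beyond length; a larger gain would mean it captures something length does not.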
Educational Testing Service. Rosedale Road Mailstop 19R, Princeton, NJ 08541-0001. Tel: 609-921-9000; Fax: 609-734-5410; Web site:
Publication Type: Numerical/Quantitative Data; Reports - Research; Tests/Questionnaires
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A