Peer reviewed
ERIC Number: EJ1109838
Record Type: Journal
Publication Date: 2012-Mar
Pages: 58
Abstractor: As Provided
ISSN: EISSN-2330-8516
Evaluation of the "e-rater"® Scoring Engine for the "TOEFL"® Independent and Integrated Prompts. Research Report. ETS RR-12-06
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent
ETS Research Report Series, Mar 2012
Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Both prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with external measures, were examined to evaluate e-rater model performance against human scores. Performance was also evaluated across different demographic subgroups. Additional analyses were performed to establish appropriate agreement thresholds between human and e-rater scores for unusual essays and to assess the impact of using e-rater on operational scores. Generic e-rater scoring models were recommended for operational use for both independent and integrated writing tasks. The two automated scoring models were recommended for operational use to produce contributory scores within discrepancy thresholds of 1.5 and 1.0 from a human score for independent and integrated prompts, respectively.
Educational Testing Service. Rosedale Road, MS19-R Princeton, NJ 08541. Tel: 609-921-9000; Fax: 609-734-5410.
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Test of English as a Foreign Language
Grant or Contract Numbers: N/A