ERIC Number: EJ1111570
Record Type: Journal
Publication Date: 2007-May
Pages: 26
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: EISSN-2330-8516
Construct Validity of "e-rater"® in Scoring TOEFL® Essays. Research Report. ETS RR-07-21
Attali, Yigal
ETS Research Report Series, May 2007
This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated: the first based on optimally predicting the human essay score, and the second based on equal weights for the different "e-rater" features. Within a multitrait-multimethod approach, the correlations and reliabilities of human and "e-rater" scores were analyzed together with TOEFL subscores (structured writing, reading, and listening) and with essay length. Possible biases between human and "e-rater" scores were examined with respect to differences in performance across countries of origin and differences in difficulty across prompts. Finally, a factor analysis was conducted on the "e-rater" features to investigate the interpretability of their internal structure and to determine which of the two "e-rater" scores reflects this structure more closely. Results showed that the "e-rater" score based on optimally predicting the human score measured essentially the same construct as human essay scores, with significantly higher reliability and consequently higher correlations with related language scores. The equal-weights "e-rater" score showed the same high reliability but a significantly lower correlation with essay length, and it was also aligned with the three-factor hierarchical structure (word use, grammar, and discourse) discovered in the factor analysis. Both "e-rater" scores also successfully replicated human score differences between countries and prompts.
Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language), Language Tests, Second Language Learning, Essay Tests, Correlation, Test Reliability, Bias, Factor Analysis, Prediction, Comparative Analysis, Regression (Statistics), Models, Weighted Scores, True Scores, Comparative Education, Prompting
Educational Testing Service. Rosedale Road, MS19-R Princeton, NJ 08541. Tel: 609-921-9000; Fax: 609-734-5410; e-mail: RDweb@ets.org; Web site: https://www.ets.org/research/policy_research_reports/ets
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Test of English as a Foreign Language
Grant or Contract Numbers: N/A