ERIC Number: ED464964
Record Type: Non-Journal
Publication Date: 2002-Apr
Reference Count: N/A
Score Reliability as an Essential Prerequisite for Validating New Writing and Speaking Tasks for TOEFL.
Lee, Yong-Won; Kantor, Robert; Mollaun, Pam
This paper reports the results of generalizability theory (G-theory) analyses of new writing and speaking tasks for the Test of English as a Foreign Language (TOEFL). For writing, the analyses focused on how score reliability is affected by the number of raters (or ratings) per essay (one or two) and the number of tasks (one, two, or three). For speaking, the focus was on the impact of the number of tasks (1 through 12) and the number of ratings (1 or 2). Data for the writing study were 488 examinees' responses to 2 reading-writing and 3 listening-writing TOEFL tasks. Data for the speaking study were from 6 listening-speaking, 2 reading-speaking, and 5 independent speaking tasks administered to 261 examinees. By far the greatest source of variation in examinees' test performance was differences in test takers' ability as measured by the writing and speaking tasks. This suggests that, as intended, the tasks do distinguish among examinees. Results further suggest that, to maximize score reliability for both speaking and writing, it would be more cost-efficient to increase the number of tasks than to increase the number of ratings per task. Beyond five or six tasks, adding further tasks would yield diminishing returns. (Contains 2 tables, 3 figures, and 10 references.) (SLD)
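The trade-off between tasks and ratings described in the abstract can be illustrated with a minimal sketch of a G-theory decision study. It assumes a fully crossed person x task x rating design, which may differ from the design used in the paper, and the variance components below are hypothetical values chosen for illustration only; they are not results from the study. The function name `g_coefficient` and its parameters are likewise illustrative.

```python
def g_coefficient(var_p, var_pt, var_pr, var_ptr_e, n_tasks, n_ratings):
    """Generalizability coefficient for relative decisions in an
    assumed fully crossed person x task x rating design."""
    # Relative error variance: each interaction component is divided by
    # the number of conditions averaged over in the decision study.
    rel_error = (
        var_pt / n_tasks
        + var_pr / n_ratings
        + var_ptr_e / (n_tasks * n_ratings)
    )
    return var_p / (var_p + rel_error)


# Hypothetical variance components (NOT taken from the paper).
components = dict(var_p=0.50, var_pt=0.20, var_pr=0.02, var_ptr_e=0.25)

if __name__ == "__main__":
    for n_tasks in (1, 2, 3, 5, 6, 12):
        for n_ratings in (1, 2):
            g = g_coefficient(**components, n_tasks=n_tasks, n_ratings=n_ratings)
            print(f"tasks={n_tasks:2d} ratings={n_ratings} G={g:.3f}")
```

With components of this shape, where the person-by-task interaction is large relative to the person-by-rating interaction, adding a task raises the coefficient more than adding a rating, and the gain from each extra task shrinks as the number of tasks grows.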
Publication Type: Numerical/Quantitative Data; Reports - Research; Speeches/Meeting Papers
Education Level: N/A
Authoring Institution: Educational Testing Service, Princeton, NJ.
Identifiers - Assessments and Surveys: Test of English as a Foreign Language