ERIC Number: ED310149
Record Type: Non-Journal
Publication Date: 1988-Jun
Pages: 18
Abstractor: N/A
Reference Count: N/A
Total Score Reliability in Large-Scale Writing Assessment.
Bunch, Michael B.; Littlefair, Wendy
A total of 2,000 essays written by 1,000 students were submitted to generalizability analyses for domain-referenced tests. Each student wrote one essay on each of two prompts representing two models of discourse. Each essay was read by six readers and scored on a scale of 1 to 4. No reader read essays from both prompts. Reader agreement rates and interrater reliability coefficients were computed. Extensive analyses were conducted using GENOVA, a generalizability analysis program. Special consideration was given to the universes of generalizability for readers and prompts. One set of artificially unreliable scores was introduced to increase the variance due to readers and to the reader-times-essay interaction and thus to lower reliability. Results indicate that the score reliability of essay tests is multifaceted and can be estimated in a variety of ways, depending on the purpose of the assessment and the intended use of the results. When pass-fail decisions or determinations of absolute skill levels are to be made, indices that take the cut point or points into account are needed. Seven data tables and two graphs are provided. (TJH)
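The kinds of quantities named in the abstract can be illustrated with a minimal sketch. It is not the GENOVA analysis or the authors' design: it assumes a simplified, fully crossed essays-by-readers layout with one score per cell, whereas the study nested readers within prompts. The function names (`exact_agreement`, `variance_components`, `coefficients`) and the sample scores are hypothetical. The sketch estimates variance components from ANOVA mean squares, then forms a generalizability coefficient for relative decisions and a dependability index for absolute decisions. Neither index here takes a cut point into account.

```python
def exact_agreement(reader_a, reader_b):
    """Proportion of essays on which two readers gave the same score."""
    matches = sum(1 for a, b in zip(reader_a, reader_b) if a == b)
    return matches / len(reader_a)


def variance_components(scores):
    """Estimate variance components for a crossed essays x readers design.

    scores[i][j] is the score reader j gave essay i (assumed layout).
    Negative estimates are clamped to zero, a common convention.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Mean squares for essays, readers, and the residual (interaction + error).
    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - p_means[i] - r_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    return {
        "essay": max(0.0, (ms_p - ms_res) / n_r),
        "reader": max(0.0, (ms_r - ms_res) / n_p),
        "residual": ms_res,
    }


def coefficients(components, n_readers):
    """Return (generalizability coefficient, dependability index).

    n_readers is the number of readers whose scores are averaged.
    """
    rel_error = components["residual"] / n_readers
    abs_error = (components["reader"] + components["residual"]) / n_readers
    g_coef = components["essay"] / (components["essay"] + rel_error)
    phi = components["essay"] / (components["essay"] + abs_error)
    return g_coef, phi


if __name__ == "__main__":
    # Hypothetical scores: three essays, two readers, scale of 1 to 4.
    sample = [[1, 2], [2, 3], [3, 4]]
    comps = variance_components(sample)
    print(comps)
    print(coefficients(comps, n_readers=2))
```

In the hypothetical sample, the second reader is uniformly one point harsher than the first. The essays are ranked identically, so the relative coefficient is unaffected, while the dependability index drops because reader severity counts as error when absolute skill levels are being judged.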
Publication Type: Reports - Evaluative; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A