ERIC Number: EJ1054066
Record Type: Journal
Publication Date: 2015
Abstractor: As Provided
Contrasting State-of-the-Art in the Machine Scoring of Short-Form Constructed Responses
Shermis, Mark D.
Educational Assessment, v20 n1 p46-65 2015
This study compared short-form constructed responses evaluated by both human raters and machine scoring algorithms. The context was a public competition in which both public competitors and commercial vendors vied to develop machine scoring algorithms that would match or exceed the performance of operational human raters in a summative high-stakes testing environment. Data (N = 25,683) were drawn from three states, covered 10 prompts, and came from two secondary grade levels. Samples ranging in size from 2,130 to 2,999 were randomly selected from the data sets provided by the states and then randomly divided into three sets: a training set, a test set, and a validation set. Machine performance on all of the agreement measures failed to match that of the human raters. The study concludes with recommendations on steps that might improve machine-scoring algorithms before they can be used in any operational way.
Descriptors: Test Scoring Machines, Responses, Interrater Reliability, High Stakes Tests, Summative Evaluation, Scoring, Secondary School Students, Grade 8, Grade 10
Routledge. Available from: Taylor & Francis, Ltd. 325 Chestnut Street Suite 800, Philadelphia, PA 19106. Tel: 800-354-1420; Fax: 215-625-2940; Web site: http://www.tandf.co.uk/journals
Publication Type: Journal Articles; Reports - Research
Education Level: Secondary Education; Grade 8; Junior High Schools; Middle Schools; Elementary Education; Grade 10; High Schools
Authoring Institution: N/A
Grant or Contract Numbers: N/A