ERIC Number: ED399300
Record Type: Non-Journal
Publication Date: 1996-Apr
Reference Count: N/A
Technical Issues in Large-Scale Performance Assessment.
Phillips, Gary W., Ed.
Recently, there has been a significant expansion in the use of performance assessment in large-scale testing programs. Although such assessments have drawn strong support from curriculum and policy stakeholders, the technical feasibility of large-scale performance assessment has remained in question. This report is intended to contribute to the debate by reviewing some of the technical issues that must be addressed by any developer of large-scale performance assessments. It is also intended to surface issues, articulate problems, and, where possible, give advice on how to proceed. The report is divided into five chapters, each focusing on a major technical topic. "Validity of Performance Assessments" (Samuel Messick) defines validity as a property of inferences and interpretations made from test scores. In performance assessment, the primary adverse consequence that must be investigated is the potential negative impact on individuals or groups traceable to sources of invalidity. "Generalizability of Performance Assessments" (Robert L. Brennan) provides an overview of generalizability theory and integrates the literature on the reliability of performance assessment with the conceptual framework of generalizability theory. "Comparability" (Edward H. Haertel and Robert L. Linn) stresses that, in order to provide indicators of trends in academic achievement, large-scale performance assessments must be comparable across administrations. "Setting Performance Standards for Performance Assessments: Some Fundamental Issues, Current Practice, and Technical Dilemmas" (Richard M. Jaeger, Ina V. S. Mullis, Mary Lyn Bourque, and Sharif Shakrani) describes the myriad ways performance standards are used and addresses the need for new methods of establishing such standards for performance assessments of students and teachers. Two new approaches to setting performance standards, iterative judgmental policy capturing and a multistage dominant profile procedure, are outlined.
References follow each of the chapters. (Contains five tables and eight figures.) (SLD)
Descriptors: Comparative Analysis, Generalizability Theory, Performance Based Assessment, Psychometrics, Standard Setting, Standard Setting (Scoring), Standards, Test Construction, Test Reliability, Test Validity, Testing Problems
Availability: U.S. Government Printing Office, Superintendent of Documents, Mail Stop: SSOP, Washington, DC 20402-9328.
Publication Type: Reports - Descriptive
Education Level: N/A
Authoring Institution: National Center for Education Statistics (ED), Washington, DC.