50 Years of ERIC
The Education Resources Information Center (ERIC) is celebrating its 50th birthday! First opened on May 15, 1964, ERIC continues a long tradition of ongoing innovation and enhancement.


Showing all 15 results
Peer reviewed
Wells, Craig S.; Hambleton, Ronald K.; Kirkpatrick, Robert; Meng, Yu – Applied Measurement in Education, 2014
The purpose of the present study was to develop and evaluate two procedures for flagging consequential item parameter drift (IPD) in an operational testing program. The first procedure was based on flagging items that exhibit a meaningful magnitude of IPD using a critical value defined to represent barely tolerable IPD. The second procedure…
Descriptors: Test Items, Test Bias, Equated Scores, Item Response Theory
Peer reviewed
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Peer reviewed
Wells, Craig S.; Baldwin, Su; Hambleton, Ronald K.; Sireci, Stephen G.; Karatonis, Ana; Jirka, Stephen – Applied Measurement in Education, 2009
Score equity assessment is an important analysis to ensure inferences drawn from test scores are comparable across subgroups of examinees. The purpose of the present evaluation was to assess the extent to which the Grade 8 NAEP Math and Reading assessments for 2005 were equivalent across selected states. More specifically, the present study…
Descriptors: National Competency Tests, Test Bias, Equated Scores, Grade 8
Peer reviewed
Zenisky, April L.; Hambleton, Ronald K.; Sireci, Stephen G. – Applied Measurement in Education, 2009
How a testing agency approaches score reporting can have a significant impact on the perception of that assessment and the usefulness of the information among intended users and stakeholders. Too often, important decisions about reporting test data are left to the end of the test development cycle, but by considering the audience(s) and the kinds…
Descriptors: National Competency Tests, Scores, Test Results, Information Dissemination
Peer reviewed
Hambleton, Ronald K.; Sireci, Stephen G.; Smith, Zachary R. – Applied Measurement in Education, 2009
In this study, we mapped achievement levels from the National Assessment of Educational Progress (NAEP) onto the score scales for selected assessments from the Trends in International Mathematics and Science Study (TIMSS) and the Program for International Student Assessment (PISA). The mapping was conducted on NAEP, TIMSS, and PISA Mathematics…
Descriptors: National Competency Tests, Mathematics Achievement, Mathematics Tests, Comparative Analysis
Peer reviewed
Jodoin, Michael G.; Zenisky, April; Hambleton, Ronald K. – Applied Measurement in Education, 2006
Many credentialing agencies today are either administering their examinations by computer or are likely to be doing so in the coming years. Unfortunately, although several promising computer-based test designs are available, little is known about how well they function in examination settings. The goal of this study was to compare fixed-length…
Descriptors: Computers, Test Results, Psychometrics, Computer Simulation
Peer reviewed
Hambleton, Ronald K.; Xing, Dehui – Applied Measurement in Education, 2006
Now that many credentialing exams are being routinely administered by computer, new computer-based test designs, along with item response theory models, are being aggressively researched to identify specific designs that can increase the decision consistency and accuracy of pass-fail decisions. The purpose of this study was to investigate the…
Descriptors: Test Construction, Objective Tests, Item Response Theory, Feedback
Peer reviewed
Skorupski, William P.; Hambleton, Ronald K. – Applied Measurement in Education, 2005
Panelists in an operational standard-setting study were asked to share their thoughts in written form at important points in the process itself--before the meeting started, after training, after completing the 1st and 2nd sets of ratings, immediately following the discussion between Rounds 1 and 2, and so on. The item mapping method of Plake and…
Descriptors: Grade 5, Grade 6, Language Tests, Test Items
Peer reviewed
Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004
A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…
Descriptors: Test Results, Student Evaluation, Scores, Testing Programs
Peer reviewed
Hambleton, Ronald K.; Slater, Sharon C. – Applied Measurement in Education, 1997
A brief history of developments in the assessment of the reliability of credentialing examinations is presented, and some new results are outlined that highlight the interactions among scoring, standard setting, and the reliability and validity of pass-fail decisions. Decision consistency is an important concept in evaluating credentialing…
Descriptors: Certification, Credentials, Decision Making, Interaction
Peer reviewed
Hambleton, Ronald K.; Murphy, Edward – Applied Measurement in Education, 1992
The validity of several criticisms of objective tests is addressed, and the viability of some alternatives to objective testing is discussed. Evidence against multiple-choice tests is not as strong as has been claimed. Authentic assessments may not always be better, and research about new forms of assessment is necessary. (SLD)
Descriptors: Achievement Tests, Educational Assessment, Literature Reviews, Measurement Techniques
Peer reviewed
Hambleton, Ronald K.; Plake, Barbara S. – Applied Measurement in Education, 1995
Several extensions to the Angoff method of standard setting are described that can accommodate characteristics of performance-based assessment. A study involving 12 panelists supported the effectiveness of the new approach but suggested that panelists preferred an approach that was at least partially conjunctive. (SLD)
Descriptors: Educational Assessment, Evaluation Methods, Evaluators, Interrater Reliability
Peer reviewed
Hambleton, Ronald K.; Rogers, H. Jane – Applied Measurement in Education, 1989
Item Response Theory and Mantel-Haenszel approaches for investigating differential item performance were compared to assess the level of agreement of the approaches in identifying potentially biased items. Subjects were 2,000 White and 2,000 Native American high school students. The Mantel-Haenszel method provides an acceptable approximation of…
Descriptors: American Indians, Comparative Testing, High School Students, High Schools
Peer reviewed
Hambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory
Peer reviewed
Linn, Robert L.; Hambleton, Ronald K. – Applied Measurement in Education, 1991
Four main approaches to customized testing are described, and their resulting scores' valid uses and interpretations are discussed. Customized testing can yield valid normative and curriculum-specific information, although cautious application is needed to avoid misleading inferences about student achievement. (SLD)
Descriptors: Academic Achievement, Accountability, Criterion Referenced Tests, Curriculum