50 Years of ERIC
The Education Resources Information Center (ERIC) is celebrating its 50th birthday! First opened on May 15, 1964, ERIC continues a long tradition of innovation and enhancement.

Learn more about the history of ERIC (PDF).

Showing all 10 results
Peer reviewed
Direct link
Clauser, Jerome C.; Margolis, Melissa J.; Clauser, Brian E. – Journal of Educational Measurement, 2014
Evidence of stable standard setting results over panels or occasions is an important part of the validity argument for an established cut score. Unfortunately, due to the high cost of convening multiple panels of content experts, standards often are based on the recommendation from a single panel of judges. This approach implicitly assumes that…
Descriptors: Standard Setting (Scoring), Generalizability Theory, Replication (Evaluation), Cutting Scores
Peer reviewed
Direct link
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Peer reviewed
Direct link
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009
The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination data were analyzed. A generalizability-theory framework was used to examine the extent to…
Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience
Peer reviewed
Direct link
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J. – Journal of Educational Measurement, 2006
Although multivariate generalizability theory was developed more than 30 years ago, little published research utilizing this framework exists and most of what does exist examines tests built from tables of specifications. In this context, it is assumed that the universe scores from levels of the fixed multivariate facet will be correlated, but the…
Descriptors: Multivariate Analysis, Job Skills, Correlation, Test Items
Peer reviewed
Clauser, Brian E.; Harik, Polina; Clyman, Stephen G. – Journal of Educational Measurement, 2000
Used generalizability theory to assess the impact of using independent, randomly equivalent groups of experts to develop scoring algorithms for computer simulation tasks designed to measure physicians' patient management skills. Results with three groups of four medical school faculty members each suggest that the impact of the expert group may be…
Descriptors: Computer Simulation, Generalizability Theory, Performance Based Assessment, Physicians
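The generalizability-theory analyses described in this entry partition score variance across facets (e.g., persons and expert groups or raters). As a minimal illustration of the underlying machinery, the sketch below estimates variance components for a fully crossed persons × raters design using the standard mean-square (ANOVA) estimators; the function name and data layout are illustrative, not from the cited study:

```python
import numpy as np

def g_study(scores):
    """Variance components for a fully crossed persons x raters design.

    scores: 2-D array, rows = persons, columns = raters.
    Returns (var_person, var_rater, var_residual) from the usual
    mean-square (ANOVA) estimators, truncated at zero.
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    ss_p = n_r * np.sum((person_means - grand) ** 2)
    ss_r = n_p * np.sum((rater_means - grand) ** 2)
    ss_res = np.sum((scores - grand) ** 2) - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    var_res = max(ms_res, 0.0)
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    return var_p, var_r, var_res
```

When two raters agree perfectly, all variance is attributed to persons: `g_study([[1, 1], [2, 2], [3, 3]])` returns `(1.0, 0.0, 0.0)`.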
Peer reviewed
Clauser, Brian E.; Clyman, Stephen G.; Swanson, David B. – Journal of Educational Measurement, 1999
Two studies focused on aspects of the rating process in performance assessment. The first, which involved 15 raters and about 400 medical students, made the "committee" facet of raters working in groups explicit, and the second, which involved about 200 medical students and four raters, made the "rating-occasion" facet explicit. (SLD)
Descriptors: Error Patterns, Evaluation Methods, Evaluators, Higher Education
Peer reviewed
Clauser, Brian E.; Margolis, Melissa J.; Clyman, Stephen G.; Ross, Linette P. – Journal of Educational Measurement, 1997
Research on automated scoring is extended by comparing alternative automated systems for scoring a computer simulation of physicians' patient management skills. A regression-based system is more highly correlated with experts' evaluations than a system that uses complex rules to map performances into score levels, but both approaches are feasible.…
Descriptors: Algorithms, Automation, Comparative Analysis, Computer Assisted Testing
Peer reviewed
Clauser, Brian E.; Nungester, Ronald J.; Mazor, Kathleen; Ripkey, Douglas – Journal of Educational Measurement, 1996
Compared the results of differential item functioning (DIF) analysis with matching based on the total test score, matching based on subtest scores, or multivariate matching using multiple subtest scores. Results using 2,000 responses from medical students suggest that matching on multiple subtest scores may be superior to the other methods. (SLD)
Descriptors: Higher Education, Item Bias, Medical Education, Medical Students
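The matching strategies compared in this entry (total score, subtest score, or multiple subtest scores) all feed into a stratified DIF statistic. A minimal sketch of one common choice, the Mantel-Haenszel common odds ratio, is below; the matching variable passed in as `strata` could be any of the conditioning variables the abstract mentions. The function name and example data are illustrative, not from the cited study:

```python
import numpy as np

def mantel_haenszel_or(correct, focal, strata):
    """Mantel-Haenszel common odds ratio for one item.

    correct: 0/1 item responses
    focal:   0/1 group indicator (1 = focal group)
    strata:  matching variable (e.g., total score or a subtest score)
    A common odds ratio near 1 suggests no DIF under this matching.
    """
    correct = np.asarray(correct)
    focal = np.asarray(focal)
    strata = np.asarray(strata)
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum((focal[m] == 0) & (correct[m] == 1))  # reference, correct
        b = np.sum((focal[m] == 0) & (correct[m] == 0))  # reference, incorrect
        c = np.sum((focal[m] == 1) & (correct[m] == 1))  # focal, correct
        d = np.sum((focal[m] == 1) & (correct[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    return num / den
```

Conditioning on subtest scores rather than the total simply changes what `strata` contains: examinees are then matched on proficiency in the item's own content area, which is the refinement the abstract reports as potentially superior.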
Peer reviewed
Clauser, Brian E.; And Others – Journal of Educational Measurement, 1996
Results with scores from 3,729 male and 2,533 female residents taking the National Board of Medical Examiners examination support the potential to improve matching in the investigation of differential item functioning (DIF) by conditioning simultaneously on test score and a categorical variable representing the examinees' educational experience.…
Descriptors: Educational Background, Educational Experience, Graduate Medical Students, Identification
Peer reviewed
Clauser, Brian E.; And Others – Journal of Educational Measurement, 1995
A scoring algorithm for performance assessments is described that is based on expert judgments but requires the rating of only a sample of performances. A regression-based policy capturing procedure was implemented for clinicians evaluating skills of 280 medical students. Results demonstrate the usefulness of the algorithm. (SLD)
Descriptors: Algorithms, Clinical Diagnosis, Computer Simulation, Educational Assessment
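The regression-based policy-capturing procedure described above fits weights that map features of a performance to expert ratings, so that the fitted policy can score performances the experts never saw. A minimal sketch under assumed inputs (the feature definitions, function names, and numbers below are illustrative, not from the cited study):

```python
import numpy as np

def capture_policy(features, ratings):
    """Fit least-squares weights (intercept first) mapping performance
    features -- e.g., counts of beneficial vs. risky actions -- to the
    global ratings a panel of experts assigned to a sample of cases."""
    X = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return weights

def score(features, weights):
    """Apply the captured policy to score new, unrated performances."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ weights
```

The appeal of the approach, as the abstract notes, is that experts rate only a sample of performances; the captured weights then score the rest automatically.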