Publication Date
| In 2015 | 1 |
| Since 2014 | 2 |
| Since 2011 (last 5 years) | 2 |
| Since 2006 (last 10 years) | 7 |
| Since 1996 (last 20 years) | 11 |
Descriptor
| Testing Problems | 92 |
| Test Validity | 18 |
| Test Interpretation | 17 |
| Test Items | 17 |
| Achievement Tests | 15 |
| Mathematical Models | 13 |
| Scores | 13 |
| Test Construction | 13 |
| Test Reliability | 13 |
| Measurement Techniques | 12 |
Author
| Linn, Robert L. | 4 |
| Wainer, Howard | 4 |
| Budescu, David | 2 |
| Choi, Seung W. | 2 |
| Fitzpatrick, Anne R. | 2 |
| Hoover, H. D. | 2 |
| Hughes, David C. | 2 |
| Kim, Dong-In | 2 |
| Rowley, Glenn L. | 2 |
| Secolsky, Charles | 2 |
Publication Type
| Journal Articles | 68 |
| Reports - Research | 32 |
| Reports - Evaluative | 16 |
| Book/Product Reviews | 8 |
| Opinion Papers | 7 |
| Information Analyses | 2 |
| Speeches/Meeting Papers | 2 |
| Reports - Descriptive | 1 |
Education Level
| Elementary Secondary Education | 1 |
| Secondary Education | 1 |
Audience
| Researchers | 3 |
| Practitioners | 1 |
Showing 1 to 15 of 92 results
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…
Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
Sotaridona, Leonardo S.; Meijer, Rob R. – Journal of Educational Measurement, 2003 (Peer reviewed)
Proposes two new indices for detecting answer copying on a multiple-choice test and reports a simulation study investigating their usefulness. Discusses conditions under which the proposed indices can be useful. (SLD)
Descriptors: Cheating, Multiple Choice Tests, Simulation, Testing Problems
Moon, Tonya R. – Journal of Educational Measurement, 1997 (Peer reviewed)
This book is a collection of nine commissioned papers that consider the educational reform represented by performance assessments, their advantages and disadvantages, and the challenges of successful implementation. Both theoretical and practical issues are addressed. (SLD)
Descriptors: Educational Assessment, Educational Change, Educational Theories, Performance Based Assessment
Sykes, Robert C.; Ito, Kyoko; Fitzpatrick, Anne R.; Ercikan, Kadriye – Journal of Educational Measurement, 1997 (Peer reviewed)
The five chapters of this report provide resources that deal with the validity, generalizability, comparability, performance standards, and fairness, equity, and bias of performance assessments. The book is written for experienced educational measurement practitioners, although an extensive familiarity with performance assessment is not required.…
Descriptors: Educational Assessment, Measurement Techniques, Performance Based Assessment, Standards
Waltman, Kristie K. – Journal of Educational Measurement, 1996 (Peer reviewed)
"Measuring Up" describes the current standards-based educational reform movement for a general audience and professionals who have not been involved in reform. The simplified perspective does not do justice to the complexity of reform issues in some contexts, but the case studies illustrating reform efforts are particularly useful. (SLD)
Descriptors: Academic Achievement, Case Studies, Educational Assessment, Educational Change
Feldt, Leonard S.; Forsyth, Robert A. – Journal of Educational Measurement, 1974 (Peer reviewed)
The net effect of the conditions under which tests are taken was empirically investigated using the scores obtained by high school students on an English and a mathematics test. (Author/BB)
Descriptors: Achievement Tests, Context Effect, English, Item Sampling
Glass, Gene V. – Journal of Educational Measurement, 1978 (Peer reviewed)
A detailed analysis of standard setting and criteria for test scores and educational decisions is presented. The author contends that present procedures are in need of re-examination. (JKS)
Descriptors: Academic Standards, Behavioral Objectives, Criterion Referenced Tests, Decision Making
Burton, Nancy W. – Journal of Educational Measurement, 1978 (Peer reviewed)
Three methods for setting performance standards for criterion referenced tests--theories, expert judgments, and practical necessity--are reviewed. The author concludes that the use of performance standards is not a promising vehicle for making societal decisions. (Author/JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making
Scriven, Michael – Journal of Educational Measurement, 1978 (Peer reviewed)
The utility of setting standards for educational decisions, even though those standards may be somewhat arbitrary, is defended in this response to Glass's article (TM 504 031). (JKS)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Decision Making
