Publication Date
| Publication Date | Count |
| --- | --- |
| In 2015 | 0 |
| Since 2014 | 1 |
| Since 2011 (last 5 years) | 2 |
| Since 2006 (last 10 years) | 10 |
| Since 1996 (last 20 years) | 18 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Item Analysis | 107 |
| Test Items | 46 |
| Test Construction | 29 |
| Test Validity | 24 |
| Test Reliability | 21 |
| Test Bias | 18 |
| Latent Trait Theory | 17 |
| Comparative Analysis | 14 |
| College Entrance Examinations | 13 |
| Higher Education | 13 |
Author
| Author | Count |
| --- | --- |
| Dorans, Neil J. | 3 |
| Bennett, Randy Elliot | 2 |
| Hoover, H. D. | 2 |
| Huck, Schuyler W. | 2 |
| Mehrens, William A. | 2 |
| Miller, M. David | 2 |
| Phillips, S. E. | 2 |
| Plake, Barbara S. | 2 |
| Roussos, Louis A. | 2 |
| Rudner, Lawrence M. | 2 |
Publication Type
| Publication Type | Count |
| --- | --- |
| Journal Articles | 72 |
| Reports - Research | 54 |
| Reports - Evaluative | 15 |
| Reports - Descriptive | 2 |
| Guides - Non-Classroom | 1 |
| Information Analyses | 1 |
| Speeches/Meeting Papers | 1 |
Education Level
| Education Level | Count |
| --- | --- |
| Elementary Secondary Education | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
Audience
| Audience | Count |
| --- | --- |
| Researchers | 2 |
Showing 1 to 15 of 107 results
Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…
Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Finkelman, Matthew; Nering, Michael L.; Roussos, Louis A. – Journal of Educational Measurement, 2009
In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed…
Descriptors: Test Items, Adaptive Testing, Item Analysis, Item Response Theory
Moses, Tim; Yang, Wen-Ling; Wilson, Christine – Journal of Educational Measurement, 2007
This study explored the use of kernel equating for integrating and extending two procedures proposed for assessing item order effects in test forms that have been administered to randomly equivalent groups. When these procedures are used together, they can provide complementary information about the extent to which item order effects impact test…
Descriptors: Advanced Placement, Equated Scores, Test Items, Item Analysis
DeMars, Christine E. – Journal of Educational Measurement, 2006
Four item response theory (IRT) models were compared using data from tests where multiple items were grouped into testlets focused on a common stimulus. In the bi-factor model each item was treated as a function of a primary trait plus a nuisance trait due to the testlet; in the testlet-effects model the slopes in the direction of the testlet…
Descriptors: Item Response Theory, Reliability, Item Analysis, Factor Analysis
Wang, Wen-Chung; Wilson, Mark; Shih, Ching-Lin – Journal of Educational Measurement, 2006
This study presents the random-effects rating scale model (RE-RSM) which takes into account randomness in the thresholds over persons by treating them as random-effects and adding a random variable for each threshold in the rating scale model (RSM) (Andrich, 1978). The RE-RSM turns out to be a special case of the multidimensional random…
Descriptors: Item Analysis, Rating Scales, Item Response Theory, Monte Carlo Methods
Gierl, Mark J.; Leighton, Jacqueline P.; Tan, Xuan – Journal of Educational Measurement, 2006
DETECT, the acronym for Dimensionality Evaluation To Enumerate Contributing Traits, is an innovative and relatively new nonparametric dimensionality assessment procedure used to identify mutually exclusive, dimensionally homogeneous clusters of items using a genetic algorithm ( Zhang & Stout, 1999). Because the clusters of items are mutually…
Descriptors: Program Evaluation, Cluster Grouping, Evaluation Methods, Multivariate Analysis
Roussos, Louis A.; Ozbek, Ozlem Yesim – Journal of Educational Measurement, 2006
The development of the DETECT procedure marked an important advancement in nonparametric dimensionality analysis. DETECT is the first nonparametric technique to estimate the number of dimensions in a data set, estimate an effect size for multidimensionality, and identify which dimension is predominantly measured by each item. The efficacy of…
Descriptors: Evaluation Methods, Effect Size, Test Bias, Item Response Theory
Kim, Jee-Seon – Journal of Educational Measurement, 2006
Simulation and real data studies are used to investigate the value of modeling multiple-choice distractors on item response theory linking. Using the characteristic curve linking procedure for Bock's (1972) nominal response model presented by Kim and Hanson (2002), all-category linking (i.e., a linking based on all category characteristic curves…
Descriptors: Multiple Choice Tests, Test Items, Item Response Theory, Simulation
A Closer Look at Using Judgments of Item Difficulty to Change Answers on Computerized Adaptive Tests
Vispoel, Walter P.; Clough, Sara J.; Bleiler, Timothy – Journal of Educational Measurement, 2005
Recent studies have shown that restricting review and answer change opportunities on computerized adaptive tests (CATs) to items within successive blocks reduces time spent in review, satisfies most examinees' desires for review, and controls against distortion in proficiency estimates resulting from intentional incorrect answering of items prior…
Descriptors: Mathematics, Item Analysis, Adaptive Testing, Computer Assisted Testing
van der Linden, Wim J. – Journal of Educational Measurement, 2005
In test assembly, a fundamental difference exists between algorithms that select a test sequentially or simultaneously. Sequential assembly allows us to optimize an objective function at the examinee's ability estimate, such as the test information function in computerized adaptive testing. But it leads to the non-trivial problem of how to realize…
Descriptors: Law Schools, Item Analysis, Admission (School), Adaptive Testing
Chen, Shu-Ying; Ankenman, Robert D. – Journal of Educational Measurement, 2004
The purpose of this study was to compare the effects of four item selection rules--(1) Fisher information (F), (2) Fisher information with a posterior distribution (FP), (3) Kullback-Leibler information with a posterior distribution (KP), and (4) completely randomized item selection (RN)--with respect to the precision of trait estimation and the…
Descriptors: Test Length, Adaptive Testing, Computer Assisted Testing, Test Selection
Schulz, E. Matthew; Betebenner, Damian; Ahn, Meeyeon – Journal of Educational Measurement, 2004
Whether hierarchical logistic regression can reduce the sample size requirement for estimating optimal cutoff scores in a course placement service where predictive validity is measured by a threshold utility function is explored. Data from courses with varying class size were randomly partitioned into two halves per course. Nonhierarchical and…
Descriptors: Class Size, Sample Size, Cutting Scores, Predictive Validity
Meyer, J. Patrick; Huynh, Huynh; Seaman, Michael A. – Journal of Educational Measurement, 2004
Exact nonparametric procedures have been used to identify the level of differential item functioning (DIF) in binary items. This study explored the use of exact DIF procedures with items scored on a Likert scale. The results from an attitude survey suggest that the large-sample Cochran-Mantel-Haenszel (CMH) procedure identifies more items as…
Descriptors: Test Bias, Attitude Measures, Surveys, Predictive Validity