Showing 1 to 15 of 59 results
Guo, Hongwen; Puhan, Gautam – Journal of Educational Measurement, 2014
In this article, we introduce a section preequating (SPE) method (linear and nonlinear) under the randomly equivalent groups design. In this equating design, sections of Test X (a future new form) and another existing Test Y (an old form already on scale) are administered. The sections of Test X are equated to Test Y, after adjusting for the…
Descriptors: Equated Scores, Correlation, Simulation, Testing
Kim, Seonghoon – Journal of Educational Measurement, 2013
With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
Descriptors: Item Response Theory, Scores, Computation, Mathematics
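The recursion attributed to Lord and Wingersky in the abstract above can be sketched as follows. This is a minimal illustration of the classic dichotomous case only, not the generalized real-number-score algorithm the article presents. The function name `lord_wingersky` and the example probabilities are assumptions made for illustration; the per-item probabilities of a correct response at a fixed proficiency are taken as given.

```python
def lord_wingersky(p_correct):
    """Conditional distribution of the number-correct score at one proficiency.

    p_correct: probabilities of a correct response to each dichotomous item,
    all evaluated at the same proficiency value (assumed already computed
    from known IRT item parameters).
    Returns a list whose entry x is P(score = x | proficiency).
    """
    dist = [1.0]  # with zero items, the score is 0 with probability 1
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)  # item answered incorrectly
            new[score + 1] += prob * p      # item answered correctly
        dist = new
    return dist


# Hypothetical example: three items with assumed success probabilities.
example = lord_wingersky([0.9, 0.6, 0.3])
```

Each pass adds one item, so the cost grows with the square of the test length rather than with the number of response patterns.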
Paek, Insu – Journal of Educational Measurement, 2012
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Descriptors: Test Bias, Tests, Maximum Likelihood Statistics, Statistical Analysis
Mislevy, Robert J.; Zwick, Rebecca – Journal of Educational Measurement, 2012
A new entry in the testing lexicon is through-course summative assessment, a system consisting of components administered periodically during the academic year. As defined in the Race to the Top program, these assessments are intended to yield a yearly summative score for accountability purposes. They must provide for both individual and group…
Descriptors: National Competency Tests, Inferences, Item Response Theory, Summative Evaluation
Kane, Michael – Journal of Educational Measurement, 2011
Errors don't exist in our data, but they serve a vital function. Reality is complicated, but our models need to be simple in order to be manageable. We assume that attributes are invariant over some conditions of observation, and once we do that we need some way of accounting for the variability in observed scores over these conditions of…
Descriptors: Error of Measurement, Scores, Test Interpretation, Testing
Baldwin, Peter – Journal of Educational Measurement, 2011
Growing interest in fully Bayesian item response models begs the question: To what extent can model parameter posterior draws enhance existing practices? One practice that has traditionally relied on model parameter point estimates but may be improved by using posterior draws is the development of a common metric for two independently calibrated…
Descriptors: Item Response Theory, Bayesian Statistics, Computation, Sampling
van der Linden, Wim J. – Journal of Educational Measurement, 2011
A critical component of test speededness is the distribution of the test taker's total time on the test. A simple set of constraints on the item parameters in the lognormal model for response times is derived that can be used to control the distribution when assembling a new test form. As the constraints are linear in the item parameters, they can…
Descriptors: Test Format, Reaction Time, Test Construction
van der Linden, Wim J.; Diao, Qi – Journal of Educational Measurement, 2011
In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
Descriptors: Test Items, Test Format, Test Construction, Item Banks
Sinharay, Sandip; Haberman, Shelby J. – Journal of Educational Measurement, 2011
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008b) suggested reporting an augmented subscore that is a linear combination of a subscore and the total score. Sinharay and Haberman (2008) and Sinharay (2010) showed that augmented subscores often lead to more accurate…
Descriptors: Diagnostic Tests, Psychometrics, Testing, Equated Scores
Frederickx, Sofie; Tuerlinckx, Francis; De Boeck, Paul; Magis, David – Journal of Educational Measurement, 2010
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties such that the…
Descriptors: Test Bias, Models, Test Items, Difficulty Level
de la Torre, Jimmy; Lee, Young-Sun – Journal of Educational Measurement, 2010
Cognitive diagnosis models (CDMs), as alternative approaches to unidimensional item response models, have received increasing attention in recent years. CDMs are developed for the purpose of identifying the mastery or nonmastery of multiple fine-grained attributes or skills required for solving problems in a domain. For CDMs to receive wider use,…
Descriptors: Ability Grouping, Item Response Theory, Models, Problem Solving
Lee, Won-Chan – Journal of Educational Measurement, 2010
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Descriptors: Classification, Item Response Theory, Comparative Analysis, Models
van der Linden, Wim J. – Journal of Educational Measurement, 2010
Although response times on test items are recorded on a natural scale, the scale for some of the parameters in the lognormal response-time model (van der Linden, 2006) is not fixed. As a result, when the model is used to periodically calibrate new items in a testing program, the parameters are not automatically mapped onto a common scale. Several…
Descriptors: Test Items, Testing Programs, Measures (Individuals), Item Response Theory
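As a hedged illustration of why a scale can be unfixed in such a model, the following sketch uses one common parameterization of the lognormal response-time model. The symbols are assumptions not stated in the abstract: $T_{ij}$ is the time of test taker $j$ on item $i$, $\tau_j$ the test taker's speed, $\beta_i$ the item's time intensity, and $\alpha_i$ its discrimination. The abstract does not specify which parameters are affected, so this shows only one well-known indeterminacy of this kind.

```latex
\ln T_{ij} \sim N\!\left(\beta_i - \tau_j,\; \alpha_i^{-2}\right),
\qquad
\beta_i - \tau_j = (\beta_i + c) - (\tau_j + c) \quad \text{for any constant } c .
```

Because adding the same constant to every $\beta_i$ and every $\tau_j$ leaves the distribution unchanged, separate calibrations can place these parameters on different origins unless an identifying restriction or a linking step is applied.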
Finkelman, Matthew; Kim, Wonsuk; Roussos, Louis A. – Journal of Educational Measurement, 2009
Much recent psychometric literature has focused on cognitive diagnosis models (CDMs), a promising class of instruments used to measure the strengths and weaknesses of examinees. This article introduces a genetic algorithm to perform automated test assembly alongside CDMs. The algorithm is flexible in that it can be applied whether the goal is to…
Descriptors: Identification, Genetics, Test Construction, Mathematics
Livingston, Samuel A.; Kim, Sooyeon – Journal of Educational Measurement, 2009
This article suggests a method for estimating a test-score equating relationship from small samples of test takers. The method does not require the estimated equating transformation to be linear. Instead, it constrains the estimated equating curve to pass through two pre-specified end points and a middle point determined from the data. In a…
Descriptors: Measurement, Measurement Techniques, Psychometrics, Sample Size

