50 Years of ERIC
The Education Resources Information Center (ERIC) is celebrating its 50th birthday! First opened on May 15, 1964, ERIC continues a long tradition of ongoing innovation and enhancement.

Learn more about the history of ERIC (PDF).

Showing 151 to 165 of 1,152 results
Peer reviewed | Direct link
Holland, Paul W.; Sinharay, Sandip; von Davier, Alina A.; Han, Ning – Journal of Educational Measurement, 2008
Two important types of observed score equating (OSE) methods for the non-equivalent groups with anchor test (NEAT) design are chain equating (CE) and post-stratification equating (PSE). CE and PSE reflect two distinctly different ways of using the information provided by the anchor test for computing OSE functions. Both types of methods include…
Descriptors: Equated Scores, Prediction, Comparative Analysis
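The chain-equating idea in the abstract above can be sketched in a few lines: link form X to the anchor A in one population, then the anchor A to form Y in the other, and compose the two links. This is a minimal sketch using linear equating functions and hypothetical summary statistics (none of the numbers come from the article):

```python
def linear_equate(x, mean_from, sd_from, mean_to, sd_to):
    """Linear equating: map a score onto a new scale by matching
    means and standard deviations."""
    return mean_to + (sd_to / sd_from) * (x - mean_from)

# Hypothetical summary statistics (illustration only):
# form X and anchor A are observed in population P;
# anchor A and form Y are observed in population Q.
mean_x_p, sd_x_p = 50.0, 10.0   # form X in population P
mean_a_p, sd_a_p = 20.0, 4.0    # anchor A in population P
mean_a_q, sd_a_q = 22.0, 5.0    # anchor A in population Q
mean_y_q, sd_y_q = 55.0, 11.0   # form Y in population Q

def chain_equate(x):
    """Chain equating (CE): X -> A in population P, then
    A -> Y in population Q, composed into one function."""
    a = linear_equate(x, mean_x_p, sd_x_p, mean_a_p, sd_a_p)
    return linear_equate(a, mean_a_q, sd_a_q, mean_y_q, sd_y_q)

print(chain_equate(50.0))  # an X score of 50 mapped onto the Y scale
```

PSE, by contrast, uses the anchor to re-weight each group's score distribution toward a common synthetic population before equating, so the two methods use the same anchor information in different ways.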
Peer reviewed | Direct link
Sinharay, Sandip; Lu, Ying – Journal of Educational Measurement, 2008
Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item…
Descriptors: Correlation, Statistics, Item Response Theory
Peer reviewed | Direct link
Gierl, Mark J.; Zheng, Yinggan; Cui, Ying – Journal of Educational Measurement, 2008
The purpose of this study is to describe how the attribute hierarchy method (AHM) can be used to evaluate differential group performance at the cognitive attribute level. The AHM is a psychometric method for classifying examinees' test item responses into a set of attribute-mastery patterns associated with different components in a cognitive model…
Descriptors: Test Items, Student Reaction, Pattern Recognition, Psychometrics
Peer reviewed | Direct link
Van Nijlen, Daniel; Janssen, Rianne – Journal of Educational Measurement, 2008
Essential for the validity of the judgments in a standard-setting study is that they follow the implicit task assumptions. In the Angoff method, judgments are assumed to be inversely related to the difficulty of the items; contrasting-groups judgments are assumed to be positively related to the ability of the students. In the present study,…
Descriptors: Standard Setting (Scoring), Validity, Regression (Statistics)
Peer reviewed | Direct link
Penfield, Randall D. – Journal of Educational Measurement, 2008
Investigations of differential distractor functioning (DDF) can provide valuable information concerning the location and possible causes of measurement invariance within a multiple-choice item. In this article, I propose an odds ratio estimator of the DDF effect as modeled under the nominal response model. In addition, I propose a simultaneous…
Descriptors: Test Items, Investigations, Simulation
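The odds ratio idea behind DDF analysis can be illustrated with a simple 2×2 table: among examinees who missed the item, compare the odds of choosing a particular distractor across the reference and focal groups. This is only a hedged sketch of the basic sample odds ratio, not Penfield's nominal-response-model estimator, and all counts are hypothetical:

```python
import math

# Hypothetical counts (illustration only): among examinees who
# answered the item incorrectly, how many chose distractor "B"
# versus any other distractor, by group.
ref_chose, ref_other = 40, 160   # reference group
foc_chose, foc_other = 60, 140   # focal group

# Sample odds ratio comparing distractor selection across groups;
# values below 1 mean the focal group favored this distractor.
odds_ratio = (ref_chose * foc_other) / (ref_other * foc_chose)

# Log odds ratio and its approximate standard error.
log_or = math.log(odds_ratio)
se = math.sqrt(1/ref_chose + 1/ref_other + 1/foc_chose + 1/foc_other)
print(odds_ratio, log_or, se)
```

A model-based estimator like the one proposed in the article conditions on ability rather than pooling all examinees, but the interpretation of the resulting odds ratio is the same.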
Peer reviewed | Direct link
Finch, Holmes – Journal of Educational Measurement, 2008
Missing data are a common problem in a variety of measurement settings, including responses to items on both cognitive and affective assessments. Researchers have shown that such missing data may create problems in the estimation of item difficulty parameters in the Item Response Theory (IRT) context, particularly if they are ignored. At the same…
Descriptors: Simulation, Item Response Theory, Researchers, Computation
Peer reviewed | Direct link
Pommerich, Mary; Segall, Daniel O. – Journal of Educational Measurement, 2008
The accuracy of CAT scores can be negatively affected by local dependence if the CAT utilizes parameters that are misspecified due to the presence of local dependence and/or fails to control for local dependence in responses during the administration stage. This article evaluates the existence and effect of local dependence in a test of…
Descriptors: Simulation, Computer Assisted Testing, Mathematics Tests, Scores
Peer reviewed | Direct link
Klockars, Alan J.; Lee, Yoonsun – Journal of Educational Measurement, 2008
Monte Carlo simulations with 20,000 replications are reported to estimate the probability of rejecting the null hypothesis regarding DIF using SIBTEST when there is DIF present and/or when impact is present due to differences on the primary dimension to be measured. Sample sizes are varied from 250 to 2000 and test lengths from 10 to 40 items.…
Descriptors: Test Bias, Test Length, Reference Groups, Probability
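The Monte Carlo design in the abstract above follows a standard recipe: simulate data under known conditions many times, run the hypothesis test on each replication, and take the proportion of rejections as the estimated Type I error rate (no DIF) or power (DIF present). A minimal sketch of that recipe, using a simple two-sample z-test in place of SIBTEST and invented settings:

```python
import math
import random
import statistics

random.seed(1)

def reject_null(n_ref, n_foc, effect, crit=1.96):
    """One replication (illustration only, not SIBTEST): draw a
    reference and a focal sample, run a two-sample z-test on the
    means, and report whether equality of means is rejected."""
    ref = [random.gauss(0.0, 1.0) for _ in range(n_ref)]
    foc = [random.gauss(effect, 1.0) for _ in range(n_foc)]
    se = math.sqrt(statistics.variance(ref) / n_ref
                   + statistics.variance(foc) / n_foc)
    z = (statistics.mean(foc) - statistics.mean(ref)) / se
    return abs(z) > crit  # two-sided test at alpha = .05

# With effect = 0 the rejection rate estimates the Type I error
# rate; with effect > 0 it estimates power. The study used 20,000
# replications; fewer are used here to keep the sketch quick.
reps = 2000
rate = sum(reject_null(250, 250, effect=0.0) for _ in range(reps)) / reps
print(rate)
```

In the actual study the replications vary sample size (250 to 2,000) and test length (10 to 40 items), and the statistic of interest is SIBTEST's DIF test rather than a z-test on means.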
Peer reviewed | Direct link
Davis, Susan L.; Buckendahl, Chad W.; Plake, Barbara S. – Journal of Educational Measurement, 2008
As an alternative to adaptation, tests may also be developed simultaneously in multiple languages. Although the items on such tests could vary substantially, scores from these tests may be used to make the same types of decisions about different groups of examinees. The ability to make such decisions is contingent upon setting performance…
Descriptors: Test Results, Testing Programs, Multilingualism, Standard Setting
Peer reviewed | Direct link
Liu, Jinghua; Low, Albert C. – Journal of Educational Measurement, 2008
This study applied kernel equating (KE) in two scenarios: equating to a very similar population and equating to a very different population, referred to as a distant population, using SAT® data. The KE results were compared to the results obtained from analogous traditional equating methods in both scenarios. The results indicate that KE results…
Descriptors: Equated Scores, Population Groups, Differences, Comparative Analysis
Peer reviewed | Direct link
de la Torre, Jimmy – Journal of Educational Measurement, 2008
Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…
Descriptors: Matrices, Validity, Models, Evaluation Methods
Peer reviewed | Direct link
Kang, Taehoon; Chen, Troy T. – Journal of Educational Measurement, 2008
Orlando and Thissen's S-X² item fit index has performed better than traditional item fit statistics such as Yen's Q₁ and McKinley and Mills's G² for dichotomous item response theory (IRT) models. This study extends the utility of S-X² to polytomous IRT models, including the generalized partial…
Descriptors: Item Response Theory, Models, Rating Scales, Generalization
Peer reviewed | Direct link
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…
Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
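The synthetic function described in the abstract above has a very simple form: a weighted average of a traditional equating function and the identity function. A minimal sketch, with a hypothetical chained linear equating function standing in for the one fitted in the study (all constants are invented for illustration):

```python
def synthetic_link(x, equate, w):
    """Synthetic function: a weighted average of a traditional
    equating function and the identity function. With small
    samples, pulling toward the identity trades some bias for a
    reduction in sampling error."""
    return w * equate(x) + (1.0 - w) * x

def chained_linear(x):
    """Hypothetical chained linear equating X -> anchor -> Y
    (illustration only; constants are made up)."""
    a = 20.0 + (4.0 / 10.0) * (x - 50.0)     # X -> anchor A (pop. P)
    return 55.0 + (11.0 / 5.0) * (a - 22.0)  # anchor A -> Y (pop. Q)

# w = 1 recovers the traditional equating; w = 0 is the identity;
# intermediate weights blend the two.
print(synthetic_link(50.0, chained_linear, 1.0))
print(synthetic_link(50.0, chained_linear, 0.0))
print(synthetic_link(50.0, chained_linear, 0.5))
```

The practical question the article addresses is how to choose the weight w as a function of sample size, since smaller samples make the fitted equating function less trustworthy.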
Peer reviewed | Direct link
Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008
A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…
Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level
Peer reviewed | Direct link
Kim, Seonghoon; Feldt, Leonard S. – Journal of Educational Measurement, 2008
This article extends the Bonett (2003a) approach to testing the equality of alpha coefficients from two independent samples to the case of m ≥ 2 independent samples. The extended Fisher-Bonett test and its competitor, the Hakstian-Whalen (1976) test, are illustrated with numerical examples of both hypothesis testing and power…
Descriptors: Tests, Comparative Analysis, Hypothesis Testing, Error of Measurement
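The quantity being compared across samples in the abstract above is Cronbach's coefficient alpha. Before any equality test can be run, alpha must be computed per sample; a minimal sketch of that computation with hypothetical item scores (this shows only the alpha calculation, not the Fisher-Bonett or Hakstian-Whalen tests themselves):

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's coefficient alpha from item-level scores.
    `items` is a list of columns: one list of scores per item,
    with scores in the same examinee order across items."""
    k = len(items)
    item_var_sum = sum(statistics.variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-examinee total
    total_var = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 0/1 item scores for six examinees on three items
# (illustration only).
items = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
]
alpha = cronbach_alpha(items)
print(alpha)
```

The Fisher-Bonett approach then transforms each sample's alpha onto a scale where an approximate chi-square test of equality across the m samples can be formed.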