Showing 1 to 15 of 55 results
Peer reviewed
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy
Peer reviewed
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2017
The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…
Descriptors: Error of Measurement, Factor Analysis, Research Methodology, Psychometrics
Peer reviewed
Devlieger, Ines; Mayer, Axel; Rosseel, Yves – Educational and Psychological Measurement, 2016
In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and…
Descriptors: Regression (Statistics), Comparative Analysis, Structural Equation Models, Monte Carlo Methods
Peer reviewed
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
Descriptors: Test Theory, Item Response Theory, Models, Correlation
Peer reviewed
Liu, Yang; Maydeu-Olivares, Alberto – Educational and Psychological Measurement, 2013
Local dependence (LD) for binary IRT models can be diagnosed using Chen and Thissen's bivariate $X^2$ statistic and the score test statistics proposed by Glas and Suarez-Falcon, and Liu and Thissen. Alternatively, LD can be assessed using general purpose statistics such as bivariate residuals or Maydeu-Olivares and Joe's $M_r$…
Descriptors: Item Response Theory, Statistical Analysis, Models, Goodness of Fit
Peer reviewed
Yang, Ji Seung; Hansen, Mark; Cai, Li – Educational and Psychological Measurement, 2012
Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…
Descriptors: Item Response Theory, Scores, Statistical Analysis, Comparative Analysis
Peer reviewed
Schmitt, Thomas A.; Sass, Daniel A. – Educational and Psychological Measurement, 2011
Exploratory factor analysis (EFA) has long been used in the social sciences to depict the relationships between variables/items and latent traits. Researchers face many choices when using EFA, including the choice of rotation criterion, which can be difficult given that few research articles have discussed and/or demonstrated their differences.…
Descriptors: Hypothesis Testing, Factor Analysis, Correlation, Criteria
Peer reviewed
Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010
A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…
Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)
Peer reviewed
Carvajal, Jorge; Skorupski, William P. – Educational and Psychological Measurement, 2010
This study is an evaluation of the behavior of the Liu-Agresti estimator of the cumulative common odds ratio when identifying differential item functioning (DIF) with polytomously scored test items using small samples. The Liu-Agresti estimator has been proposed by Penfield and Algina as a promising approach for the study of polytomous DIF but no…
Descriptors: Test Bias, Sample Size, Test Items, Computation
Peer reviewed
DeMars, Christine E. – Educational and Psychological Measurement, 2010
In this brief explication, two challenges for using differential item functioning (DIF) measures when there are large group differences in true proficiency are illustrated. Each of these difficulties may lead to inflated Type I error rates, for very different reasons. One problem is that groups matched on observed score are not necessarily well…
Descriptors: Test Bias, Error of Measurement, Regression (Statistics), Scores
Peer reviewed
Yin, Ping; Sconing, James – Educational and Psychological Measurement, 2008
Standard-setting methods are widely used to determine cut scores on a test that examinees must meet for a certain performance standard. Because standard setting is a measurement procedure, it is important to evaluate variability of cut scores resulting from the standard-setting process. Generalizability theory is used in this study to estimate…
Descriptors: Generalizability Theory, Standard Setting, Cutting Scores, Test Items
Peer reviewed
MacCann, Robert G. – Educational and Psychological Measurement, 2008
It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely…
Descriptors: True Scores, Cutting Scores, Minimum Competencies, Scores
Peer reviewed
Tong, Ye; Brennan, Robert L. – Educational and Psychological Measurement, 2007
Estimating standard errors of estimated variance components has long been a challenging task in generalizability theory. Researchers have speculated about the potential applicability of the bootstrap for obtaining such estimates, but they have identified problems (especially bias) in using the bootstrap. Using Brennan's bias-correcting procedures…
Descriptors: Error of Measurement, Generalizability Theory, Computation, Simulation
Peer reviewed
Brennan, Robert L. – Educational and Psychological Measurement, 2007
This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…
Descriptors: Generalizability Theory, Error of Measurement, Statistical Analysis
Peer reviewed
Zimmerman, Donald W. – Educational and Psychological Measurement, 2007
Properties of the Spearman correction for attenuation were investigated using Monte Carlo methods, under conditions where correlations between error scores exist as a population parameter and also where correlated errors arise by chance in random sampling. Equations allowing for all possible dependence among true and error scores on two tests at…
Descriptors: Monte Carlo Methods, Correlation, Sampling, Data Analysis