NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 15 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Huang, Qi; Bolt, Daniel M. – Educational and Psychological Measurement, 2023
Previous studies have demonstrated evidence of latent skill continuity even in tests intentionally designed for measurement of binary skills. In addition, the assumption of binary skills when continuity is present has been shown to potentially create a lack of invariance in item and latent ability parameters that may undermine applications. In…
Descriptors: Item Response Theory, Test Items, Skill Development, Robustness (Statistics)
Peer reviewed Peer reviewed
Direct linkDirect link
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Peer reviewed Peer reviewed
Direct linkDirect link
Bogaert, Jasper; Loh, Wen Wei; Rosseel, Yves – Educational and Psychological Measurement, 2023
Factor score regression (FSR) is widely used as a convenient alternative to traditional structural equation modeling (SEM) for assessing structural relations between latent variables. But when latent variables are simply replaced by factor scores, biases in the structural parameter estimates often have to be corrected, due to the measurement error…
Descriptors: Factor Analysis, Regression (Statistics), Structural Equation Models, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Yan; Kim, Eunsook; Yi, Zhiyao – Educational and Psychological Measurement, 2022
Latent profile analysis (LPA) identifies heterogeneous subgroups based on continuous indicators that represent different dimensions. It is a common practice to measure each dimension using items, create composite or factor scores for each dimension, and use these scores as indicators of profiles in LPA. In this case, measurement models for…
Descriptors: Robustness (Statistics), Profiles, Statistical Analysis, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Manapat, Patrick D.; Edwards, Michael C. – Educational and Psychological Measurement, 2022
When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait ([theta]) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal [theta]. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed…
Descriptors: Robustness (Statistics), Computational Linguistics, Item Response Theory, Psychological Patterns
Peer reviewed Peer reviewed
Direct linkDirect link
Shin, Myungho; No, Unkyung; Hong, Sehee – Educational and Psychological Measurement, 2019
The present study aims to compare the robustness under various conditions of latent class analysis mixture modeling approaches that deal with auxiliary distal outcomes. Monte Carlo simulations were employed to test the performance of four approaches recommended by previous simulation studies: maximum likelihood (ML) assuming homoskedasticity…
Descriptors: Robustness (Statistics), Multivariate Analysis, Maximum Likelihood Statistics, Statistical Distributions
Peer reviewed Peer reviewed
Direct linkDirect link
Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2019
M-fluctuation tests are a recently proposed method for detecting differential item functioning in Rasch models. This article discusses a generalization of this method to two additional item response theory models: the two-parametric logistic model and the three-parametric logistic model with a common guessing parameter. The Type I error rate and…
Descriptors: Test Bias, Item Response Theory, Statistical Analysis, Maximum Likelihood Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Wilcox, Rand R.; Serang, Sarfaraz – Educational and Psychological Measurement, 2017
The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and "p" values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have…
Descriptors: Hypothesis Testing, Bayesian Statistics, Computation, Effect Size
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Yan; Rodríguez de Gil, Patricia; Chen, Yi-Hsin; Kromrey, Jeffrey D.; Kim, Eun Sook; Pham, Thanh; Nguyen, Diep; Romano, Jeanine L. – Educational and Psychological Measurement, 2017
Various tests to check the homogeneity of variance assumption have been proposed in the literature, yet there is no consensus as to their robustness when the assumption of normality does not hold. This simulation study evaluated the performance of 14 tests for the homogeneity of variance assumption in one-way ANOVA models in terms of Type I error…
Descriptors: Comparative Analysis, Statistical Analysis, Robustness (Statistics), Observation
Peer reviewed Peer reviewed
Direct linkDirect link
Zhang, Zhiyong; Yuan, Ke-Hai – Educational and Psychological Measurement, 2016
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation…
Descriptors: Computation, Statistical Analysis, Robustness (Statistics), Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Olvera Astivia, Oscar L.; Zumbo, Bruno D. – Educational and Psychological Measurement, 2015
To further understand the properties of data-generation algorithms for multivariate, nonnormal data, two Monte Carlo simulation studies comparing the Vale and Maurelli method and the Headrick fifth-order polynomial method were implemented. Combinations of skewness and kurtosis found in four published articles were run and attention was…
Descriptors: Data, Simulation, Monte Carlo Methods, Comparative Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Paulhus, Delroy L.; Dubois, Patrick J. – Educational and Psychological Measurement, 2014
The overclaiming technique is a novel assessment procedure that uses signal detection analysis to generate indices of knowledge accuracy (OC-accuracy) and self-enhancement (OC-bias). The technique has previously shown robustness over varied knowledge domains as well as low reactivity across administration contexts. Here we compared the OC-accuracy…
Descriptors: Educational Assessment, Knowledge Level, Accuracy, Cognitive Ability
Peer reviewed Peer reviewed
Direct linkDirect link
Keeley, Jared W.; English, Taylor; Irons, Jessica; Henslee, Amber M. – Educational and Psychological Measurement, 2013
Many measurement biases affect student evaluations of instruction (SEIs). However, two have been relatively understudied: halo effects and ceiling/floor effects. This study examined these effects in two ways. To examine the halo effect, using a videotaped lecture, we manipulated specific teacher behaviors to be "good" or "bad"…
Descriptors: Robustness (Statistics), Test Bias, Course Evaluation, Student Evaluation of Teacher Performance
Peer reviewed Peer reviewed
Direct linkDirect link
Magis, David; De Boeck, Paul – Educational and Psychological Measurement, 2012
The identification of differential item functioning (DIF) is often performed by means of statistical approaches that consider the raw scores as proxies for the ability trait level. One of the most popular approaches, the Mantel-Haenszel (MH) method, belongs to this category. However, replacing the ability level by the simple raw score is a source…
Descriptors: Test Bias, Data, Error of Measurement, Raw Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Fife, Dustin A.; Mendoza, Jorge L.; Terry, Robert – Educational and Psychological Measurement, 2012
Though much research and attention has been directed at assessing the correlation coefficient under range restriction, the assessment of reliability under range restriction has been largely ignored. This article uses item response theory to simulate dichotomous item-level data to assess the robustness of KR-20 ([alpha]), [omega], and test-retest…
Descriptors: Reliability, Computation, Comparative Analysis, Item Response Theory