NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 15 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Mangino, Anthony A.; Bolin, Jocelyn H.; Finch, W. Holmes – Educational and Psychological Measurement, 2023
This study seeks to compare fixed and mixed effects models for the purposes of predictive classification in the presence of multilevel data. The first part of the study utilizes a Monte Carlo simulation to compare fixed and mixed effects logistic regression and random forests. An applied examination of the prediction of student retention in the…
Descriptors: Prediction, Classification, Monte Carlo Methods, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Liu, Ivy; Suesse, Thomas; Harvey, Samuel; Gu, Peter Yongqi; Fernández, Daniel; Randal, John – Educational and Psychological Measurement, 2023
The Mantel-Haenszel estimator is one of the most popular techniques for measuring differential item functioning (DIF). A generalization of this estimator is applied to the context of DIF to compare items by taking the covariance of odds ratio estimators between dependent items into account. Unlike the Item Response Theory, the method does not rely…
Descriptors: Test Bias, Computation, Statistical Analysis, Achievement Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Xiao, Leifeng; Hau, Kit-Tai – Educational and Psychological Measurement, 2023
We examined the performance of coefficient alpha and its potential competitors (ordinal alpha, omega total, Revelle's omega total [omega RT], omega hierarchical [omega h], greatest lower bound [GLB], and coefficient "H") with continuous and discrete data having different types of non-normality. Results showed the estimation bias was…
Descriptors: Statistical Bias, Statistical Analysis, Likert Scales, Statistical Distributions
Peer reviewed Peer reviewed
Direct linkDirect link
Rios, Joseph A. – Educational and Psychological Measurement, 2021
Low test-taking effort as a validity threat is common when examinees perceive an assessment context to have minimal personal value. Prior research has shown that in such contexts, subgroups may differ in their effort, which raises two concerns when making subgroup mean comparisons. First, it is unclear how differential effort could influence…
Descriptors: Response Style (Tests), Statistical Analysis, Measurement, Comparative Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Huang, Hung-Yu – Educational and Psychological Measurement, 2020
In educational assessments and achievement tests, test developers and administrators commonly assume that test-takers attempt all test items with full effort and leave no blank responses with unplanned missing values. However, aberrant response behavior--such as performance decline, dropping out beyond a certain point, and skipping certain items…
Descriptors: Item Response Theory, Response Style (Tests), Test Items, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Liu, Yuan; Hau, Kit-Tai – Educational and Psychological Measurement, 2020
In large-scale low-stake assessment such as the Programme for International Student Assessment (PISA), students may skip items (missingness) which are within their ability to complete. The detection and taking care of these noneffortful responses, as a measure of test-taking motivation, is an important issue in modern psychometric models.…
Descriptors: Response Style (Tests), Motivation, Test Items, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Sachse, Karoline A.; Mahler, Nicole; Pohl, Steffi – Educational and Psychological Measurement, 2019
Mechanisms causing item nonresponses in large-scale assessments are often said to be nonignorable. Parameter estimates can be biased if nonignorable missing data mechanisms are not adequately modeled. In trend analyses, it is plausible for the missing data mechanism and the percentage of missing values to change over time. In this article, we…
Descriptors: International Assessment, Response Style (Tests), Achievement Tests, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Zehner, Fabian; Sälzer, Christine; Goldhammer, Frank – Educational and Psychological Measurement, 2016
Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the "Programme…
Descriptors: Educational Assessment, Coding, Automation, Responses
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu – Educational and Psychological Measurement, 2015
Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…
Descriptors: Item Response Theory, Test Format, Language Usage, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Okumura, Taichi – Educational and Psychological Measurement, 2014
This study examined the empirical differences between the tendency to omit items and reading ability by applying tree-based item response (IRTree) models to the Japanese data of the Programme for International Student Assessment (PISA) held in 2009. For this purpose, existing IRTree models were expanded to contain predictors and to handle…
Descriptors: Foreign Countries, Item Response Theory, Test Items, Reading Ability
Peer reviewed Peer reviewed
Direct linkDirect link
Segeritz, Micha; Pant, Hans Anand – Educational and Psychological Measurement, 2013
This article summarizes the key finding of a study that (a) tests the measurement invariance (MI) of the popular Students' Approaches to Learning instrument (Programme for International Student Assessment [PISA]) across ethnic/cultural groups within a country and (b) discusses implications for research focusing on the role of affective measures in…
Descriptors: Foreign Countries, Affective Measures, Immigrants, Ethnic Groups
Peer reviewed Peer reviewed
Direct linkDirect link
Tian, Wei; Cai, Li; Thissen, David; Xin, Tao – Educational and Psychological Measurement, 2013
In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…
Descriptors: Item Response Theory, Computation, Matrices, Statistical Inference
Peer reviewed Peer reviewed
Direct linkDirect link
Yang, Ji Seung; Hansen, Mark; Cai, Li – Educational and Psychological Measurement, 2012
Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…
Descriptors: Item Response Theory, Scores, Statistical Analysis, Comparative Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Frey, Andreas; Seitz, Nicki-Nils – Educational and Psychological Measurement, 2011
The usefulness of multidimensional adaptive testing (MAT) for the assessment of student literacy in the Programme for International Student Assessment (PISA) was examined within a real data simulation study. The responses of N = 14,624 students who participated in the PISA assessments of the years 2000, 2003, and 2006 in Germany were used to…
Descriptors: Adaptive Testing, Literacy, Academic Achievement, Achievement Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Bolt, Daniel M.; Newton, Joseph R. – Educational and Psychological Measurement, 2011
This article extends a methodological approach considered by Bolt and Johnson for the measurement and control of extreme response style (ERS) to the analysis of rating data from multiple scales. Specifically, it is shown how the simultaneous analysis of item responses across scales allows for more accurate identification of ERS, and more effective…
Descriptors: Response Style (Tests), Measurement, Item Response Theory, Accuracy