NotesFAQContact Us
Collection
Advanced
Search Tips
Source
Educational and Psychological…382
Audience
Laws, Policies, & Programs
No Child Left Behind Act 20011
What Works Clearinghouse Rating
Showing 1 to 15 of 382 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…
Descriptors: Test Bias, Methods, Test Items, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Ames, Allison J. – Educational and Psychological Measurement, 2022
Individual response style behaviors, unrelated to the latent trait of interest, may influence responses to ordinal survey items. Response style can introduce bias in the total score with respect to the trait of interest, threatening valid interpretation of scores. Despite claims of response style stability across scales, there has been little…
Descriptors: Response Style (Tests), Individual Differences, Scores, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022
Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…
Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Guastadisegni, Lucia; Cagnone, Silvia; Moustaki, Irini; Vasdekis, Vassilis – Educational and Psychological Measurement, 2022
This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test to detect measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approach,…
Descriptors: Measurement, Statistical Analysis, Item Response Theory, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Kang, Hyeon-Ah; Han, Suhwa; Kim, Doyoung; Kao, Shu-Chuan – Educational and Psychological Measurement, 2022
The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c)…
Descriptors: Goodness of Fit, Item Response Theory, Test Items, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Wind, Stefanie A. – Educational and Psychological Measurement, 2022
Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level…
Descriptors: Nonparametric Statistics, Item Response Theory, Sample Size, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Cooperman, Allison W.; Weiss, David J.; Wang, Chun – Educational and Psychological Measurement, 2022
Adaptive measurement of change (AMC) is a psychometric method for measuring intra-individual change on one or more latent traits across testing occasions. Three hypothesis tests--a Z test, likelihood ratio test, and score ratio index--have demonstrated desirable statistical properties in this context, including low false positive rates and high…
Descriptors: Error of Measurement, Psychometrics, Hypothesis Testing, Simulation
Peer reviewed Peer reviewed
Direct linkDirect link
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Nagy, Gabriel; Ulitzsch, Esther – Educational and Psychological Measurement, 2022
Disengaged item responses pose a threat to the validity of the results provided by large-scale assessments. Several procedures for identifying disengaged responses on the basis of observed response times have been suggested, and item response theory (IRT) models for response engagement have been proposed. We outline that response time-based…
Descriptors: Item Response Theory, Hierarchical Linear Modeling, Predictor Variables, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
Previous Page | Next Page ยป
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  26