Publication Date
In 2024 | 3 |
Since 2023 | 22 |
Since 2020 (last 5 years) | 62 |
Since 2015 (last 10 years) | 125 |
Since 2005 (last 20 years) | 234 |
Descriptor
Test Items | 406 |
Item Response Theory | 150 |
Item Analysis | 93 |
Difficulty Level | 87 |
Statistical Analysis | 77 |
Test Construction | 77 |
Test Bias | 67 |
Models | 64 |
Simulation | 61 |
Comparative Analysis | 59 |
Computation | 53 |
More ▼ |
Source
Educational and Psychological… | 406 |
Author
Wang, Wen-Chung | 11 |
Dimitrov, Dimiter M. | 9 |
Plake, Barbara S. | 8 |
Raykov, Tenko | 8 |
Zumbo, Bruno D. | 8 |
Wilson, Mark | 7 |
Huggins-Manley, Anne Corinne | 6 |
Marcoulides, George A. | 6 |
Paek, Insu | 5 |
Aiken, Lewis R. | 4 |
Cheng, Ying | 4 |
More ▼ |
Publication Type
Education Level
Audience
Location
Germany | 8 |
Canada | 6 |
Australia | 5 |
Saudi Arabia | 5 |
United States | 5 |
South Korea | 4 |
Taiwan | 4 |
Florida | 3 |
Hong Kong | 3 |
Israel | 3 |
Netherlands | 3 |
More ▼ |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Dimitrov, Dimiter M.; Luo, Yong – Educational and Psychological Measurement, 2019
An approach to scoring tests with binary items, referred to as D-scoring method, was previously developed as a classical analog to basic models in item response theory (IRT) for binary items. As some tests include polytomous items, this study offers an approach to D-scoring of such items and parallels the results with those obtained under the…
Descriptors: Scoring, Test Items, Item Response Theory, Psychometrics
DiStefano, Christine; McDaniel, Heather L.; Zhang, Liyun; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2019
A simulation study was conducted to investigate the model size effect when confirmatory factor analysis (CFA) models include many ordinal items. CFA models including between 15 and 120 ordinal items were analyzed with mean- and variance-adjusted weighted least squares to determine how varying sample size, number of ordered categories, and…
Descriptors: Factor Analysis, Effect Size, Data, Sample Size
Cetin-Berber, Dee Duygu; Sari, Halil Ibrahim; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019
Routing examinees to modules based on their ability level is a very important aspect in computerized adaptive multistage testing. However, the presence of missing responses may complicate estimation of examinee ability, which may result in misrouting of individuals. Therefore, missing responses should be handled carefully. This study investigated…
Descriptors: Computer Assisted Testing, Adaptive Testing, Error of Measurement, Research Problems
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2019
This note discusses the merits of coefficient alpha and their conditions in light of recent critical publications that miss out on significant research findings over the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain empirical circumstances. The article highlights…
Descriptors: Test Validity, Test Reliability, Test Items, Correlation
List, Marit Kristine; Köller, Olaf; Nagy, Gabriel – Educational and Psychological Measurement, 2019
Tests administered in studies of student achievement often have a certain amount of not-reached items (NRIs). The propensity for NRIs may depend on the proficiency measured by the test and on additional covariates. This article proposes a semiparametric model to study such relationships. Our model extends Glas and Pimentel's item response theory…
Descriptors: Educational Assessment, Item Response Theory, Multivariate Analysis, Test Items
Raykov, Tenko; Marcoulides, George A.; Dimitrov, Dimiter M.; Li, Tatyana – Educational and Psychological Measurement, 2018
This article extends the procedure outlined in the article by Raykov, Marcoulides, and Tong for testing congruence of latent constructs to the setting of binary items and clustering effects. In this widely used setting in contemporary educational and psychological research, the method can be used to examine if two or more homogeneous…
Descriptors: Tests, Psychometrics, Test Items, Construct Validity
Is the Factor Observed in Investigations on the Item-Position Effect Actually the Difficulty Factor?
Schweizer, Karl; Troche, Stefan – Educational and Psychological Measurement, 2018
In confirmatory factor analysis quite similar models of measurement serve the detection of the difficulty factor and the factor due to the item-position effect. The item-position effect refers to the increasing dependency among the responses to successively presented items of a test whereas the difficulty factor is ascribed to the wide range of…
Descriptors: Investigations, Difficulty Level, Factor Analysis, Models
Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Descriptors: Item Response Theory, Test Reliability, Test Items, Scores
Wang, Yan; Kim, Eun Sook; Dedrick, Robert F.; Ferron, John M.; Tan, Tony – Educational and Psychological Measurement, 2018
Wording effects associated with positively and negatively worded items have been found in many scales. Such effects may threaten construct validity and introduce systematic bias in the interpretation of results. A variety of models have been applied to address wording effects, such as the correlated uniqueness model and the correlated traits and…
Descriptors: Test Items, Test Format, Correlation, Construct Validity
Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018
Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method [lambda][subscript 6], and method CA. The item-score…
Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis
Luo, Yong – Educational and Psychological Measurement, 2018
Mplus is a powerful latent variable modeling software program that has become an increasingly popular choice for fitting complex item response theory models. In this short note, we demonstrate that the two-parameter logistic testlet model can be estimated as a constrained bifactor model in Mplus with three estimators encompassing limited- and…
Descriptors: Computer Software, Models, Statistical Analysis, Computation
Liu, Ren – Educational and Psychological Measurement, 2018
Attribute structure is an explicit way of presenting the relationship between attributes in diagnostic measurement. The specification of attribute structures directly affects the classification accuracy resulted from psychometric modeling. This study provides a conceptual framework for understanding misspecifications of attribute structures. Under…
Descriptors: Diagnostic Tests, Classification, Test Construction, Relationship
Lin, Yin; Brown, Anna – Educational and Psychological Measurement, 2017
A fundamental assumption in computerized adaptive testing is that item parameters are invariant with respect to context--items surrounding the administered item. This assumption, however, may not hold in forced-choice (FC) assessments, where explicit comparisons are made between items included in the same block. We empirically examined the…
Descriptors: Personality Measures, Measurement Techniques, Context Effect, Test Items
Lee, Soo; Bulut, Okan; Suh, Youngsuk – Educational and Psychological Measurement, 2017
A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory…
Descriptors: Test Bias, Test Items, Models, Item Response Theory
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement