Publication Date
| In 2015 | 1 |
| Since 2014 | 3 |
| Since 2011 (last 5 years) | 10 |
| Since 2006 (last 10 years) | 29 |
| Since 1996 (last 20 years) | 34 |
Descriptor
| Item Analysis | 229 |
| Test Items | 67 |
| Test Construction | 53 |
| Test Reliability | 52 |
| Test Validity | 52 |
| Correlation | 38 |
| Factor Analysis | 37 |
| Higher Education | 31 |
| Statistical Analysis | 30 |
| Computer Programs | 27 |
Author
| Aiken, Lewis R. | 6 |
| Vegelius, Jan | 5 |
| Fiske, Donald W. | 4 |
| Plake, Barbara S. | 4 |
| Bart, William M. | 3 |
| Dawis, Rene V. | 3 |
| Harris, Deborah J. | 3 |
| Jackson, Douglas N. | 3 |
| Kolen, Michael J. | 3 |
| Krus, David J. | 3 |
Audience
| Practitioners | 1 |
Showing 1 to 15 of 229 results
Kopf, Julia; Zeileis, Achim; Strobl, Carolin – Educational and Psychological Measurement, 2015
Differential item functioning (DIF) indicates the violation of the invariance assumption, for instance, in models based on item response theory (IRT). For item-wise DIF analysis using IRT, a common metric for the item parameters of the groups that are to be compared (e.g., for the reference and the focal group) is necessary. In the Rasch model,…
Descriptors: Test Items, Equated Scores, Test Bias, Item Response Theory
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
Jones, W. Paul – Educational and Psychological Measurement, 2014
A study in a university clinic/laboratory investigated adaptive Bayesian scaling as a supplement to interpretation of scores on the Mini-IPIP. A "probability of belonging" in categories of low, medium, or high on each of the Big Five traits was calculated after each item response and continued until all items had been used or until a…
Descriptors: Personality Traits, Personality Measures, Bayesian Statistics, Clinics
Keeley, Jared W.; English, Taylor; Irons, Jessica; Henslee, Amber M. – Educational and Psychological Measurement, 2013
Many measurement biases affect student evaluations of instruction (SEIs). However, two have been relatively understudied: halo effects and ceiling/floor effects. This study examined these effects in two ways. To examine the halo effect, using a videotaped lecture, we manipulated specific teacher behaviors to be "good" or "bad"…
Descriptors: Robustness (Statistics), Test Bias, Course Evaluation, Student Evaluation of Teacher Performance
Albano, Anthony D.; Rodriguez, Michael C. – Educational and Psychological Measurement, 2013
Although a substantial amount of research has been conducted on differential item functioning in testing, studies have focused on detecting differential item functioning rather than on explaining how or why it may occur. Some recent work has explored sources of differential functioning using explanatory and multilevel item response models. This…
Descriptors: Test Bias, Hierarchical Linear Modeling, Gender Differences, Educational Opportunities
Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D. – Educational and Psychological Measurement, 2013
The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R² and…
Descriptors: Item Analysis, Test Items, Effect Size, Statistical Analysis
Kobrin, Jennifer L.; Kim, YoungKoung; Sackett, Paul R. – Educational and Psychological Measurement, 2012
There is much debate on the merits and pitfalls of standardized tests for college admission, with questions regarding the format (multiple-choice vs. constructed response), cognitive complexity, and content of these assessments (achievement vs. aptitude) at the forefront of the discussion. This study addressed these questions by investigating the…
Descriptors: Grade Point Average, Standardized Tests, Predictive Validity, Predictor Variables
Moyer, Eric L.; Galindo, Jennifer L.; Dodd, Barbara G. – Educational and Psychological Measurement, 2012
Managing test specifications--both multiple nonstatistical constraints and flexibly defined constraints--has become an important part of designing item selection procedures for computerized adaptive tests (CATs) in achievement testing. This study compared the effectiveness of three procedures: constrained CAT, flexible modified constrained CAT,…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Item Analysis
Stone, Gregory Ethan; Koskey, Kristin L. K.; Sondergeld, Toni A. – Educational and Psychological Measurement, 2011
Typical validation studies on standard setting models, most notably the Angoff and modified Angoff models, have ignored construct development, a critical aspect associated with all conceptualizations of measurement processes. Stone compared the Angoff and objective standard setting (OSS) models and found that Angoff failed to define a legitimate…
Descriptors: Cutting Scores, Standard Setting (Scoring), Models, Construct Validity
Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011
The one-parameter logistic model with ability-based guessing (1PL-AG) was recently developed to account for the effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…
Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability
Leite, Walter L.; Svinicki, Marilla; Shi, Yuying – Educational and Psychological Measurement, 2010
The authors examined the dimensionality of the VARK learning styles inventory. The VARK measures four perceptual preferences: visual (V), aural (A), read/write (R), and kinesthetic (K). VARK questions can be viewed as testlets because respondents can select multiple items within a question. The correlations between items within testlets are a type…
Descriptors: Multitrait Multimethod Techniques, Construct Validity, Reliability, Factor Analysis
Hull, Darrell M.; Beaujean, A. Alexander; Worrell, Frank C.; Verdisco, Aimee E. – Educational and Psychological Measurement, 2010
The NEO Five-Factor Inventory (NEO-FFI) is often used in field-based research and clinical studies as it is designed to measure the same personality dimensions as the longer NEO Personality Inventory in a shorter time frame. In this study, the authors examined the reliability and structural validity of the NEO-FFI scores at the item level in a…
Descriptors: Personality Traits, Comparative Analysis, Item Analysis, Validity
Cheng, Ying – Educational and Psychological Measurement, 2010
This article proposes a new item selection method, namely, the modified maximum global discrimination index (MMGDI) method, for cognitive diagnostic computerized adaptive testing (CD-CAT). The new method captures two aspects of the appeal of an item: (a) the amount of contribution it can make toward adequate coverage of every attribute and (b) the…
Descriptors: Cognitive Tests, Diagnostic Tests, Computer Assisted Testing, Adaptive Testing
Van Eck, Kathryn; Finney, Sara J.; Evans, Steven W. – Educational and Psychological Measurement, 2010
The Disruptive Behavior Disorders (DBD) scale includes the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.) criteria for attention deficit hyperactivity disorder (ADHD), oppositional defiant disorder, and conduct disorder. This study examined only the ADHD items of the DBD scale. This scale is frequently used for assessing parent-…
Descriptors: Mental Disorders, Evaluation Criteria, Behavior Disorders, Attention Deficit Hyperactivity Disorder
Yoo, Jin Eun – Educational and Psychological Measurement, 2009
This Monte Carlo study investigates the beneficial effect of including auxiliary variables during estimation of confirmatory factor analysis models with multiple imputation. Specifically, it examines the influence of sample size, missing rates, missingness mechanism combinations, missingness types (linear or convex), and the absence or presence…
Descriptors: Monte Carlo Methods, Research Methodology, Test Validity, Factor Analysis

