Publication Date
| Value | Count |
| --- | --- |
| In 2015 | 8 |
| Since 2014 | 55 |
| Since 2011 (last 5 years) | 206 |
Descriptor
| Value | Count |
| --- | --- |
| Item Response Theory | 78 |
| Statistical Analysis | 72 |
| Test Items | 66 |
| Computation | 55 |
| Models | 53 |
| Comparative Analysis | 44 |
| Correlation | 41 |
| Test Bias | 40 |
| Simulation | 34 |
| Error of Measurement | 29 |
Source
| Value | Count |
| --- | --- |
| Educational and Psychological… | 206 |
Author
| Value | Count |
| --- | --- |
| Raykov, Tenko | 11 |
| Marcoulides, George A. | 9 |
| Wang, Wen-Chung | 8 |
| Cai, Li | 7 |
| Dodd, Barbara G. | 6 |
| Zumbo, Bruno D. | 6 |
| Finch, W. Holmes | 5 |
| Hancock, Gregory R. | 4 |
| Wilson, Mark | 4 |
| Beretvas, S. Natasha | 3 |
Publication Type
| Value | Count |
| --- | --- |
| Journal Articles | 206 |
| Reports - Research | 158 |
| Reports - Evaluative | 33 |
| Reports - Descriptive | 13 |
| Information Analyses | 1 |
| Opinion Papers | 1 |
Education Level
| Value | Count |
| --- | --- |
| Higher Education | 27 |
| Postsecondary Education | 24 |
| Elementary Education | 17 |
| Secondary Education | 17 |
| Middle Schools | 10 |
| High Schools | 8 |
| Junior High Schools | 8 |
| Intermediate Grades | 7 |
| Early Childhood Education | 6 |
| Grade 3 | 6 |
Showing 1 to 15 of 206 results
Mao, Xiulin; Harring, Jeffrey R.; Hancock, Gregory R. – Educational and Psychological Measurement, 2015
Latent interaction models have motivated a great deal of methodological research, mainly in the area of estimating such models. Product-indicator methods have been shown to be competitive with other methods of estimation in terms of parameter bias and standard error accuracy, and their continued popularity in empirical studies is due, in part, to…
Descriptors: Structural Equation Models, Error of Measurement, Algebra, Statistical Analysis
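Product-indicator methods build indicators for a latent interaction by multiplying (typically mean-centered) indicators of the constituent latent variables. A minimal sketch of that construction step, using invented indicator values (none of this is the paper's data or code):

```python
import numpy as np

# Hypothetical observed indicators for latent variables X and Z
# (illustrative values only, not from the study).
x1 = np.array([1.0, 2.0, 3.0, 4.0])
z1 = np.array([2.0, 1.0, 3.0, 4.0])

# Mean-center each indicator, then multiply: the product serves as an
# observed indicator for the latent interaction term X*Z in the SEM.
xz = (x1 - x1.mean()) * (z1 - z1.mean())
```

In a full model the product indicator `xz` would then load on the latent interaction factor alongside the first-order factors.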
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015
A direct approach to point and interval estimation of Cronbach's coefficient alpha for multiple component measuring instruments is outlined. The procedure is based on a latent variable modeling application with widely circulated software. As a by-product, using sample data the method permits ascertaining whether the population discrepancy…
Descriptors: Computation, Statistical Analysis, Reliability, Models
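Coefficient alpha itself has a simple closed form: alpha = k/(k−1) · (1 − Σ item variances / variance of the total score). The paper's latent variable modeling approach to point and interval estimation is not reproduced here; the sketch below just computes the classical sample estimate on a made-up score matrix:

```python
import numpy as np

# Hypothetical item-response matrix: 4 respondents x 3 items
# (illustrative numbers, not from the paper).
scores = np.array([
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
], dtype=float)

def cronbach_alpha(x):
    """Classical sample estimate of Cronbach's coefficient alpha."""
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)       # per-item sample variances
    total_var = x.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

alpha = cronbach_alpha(scores)
```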
Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu – Educational and Psychological Measurement, 2015
Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effects in mixed-format scales and used bi-factor item response theory (IRT) models to…
Descriptors: Item Response Theory, Test Format, Language Usage, Test Items
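Reverse recoding maps a negatively worded k-point Likert response x to (k + 1) − x; the abstract's point is that this step alone may not remove the wording effect. A sketch of the recoding itself, on hypothetical 5-point responses (not the study's data):

```python
import numpy as np

# Hypothetical responses to a negatively worded 5-point Likert item.
raw = np.array([1, 2, 4, 5])

# Standard reverse recode for a k-point scale: x -> (k + 1) - x, so
# strong agreement with a negative statement becomes a low score.
k = 5
recoded = (k + 1) - raw
```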
Lai, Emily R.; Wolfe, Edward W.; Vickers, Daisy – Educational and Psychological Measurement, 2015
This report summarizes an empirical study that addresses two related topics within the context of writing assessment: illusory halo and how much unique information is provided by multiple analytic scores. Specifically, we address the issue of whether unique information is provided by analytic scores assigned to student writing, beyond what is…
Descriptors: Writing Tests, Scores, Bias, Holistic Approach
Kopf, Julia; Zeileis, Achim; Strobl, Carolin – Educational and Psychological Measurement, 2015
Differential item functioning (DIF) indicates the violation of the invariance assumption, for instance, in models based on item response theory (IRT). For item-wise DIF analysis using IRT, a common metric for the item parameters of the groups that are to be compared (e.g., for the reference and the focal group) is necessary. In the Rasch model,…
Descriptors: Test Items, Equated Scores, Test Bias, Item Response Theory
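Putting the two groups' Rasch item difficulties on a common metric is often done by linking through the anchor items, for example mean-mean linking; item-level DIF is then the remaining difference on the common metric. A toy sketch with invented difficulty estimates (the numbers and the anchoring choice are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical Rasch difficulty estimates for the same 5 items in the
# reference and focal groups (illustrative values only).
b_ref   = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
b_focal = np.array([-0.7, -0.2, 0.3, 0.8, 1.9])  # item 5 will show DIF
anchors = [0, 1, 2, 3]  # items assumed to be DIF-free

# Mean-mean linking: shift the focal metric so the anchor items have
# the same average difficulty in both groups.
shift = b_ref[anchors].mean() - b_focal[anchors].mean()
b_focal_linked = b_focal + shift

# DIF effect per item on the common metric (nonzero => potential DIF).
dif = b_focal_linked - b_ref
```

The choice of anchor items matters: if a DIF item is mistakenly used as an anchor, the linking constant is contaminated and DIF can appear to spread to other items.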
France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory
Egberink, Iris J. L.; Meijer, Rob R.; Tendeiro, Jorge N. – Educational and Psychological Measurement, 2015
A popular method to assess measurement invariance of a particular item is based on likelihood ratio tests with all other items as anchor items. The results of this method are often only reported in terms of statistical significance, and researchers proposed different methods to empirically select anchor items. It is unclear, however, how many…
Descriptors: Personality Measures, Computer Assisted Testing, Measurement, Test Items
Choi, In-Hee; Wilson, Mark – Educational and Psychological Measurement, 2015
An essential feature of the linear logistic test model (LLTM) is that item difficulties are explained using item design properties. By taking advantage of this explanatory aspect of the LLTM, in a mixture extension of the LLTM, the meaning of latent classes is specified by how item properties affect item difficulties within each class. To improve…
Descriptors: Classification, Test Items, Difficulty Level, Statistical Analysis
Huang, Hung-Yu; Wang, Wen-Chung – Educational and Psychological Measurement, 2014
In the social sciences, latent traits often have a hierarchical structure, and data can be sampled from multiple levels. Both hierarchical latent traits and multilevel data can occur simultaneously. In this study, we developed a general class of item response theory models to accommodate both hierarchical latent traits and multilevel data. The…
Descriptors: Item Response Theory, Hierarchical Linear Modeling, Computation, Test Reliability
Moses, Tim – Educational and Psychological Measurement, 2014
In this study, smoothing and scaling approaches are compared for estimating subscore-to-composite scaling results involving composites computed as rounded and weighted combinations of subscores. The considered smoothing and scaling approaches included those based on raw data, on smoothing the bivariate distribution of the subscores, on smoothing…
Descriptors: Weighted Scores, Scaling, Data Analysis, Comparative Analysis
He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing
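Under the Rasch model an item's Fisher information at ability θ is p(1 − p) with p = logistic(θ − b), so maximum-information CAT selection picks the pool item whose difficulty is nearest the provisional ability estimate. A minimal sketch with a hypothetical pool (not the study's pool design, and ignoring content balancing and exposure control):

```python
import numpy as np

def rasch_info(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return p * (1.0 - p)

# Hypothetical pool of item difficulties (illustrative values only).
pool = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
theta_hat = 0.8  # current provisional ability estimate

# Maximum-information selection: the most informative Rasch item is the
# one whose difficulty is closest to theta_hat.
next_item = int(np.argmax(rasch_info(theta_hat, pool)))
```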
Pohl, Steffi; Gräfe, Linda; Rose, Norman – Educational and Psychological Measurement, 2014
Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…
Descriptors: Test Items, Achievement Tests, Item Response Theory, Models
Preston, Kathleen Suzanne Johnson; Reise, Steven Paul – Educational and Psychological Measurement, 2014
The nominal response model (NRM), a much understudied polytomous item response theory (IRT) model, provides researchers the unique opportunity to evaluate within-item category distinctions. Polytomous IRT models, such as the NRM, are frequently applied to psychological assessments representing constructs that are unlikely to be normally…
Descriptors: Item Response Theory, Computation, Models, Accuracy
Dowdy, Erin; Nylund-Gibson, Karen; Felix, Erika D.; Morovati, Diane; Carnazzo, Katherine W.; Dever, Bridget V. – Educational and Psychological Measurement, 2014
The practice of screening students to identify behavioral and emotional risk is gaining momentum, with limited guidance regarding the frequency with which screenings should occur. Screening frequency decisions are influenced by the stability of the constructs assessed and changes in risk status over time. This study investigated the 4-year…
Descriptors: Screening Tests, Risk, Behavior Disorders, Emotional Disturbances
Mashburn, Andrew J.; Meyer, J. Patrick; Allen, Joseph P.; Pianta, Robert C. – Educational and Psychological Measurement, 2014
Observational methods are increasingly being used in classrooms to evaluate the quality of teaching. Operational procedures for observing teachers are somewhat arbitrary in existing measures and vary across different instruments. To study the effect of different observation procedures on score reliability and validity, we conducted an experimental…
Descriptors: Observation, Teacher Evaluation, Reliability, Validity

