Showing 1 to 15 of 1,357 results
Peer reviewed
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of the multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
Peer reviewed
Tao Gong; Lan Shuai; Robert J. Mislevy – Journal of Educational Measurement, 2024
The usual interpretation of the person and task variables in between-persons measurement models such as item response theory (IRT) is as attributes of persons and tasks, respectively. They can be viewed instead as ensemble descriptors of patterns of interactions among persons and situations that arise from sociocognitive complex adaptive system…
Descriptors: Cognitive Processes, Item Response Theory, Social Cognition, Individualized Instruction
Peer reviewed
Kuan-Yu Jin; Thomas Eckes – Journal of Educational Measurement, 2024
Many language proficiency tests include group oral assessments involving peer interaction. In such an assessment, examinees discuss a common topic with others. Human raters score each examinee's spoken performance on specially designed criteria. However, measurement models for analyzing group assessment data usually assume local person…
Descriptors: Peer Relationship, Interaction, Oral Language, Student Evaluation
Peer reviewed
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Journal of Educational Measurement, 2024
This paper presents the item and test information functions of the Rank two-parameter logistic models (Rank-2PLM) for items with two (pair) and three (triplet) statements in forced-choice questionnaires. The Rank-2PLM model for pairs is the MUPP-2PLM (Multi-Unidimensional Pairwise Preference) and, for triplets, is the Triplet-2PLM. Fisher's…
Descriptors: Questionnaires, Test Items, Item Response Theory, Models
Peer reviewed
Sandip Sinharay; Matthew S. Johnson – Journal of Educational Measurement, 2024
Culturally responsive assessments have been proposed as potential tools to ensure equity and fairness for examinees from all backgrounds including those from traditionally underserved or minoritized groups. However, these assessments are relatively new and, with few exceptions, are yet to be implemented in large scale. Consequently, there is a…
Descriptors: Culturally Relevant Education, Evaluation, Equal Education, Disadvantaged
Peer reviewed
Daria Gerasimova – Journal of Educational Measurement, 2024
I propose two practical advances to the argument-based approach to validity: developing a living document and incorporating preregistration. First, I present a potential structure for the living document that includes an up-to-date summary of the validity argument. As the validation process may span across multiple studies, the living document…
Descriptors: Validity, Documentation, Methods, Research Reports
Peer reviewed
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
Peer reviewed
Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024
For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…
Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory
Peer reviewed
Wenchao Ma; Miguel A. Sorrel; Xiaoming Zhai; Yuan Ge – Journal of Educational Measurement, 2024
Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual-purpose model for simultaneously estimating students' overall ability and the presence and absence of…
Descriptors: Models, Misconceptions, Diagnostic Tests, Ability
Peer reviewed
Tae Yeon Kwon; A. Corinne Huggins-Manley; Jonathan Templin; Mingying Zheng – Journal of Educational Measurement, 2024
In classroom assessments, examinees can often answer test items multiple times, resulting in sequential multiple-attempt data. Sequential diagnostic classification models (DCMs) have been developed for such data. As student learning processes may be aligned with a hierarchy of measured traits, this study aimed to develop a sequential hierarchical…
Descriptors: Classification, Accuracy, Student Evaluation, Sequential Approach
Peer reviewed
Mahmood Ul Hassan; Frank Miller – Journal of Educational Measurement, 2024
Multidimensional achievement tests have recently been gaining importance in educational and psychological measurement. For example, multidimensional diagnostic tests can help students determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for…
Descriptors: Multidimensional Scaling, Achievement Tests, Test Items, Test Construction
Peer reviewed
Sun-Joo Cho; Amanda Goodwin; Matthew Naveiras; Jorge Salas – Journal of Educational Measurement, 2024
Despite the growing interest in incorporating response time data into item response models, there has been a lack of research investigating how the effect of speed on the probability of a correct response varies across different groups (e.g., experimental conditions) for various items (i.e., differential response time item analysis). Furthermore,…
Descriptors: Item Response Theory, Reaction Time, Models, Accuracy
Peer reviewed
Jia Liu; Xiangbin Meng; Gongjun Xu; Wei Gao; Ningzhong Shi – Journal of Educational Measurement, 2024
In this paper, we develop a mixed stochastic approximation expectation-maximization (MSAEM) algorithm coupled with a Gibbs sampler to compute the marginalized maximum a posteriori estimate (MMAPE) of a confirmatory multidimensional four-parameter normal ogive (M4PNO) model. The proposed MSAEM algorithm not only has the computational advantages of…
Descriptors: Algorithms, Achievement Tests, Foreign Countries, International Assessment
Peer reviewed
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the $l_z$ and $l_z^*$ person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
Peer reviewed
Kim, Rae Yeong; Yoo, Yun Joo – Journal of Educational Measurement, 2023
In cognitive diagnostic models (CDMs), a set of fine-grained attributes is required to characterize complex problem solving and provide detailed diagnostic information about an examinee. However, it is challenging to ensure reliable estimation and control computational complexity when the test aims to identify the examinee's attribute profile in a…
Descriptors: Models, Diagnostic Tests, Adaptive Testing, Accuracy