Showing 1 to 15 of 40 results
Peer reviewed
Direct link
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
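
As background for the two network families this abstract contrasts, the sketch below shows what a convolutional and a feed-forward classifier for image responses might look like. It is a minimal illustration, not the study's code: the input size (64x64 grayscale), the class count, and all layer sizes are assumptions.

import torch
import torch.nn as nn

N_CLASSES = 3  # e.g., correct / partially correct / incorrect (assumed)

conv_net = nn.Sequential(          # convolutional approach
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, N_CLASSES),  # assumes 64x64 input
)

ff_net = nn.Sequential(            # feed-forward baseline on raw pixels
    nn.Flatten(),
    nn.Linear(64 * 64, 128), nn.ReLU(),
    nn.Linear(128, N_CLASSES),
)

x = torch.randn(8, 1, 64, 64)      # a batch of 8 fake 64x64 images
print(conv_net(x).shape, ff_net(x).shape)  # both: torch.Size([8, 3])

Both models map a batch of images to class scores; the convolutional variant exploits the 2D structure of the drawing, which is the usual reason it outperforms a flat feed-forward baseline on image input.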
Peer reviewed
Direct link
Kang, Hyeon-Ah; Han, Suhwa; Kim, Doyoung; Kao, Shu-Chuan – Educational and Psychological Measurement, 2022
The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c)…
Descriptors: Goodness of Fit, Item Response Theory, Test Items, Scoring
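
As background for model (a), the generalized partial credit model assigns each response category a probability that depends on ability and a set of step parameters. A minimal sketch with invented parameter values (not taken from the study):

import numpy as np

def gpcm_probs(theta, a, b):
    """GPCM category probabilities for one polytomous item.
    theta: ability; a: discrimination; b: step difficulties (length K)."""
    # cumulative sums of a*(theta - b_j), with 0 for the lowest category
    z = np.concatenate(([0.0], np.cumsum(a * (theta - np.asarray(b)))))
    ez = np.exp(z - z.max())        # subtract max for numerical stability
    return ez / ez.sum()

print(gpcm_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.5]))  # 4 categories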
Peer reviewed
Direct link
Soland, James – Educational and Psychological Measurement, 2022
Considerable thought is often put into designing randomized control trials (RCTs). From power analyses and complex sampling designs implemented preintervention to nuanced quasi-experimental models used to estimate treatment effects postintervention, RCT design can be quite complicated. Yet when psychological constructs measured using survey scales…
Descriptors: Item Response Theory, Surveys, Scoring, Randomized Controlled Trials
Peer reviewed
Direct link
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
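
The abstract notes that several methods exist for computing raw scores from a multiple-response selection. The sketch below illustrates three commonly discussed rules; these are generic assumptions for illustration, not necessarily the rules the article evaluates.

def score_all_or_nothing(selected, key):
    # full credit only for an exact match with the key
    return 1 if set(selected) == set(key) else 0

def score_per_option(selected, key, n_options):
    # +1 per correct decision: selecting a keyed option or leaving a
    # non-keyed option unselected
    return sum((opt in key) == (opt in selected) for opt in range(n_options))

def score_partial_credit(selected, key):
    # credit for keyed options selected, penalty for unkeyed selections,
    # floored at zero
    hits = len(set(selected) & set(key))
    false_alarms = len(set(selected) - set(key))
    return max(hits - false_alarms, 0)

sel, key = [0, 2, 3], [0, 2]
print(score_all_or_nothing(sel, key),   # 0
      score_per_option(sel, key, 5),    # 4
      score_partial_credit(sel, key))   # 1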
Peer reviewed
Direct link
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…
Descriptors: Test Bias, Methods, Test Items, Scoring
Peer reviewed
Direct link
Wind, Stefanie A.; Ge, Yuan – Educational and Psychological Measurement, 2021
Practical constraints in rater-mediated assessments limit the availability of complete data. Instead, most scoring procedures include one or two ratings for each performance, with overlapping performances across raters or linking sets of multiple-choice items to facilitate model estimation. These incomplete scoring designs present challenges for…
Descriptors: Evaluators, Scoring, Data Collection, Design
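
A toy example of the kind of incomplete scoring design described above: each performance receives two ratings, and overlapping rater assignments provide the links needed for estimation. The layout is invented for illustration.

import numpy as np

design = np.array([
    # raters: R1 R2 R3 R4   (1 = rates this performance)
    [1, 1, 0, 0],   # performance 1
    [0, 1, 1, 0],   # performance 2
    [0, 0, 1, 1],   # performance 3
    [1, 0, 0, 1],   # performance 4
])
print(design.sum(axis=0))  # each rater appears in two linked pairs
print(design.sum(axis=1))  # two ratings per performance despite sparsity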
Peer reviewed
Direct link
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
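
The two treatments of not-reached items named in the abstract can be seen in a toy response vector (invented data):

import numpy as np

responses = [1, 0, 1, 1, "NR", "NR"]  # last two items never reached

# Treatment 1: not-reached items scored as incorrect
as_incorrect = [r if r != "NR" else 0 for r in responses]
p_incorrect = np.mean(as_incorrect)   # 4/6 = 0.67

# Treatment 2: not-reached items treated as not administered (dropped)
reached = [r for r in responses if r != "NR"]
p_not_admin = np.mean(reached)        # 3/4 = 0.75

print(p_incorrect, p_not_admin)       # the choice shifts the score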
Peer reviewed
Direct link
Ferrando, Pere J.; Navarro-González, David – Educational and Psychological Measurement, 2021
Item response theory "dual" models (DMs) in which both items and individuals are viewed as sources of differential measurement error so far have been proposed only for unidimensional measures. This article proposes two multidimensional extensions of existing DMs: the M-DTCRM (dual Thurstonian continuous response model), intended for…
Descriptors: Item Response Theory, Error of Measurement, Models, Factor Analysis
Peer reviewed
Direct link
Andersson, Gustaf; Yang-Wallentin, Fan – Educational and Psychological Measurement, 2021
Factor score regression has recently received growing interest as an alternative for structural equation modeling. However, many applications are left without guidance because of the focus on normally distributed outcomes in the literature. We perform a simulation study to examine how a selection of factor scoring methods compare when estimating…
Descriptors: Regression (Statistics), Statistical Analysis, Computation, Scoring
Peer reviewed
Direct link
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
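
Ipsativity, the property that makes interindividual comparisons problematic, shows up in a toy example: when every block forces a choice among the same set of traits, each respondent's trait scores sum to the same constant. The data below are invented.

import numpy as np

# 12 blocks, each asking "pick the statement most like you" among
# 3 traits; a trait's score is its number of selections.
scores = np.array([
    [6, 4, 2],   # respondent A
    [2, 4, 6],   # respondent B
    [4, 4, 4],   # respondent C
])
print(scores.sum(axis=1))  # [12 12 12]: identical row sums (ipsative),
                           # so only within-person comparisons are meaningful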
Peer reviewed
Direct link
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2021
This study presents a latent (item response theory-like) framework of a recently developed classical approach to test scoring, equating, and item analysis, referred to as "D"-scoring method. Specifically, (a) person and item parameters are estimated under an item response function model on the "D"-scale (from 0 to 1) using…
Descriptors: Scoring, Equated Scores, Item Analysis, Item Response Theory
Peer reviewed
Direct link
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2020
This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM) that is gaining attention in the field of educational and psychological measurement and large-scale assessments. In a previous work on DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM…
Descriptors: Item Response Theory, Scoring, True Scores, Scaling
Peer reviewed
Direct link
LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020
Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short answer responses. Scoring tests that capture open-ended, short answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short answer responses to the Consequences Test, a measure…
Descriptors: Semantics, Evaluators, Essays, Scoring
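
A minimal sketch of LSA-style scoring of short answers, assuming a generic pipeline of TF-IDF, truncated SVD, and cosine similarity to reference answers. This is an illustration of the technique class, not the study's pipeline, and the texts are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

references = [
    "traffic lights would stop working causing accidents",
    "hospitals would lose power and patients would be at risk",
]
responses = [
    "cars would crash because traffic signals fail",
    "people would enjoy the quiet",
]

vec = TfidfVectorizer()
X = vec.fit_transform(references + responses)
svd = TruncatedSVD(n_components=2)   # latent semantic space (toy size)
Z = svd.fit_transform(X)

ref_z, resp_z = Z[:2], Z[2:]
scores = cosine_similarity(resp_z, ref_z).max(axis=1)  # best match per response
print(scores.round(2))  # higher = semantically closer to a reference answer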
Peer reviewed
Direct link
Wang, Jue; Engelhard, George, Jr. – Educational and Psychological Measurement, 2019
The purpose of this study is to explore the use of unfolding models for evaluating the quality of ratings obtained in rater-mediated assessments. Two different judgmental processes can be used to conceptualize ratings: impersonal judgments and personal preferences. Impersonal judgments are typically expected in rater-mediated assessments, and…
Descriptors: Evaluative Thinking, Preferences, Evaluators, Models
Peer reviewed
Direct link
Wind, Stefanie A.; Guo, Wenjing – Educational and Psychological Measurement, 2019
Rater effects, or raters' tendencies to assign ratings to performances that are different from the ratings that the performances warranted, are well documented in rater-mediated assessments across a variety of disciplines. In many real-data studies of rater effects, researchers have reported that raters exhibit more than one effect, such as a…
Descriptors: Evaluators, Bias, Scoring, Data Collection