Showing all 3 results
Peer reviewed
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Peer reviewed
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Peer reviewed
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring