Publication Date
| Period | Count |
| --- | --- |
| In 2015 | 0 |
| Since 2014 | 1 |
| Since 2011 (last 5 years) | 5 |
| Since 2006 (last 10 years) | 9 |
| Since 1996 (last 20 years) | 9 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Grade 7 | 6 |
| Reading Tests | 5 |
| Test Items | 5 |
| Item Response Theory | 4 |
| Mathematics Tests | 4 |
| Item Analysis | 3 |
| Multiple Choice Tests | 3 |
| Test Bias | 3 |
| Testing Accommodations | 3 |
| Achievement Tests | 2 |
Source
| Source | Count |
| --- | --- |
| Applied Measurement in… | 9 |
Author
| Author | Count |
| --- | --- |
| Engelhard, George, Jr. | 2 |
| Lee, Yoonsun | 2 |
| Taylor, Catherine S. | 2 |
| Andrich, David | 1 |
| Ansley, Timothy | 1 |
| Banks, Kathleen | 1 |
| Cho, Hyun-Jeong | 1 |
| Domaleski, Christopher S. | 1 |
| Fincher, Melissa | 1 |
| Heldsinger, Sandra | 1 |
Publication Type
| Publication Type | Count |
| --- | --- |
| Journal Articles | 9 |
| Reports - Research | 6 |
| Reports - Evaluative | 3 |
Education Level
| Education Level | Count |
| --- | --- |
| Grade 7 | 9 |
| Grade 4 | 5 |
| Grade 3 | 4 |
| Elementary Education | 3 |
| Elementary Secondary Education | 3 |
| Grade 5 | 3 |
| Grade 6 | 3 |
| Junior High Schools | 3 |
| Middle Schools | 3 |
| Grade 10 | 2 |
Showing all 9 results
Humphry, Stephen; Heldsinger, Sandra; Andrich, David – Applied Measurement in Education, 2014
One of the best-known methods for setting a benchmark standard on a test is that of Angoff and its modifications. When scored dichotomously, judges estimate the probability that a benchmark student has of answering each item correctly. As in most methods of standard setting, it is assumed implicitly that the unit of the latent scale of the…
Descriptors: Foreign Countries, Standard Setting (Scoring), Judges, Item Response Theory
Banks, Kathleen – Applied Measurement in Education, 2012
The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…
Descriptors: Reading Tests, Test Items, Standardized Tests, Test Bias
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2012
This was a study of differential item functioning (DIF) for grades 4, 7, and 10 reading and mathematics items from state criterion-referenced tests. The tests were composed of multiple-choice and constructed-response items. Gender DIF was investigated using POLYSIBTEST and a Rasch procedure. The Rasch procedure flagged more items for DIF than did…
Descriptors: Test Bias, Gender Differences, Reading Tests, Mathematics Tests
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Engelhard, George, Jr.; Fincher, Melissa; Domaleski, Christopher S. – Applied Measurement in Education, 2011
This study examines the effects of two test administration accommodations on the mathematics performance of students within the context of a large-scale statewide assessment. The two test administration accommodations were resource guides and calculators. A stratified random sample of schools was selected to represent the demographic…
Descriptors: Testing Accommodations, Disabilities, High Stakes Tests, Program Effectiveness
Randall, Jennifer; Engelhard, George, Jr. – Applied Measurement in Education, 2010
The psychometric properties and multigroup measurement invariance of scores across subgroups, items, and persons on the "Reading for Meaning" items from the Georgia Criterion Referenced Competency Test (CRCT) were assessed in a sample of 778 seventh-grade students. Specifically, we sought to determine the extent to which score-based inferences on…
Descriptors: Testing Accommodations, Test Items, Learning Disabilities, Factor Analysis
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010
Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…
Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis
Tong, Ye; Kolen, Michael J. – Applied Measurement in Education, 2007
A number of vertical scaling methodologies were examined in this article. Scaling variations included data collection design, scaling method, item response theory (IRT) scoring procedure, and proficiency estimation method. Vertical scales were developed for Grade 3 through Grade 8 for 4 content areas and 9 simulated datasets. A total of 11 scaling…
Descriptors: Achievement Tests, Scaling, Methods, Item Response Theory
von Schrader, Sarah; Ansley, Timothy – Applied Measurement in Education, 2006
Much has been written concerning the potential group differences in responding to multiple-choice achievement test items. This discussion has included references to possible disparities in tendency to omit such test items. When test scores are used for high-stakes decision making, even small differences in scores and rankings that arise from male…
Descriptors: Gender Differences, Multiple Choice Tests, Achievement Tests, Grade 3
