Showing all 9 results
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Humphry, Stephen; Heldsinger, Sandra; Andrich, David – Applied Measurement in Education, 2014
One of the best-known methods for setting a benchmark standard on a test is that of Angoff and its modifications. When scored dichotomously, judges estimate the probability that a benchmark student has of answering each item correctly. As in most methods of standard setting, it is assumed implicitly that the unit of the latent scale of the…
Descriptors: Foreign Countries, Standard Setting (Scoring), Judges, Item Response Theory
Rutkowski, Leslie – Applied Measurement in Education, 2014
Large-scale assessment programs such as the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA) use a sophisticated assessment administration design called matrix sampling that minimizes the testing burden on individual…
Descriptors: Measurement, Testing, Item Sampling, Computation
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Banks, Kathleen – Applied Measurement in Education, 2012
The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…
Descriptors: Reading Tests, Test Items, Standardized Tests, Test Bias
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Wolf, Mikyung Kim; Kim, Jinok; Kao, Jenny – Applied Measurement in Education, 2012
Glossary and reading aloud test items are commonly allowed in many states' accommodation policies for English language learner (ELL) students for large-scale mathematics assessments. However, little research is available regarding the effects of these accommodations on ELL students' performance. Further, no research exists that examines how…
Descriptors: Testing Accommodations, Glossaries, Reading Aloud to Others, Validity
Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011
The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates the use of…
Descriptors: Test Construction, Validity, Evidence, Science Tests
Kettler, Ryan J.; Rodriguez, Michael C.; Bolt, Daniel M.; Elliott, Stephen N.; Beddow, Peter A.; Kurz, Alexander – Applied Measurement in Education, 2011
Federal policy on alternate assessment based on modified academic achievement standards (AA-MAS) inspired this research. Specifically, an experimental study was conducted to determine whether tests composed of modified items would have the same level of reliability as tests composed of original items, and whether these modified items helped reduce…
Descriptors: Multiple Choice Tests, Test Items, Alternative Assessment, Test Reliability

