Publication Date
| In 2015 | 0 |
| Since 2014 | 4 |
| Since 2011 (last 5 years) | 12 |
| Since 2006 (last 10 years) | 14 |
| Since 1996 (last 20 years) | 14 |
Descriptor
| Test Items | 8 |
| Grade 8 | 7 |
| Mathematics Tests | 6 |
| Multiple Choice Tests | 5 |
| Academic Achievement | 4 |
| Difficulty Level | 4 |
| Grade 5 | 4 |
| Item Response Theory | 4 |
| Measurement | 4 |
| Reading Tests | 4 |
Source
| Applied Measurement in… | 14 |
Author
| Andrich, David | 1 |
| Ayala, Carlos C. | 1 |
| Banks, Kathleen | 1 |
| Beddow, Peter A. | 1 |
| Bolt, Daniel M. | 1 |
| Brandon, Paul R. | 1 |
| Brookhart, Susan M. | 1 |
| Cho, Hyun-Jeong | 1 |
| Cor, M. Kenneth | 1 |
| Cui, Ying | 1 |
Publication Type
| Journal Articles | 14 |
| Reports - Research | 10 |
| Reports - Evaluative | 4 |
Education Level
| Middle Schools | 14 |
| Grade 8 | 9 |
| Junior High Schools | 9 |
| Secondary Education | 9 |
| Elementary Education | 6 |
| Elementary Secondary Education | 6 |
| Grade 5 | 5 |
| High Schools | 5 |
| Grade 7 | 3 |
| Intermediate Grades | 3 |
Showing all 14 results
Noble, Tracy; Rosebery, Ann; Suarez, Catherine; Warren, Beth; O'Connor, Mary Catherine – Applied Measurement in Education, 2014
English language learners (ELLs) and their teachers, schools, and communities face increasingly high-stakes consequences due to test score gaps between ELLs and non-ELLs. It is essential that the field of educational assessment continue to investigate the meaning of these test score gaps. This article discusses the findings of an exploratory study…
Descriptors: English Language Learners, Evidence, Educational Assessment, Achievement Gap
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Humphry, Stephen; Heldsinger, Sandra; Andrich, David – Applied Measurement in Education, 2014
One of the best-known methods for setting a benchmark standard on a test is that of Angoff and its modifications. When scored dichotomously, judges estimate the probability that a benchmark student has of answering each item correctly. As in most methods of standard setting, it is assumed implicitly that the unit of the latent scale of the…
Descriptors: Foreign Countries, Standard Setting (Scoring), Judges, Item Response Theory
Rutkowski, Leslie – Applied Measurement in Education, 2014
Large-scale assessment programs such as the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA) use a sophisticated assessment administration design called matrix sampling that minimizes the testing burden on individual…
Descriptors: Measurement, Testing, Item Sampling, Computation
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Banks, Kathleen – Applied Measurement in Education, 2012
The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…
Descriptors: Reading Tests, Test Items, Standardized Tests, Test Bias
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Wolf, Mikyung Kim; Kim, Jinok; Kao, Jenny – Applied Measurement in Education, 2012
Glossary and reading aloud test items are commonly allowed in many states' accommodation policies for English language learner (ELL) students for large-scale mathematics assessments. However, little research is available regarding the effects of these accommodations on ELL students' performance. Further, no research exists that examines how…
Descriptors: Testing Accommodations, Glossaries, Reading Aloud to Others, Validity
Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C. – Applied Measurement in Education, 2011
This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…
Descriptors: Knowledge Level, Construct Validity, Validity, Scaffolding (Teaching Technique)
Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011
The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates the use of…
Descriptors: Test Construction, Validity, Evidence, Science Tests
Kettler, Ryan J.; Rodriguez, Michael C.; Bolt, Daniel M.; Elliott, Stephen N.; Beddow, Peter A.; Kurz, Alexander – Applied Measurement in Education, 2011
Federal policy on alternate assessment based on modified academic achievement standards (AA-MAS) inspired this research. Specifically, an experimental study was conducted to determine whether tests composed of modified items would have the same level of reliability as tests composed of original items, and whether these modified items helped reduce…
Descriptors: Multiple Choice Tests, Test Items, Alternative Assessment, Test Reliability
Furtak, Erin Marie; Ruiz-Primo, Maria Araceli; Shemwell, Jonathan T.; Ayala, Carlos C.; Brandon, Paul R.; Shavelson, Richard J.; Yin, Yue – Applied Measurement in Education, 2008
Given the current emphasis on conducting high-quality experimental studies, it is becoming increasingly important for researchers to accompany their studies with evaluations of the "fidelity of implementation" of the experimental treatments. This article compares the form and extent of an experimental treatment to student learning. The study…
Descriptors: Formative Evaluation, Academic Achievement, Physical Sciences, Science Teachers
Brookhart, Susan M.; Walsh, Janet M.; Zientarski, Wayne A. – Applied Measurement in Education, 2006
Motivation and effort patterns associated with achievement on classroom assessments in middle-school science and social studies were studied with a sample of 223 8th graders in different classroom assessment environments. Classroom assessment environments were characterized by student perceptions of the importance and value of assessment tasks,…
Descriptors: Student Motivation, Educational Assessment, Middle Schools, Science Education

