Publication Date
| In 2015 | 0 |
| Since 2014 | 2 |
| Since 2011 (last 5 years) | 9 |
| Since 2006 (last 10 years) | 15 |
| Since 1996 (last 20 years) | 16 |
Descriptor
| Grade 5 | 12 |
| Test Items | 10 |
| Comparative Analysis | 5 |
| Item Response Theory | 5 |
| Test Bias | 5 |
| Grade 3 | 4 |
| Grade 8 | 4 |
| Mathematics Tests | 4 |
| Science Tests | 4 |
| Scoring | 4 |
| More ▼ | |
Source
| Applied Measurement in… | 16 |
Author
| Banks, Kathleen | 2 |
| Hein, Serge F. | 2 |
| Awuor, Risper | 1 |
| Cho, Hyun-Jeong | 1 |
| D'Agostino, Jerome V. | 1 |
| Eastwood, Melissa | 1 |
| Gattamorta, Karina A. | 1 |
| Hambleton, Ronald K. | 1 |
| Henly, George A. | 1 |
| Janssen, Rianne | 1 |
| More ▼ | |
Publication Type
| Journal Articles | 16 |
| Reports - Research | 12 |
| Reports - Evaluative | 4 |
Education Level
| Grade 5 | 16 |
| Elementary Education | 10 |
| Elementary Secondary Education | 6 |
| Grade 3 | 6 |
| Grade 8 | 6 |
| Grade 4 | 5 |
| Intermediate Grades | 5 |
| Middle Schools | 5 |
| Grade 6 | 4 |
| Grade 7 | 3 |
| More ▼ | |
Audience
Showing 1 to 15 of 16 results
Noble, Tracy; Rosebery, Ann; Suarez, Catherine; Warren, Beth; O'Connor, Mary Catherine – Applied Measurement in Education, 2014
English language learners (ELLs) and their teachers, schools, and communities face increasingly high-stakes consequences due to test score gaps between ELLs and non-ELLs. It is essential that the field of educational assessment continue to investigate the meaning of these test score gaps. This article discusses the findings of an exploratory study…
Descriptors: English Language Learners, Evidence, Educational Assessment, Achievement Gap
Welsh, Megan E.; Eastwood, Melissa; D'Agostino, Jerome V. – Applied Measurement in Education, 2014
Teacher and school accountability systems based on high-stakes tests are ubiquitous throughout the United States and appear to be growing as a catalyst for reform. As a result, educators have increased the proportion of instructional time devoted to test preparation. Although guidelines for what constitutes appropriate and inappropriate test…
Descriptors: High Stakes Tests, Instruction, Test Preparation, Grade 3
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Gattamorta, Karina A.; Penfield, Randall D. – Applied Measurement in Education, 2012
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF). The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations to conduct a DSF analysis: the…
Descriptors: Measurement, Item Response Theory, Test Bias, Test Items
Kachchaf, Rachel; Solano-Flores, Guillermo – Applied Measurement in Education, 2012
We examined how rater language background affects the scoring of short-answer, open-ended test items in the assessment of English language learners (ELLs). Four native English and four native Spanish-speaking certified bilingual teachers scored 107 responses of fourth- and fifth-grade Spanish-speaking ELLs to mathematics items administered in…
Descriptors: Error of Measurement, English Language Learners, Scoring, Bilingual Teachers
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Banks, Kathleen – Applied Measurement in Education, 2012
The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…
Descriptors: Reading Tests, Test Items, Standardized Tests, Test Bias
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Van Nijlen, Daniel; Janssen, Rianne – Applied Measurement in Education, 2011
The distinction between quantitative and qualitative differences in mastery is essential when monitoring student progress and is crucial for instructional interventions to deal with learning difficulties. Mixture item response theory (IRT) models can provide a convenient way to make the distinction between quantitative and qualitative differences…
Descriptors: Spelling, Indo European Languages, Vowels, Verbal Tests
Hein, Serge F.; Skaggs, Gary E. – Applied Measurement in Education, 2009
Only a small number of qualitative studies have investigated panelists' experiences during standard-setting activities or the thought processes associated with panelists' actions. This qualitative study involved an examination of the experiences of 11 panelists who participated in a prior, one-day standard-setting meeting in which either the…
Descriptors: Focus Groups, Standard Setting, Cutting Scores, Cognitive Processes
Osborn Popp, Sharon E.; Ryan, Joseph M.; Thompson, Marilyn S. – Applied Measurement in Education, 2009
Scoring rubrics are routinely used to evaluate the quality of writing samples produced for writing performance assessments, with anchor papers chosen to represent score points defined in the rubric. Although the careful selection of anchor papers is associated with best practices for scoring, little research has been conducted on the role of…
Descriptors: Writing Evaluation, Scoring Rubrics, Selection, Scoring
Kim, Do-Hong; Schneider, Christina; Siskind, Theresa – Applied Measurement in Education, 2009
This study examined the extent to which the underlying factor structure of the 2005 South Carolina Palmetto Achievement Challenge Tests (PACT) in science for grades 3, 4, and 5 was equivalent for students who were administered the test in a regular (standard) or accommodated form. Three accommodation groups were of interest: students who received…
Descriptors: Testing Accommodations, Science Tests, Elementary School Science, Measurement
Tong, Ye; Kolen, Michael J. – Applied Measurement in Education, 2007
A number of vertical scaling methodologies were examined in this article. Scaling variations included data collection design, scaling method, item response theory (IRT) scoring procedure, and proficiency estimation method. Vertical scales were developed for Grade 3 through Grade 8 for 4 content areas and 9 simulated datasets. A total of 11 scaling…
Descriptors: Achievement Tests, Scaling, Methods, Item Response Theory
Skaggs, Gary; Hein, Serge F.; Awuor, Risper – Applied Measurement in Education, 2007
In this study, a variation of the bookmark standard setting procedure for passage-based tests is proposed in which separate ordered item booklets are created for the items associated with each passage. This variation is compared to the traditional bookmark procedure for a fifth-grade reading test. The results showed that the single-passage…
Descriptors: Reading Tests, Standard Setting, Cutting Scores, Grade 5
Banks, Kathleen – Applied Measurement in Education, 2006
The purpose of this article is to present a working definition of the term "culture," as well as to describe and demonstrate a comprehensive framework for evaluating hypotheses about cultural bias in educational testing. The framework is demonstrated using 5th-grade reading and language arts data from the Terra Nova test (CTB/McGraw-Hill, 1999).…
Descriptors: Test Bias, Educational Testing, Test Items, Hispanic Americans
Previous Page | Next Page ยป
Pages: 1 | 2
Peer reviewed
Direct link
