Publication Date
| Date Range | Results |
| --- | --- |
| In 2015 | 2 |
| Since 2014 | 7 |
| Since 2011 (last 5 years) | 20 |
| Since 2006 (last 10 years) | 30 |
| Since 1996 (last 20 years) | 35 |
Descriptor
| Descriptor | Results |
| --- | --- |
| Test Bias | 45 |
| Test Items | 25 |
| Item Response Theory | 11 |
| Comparative Analysis | 9 |
| Mathematics Tests | 9 |
| Evaluation Methods | 7 |
| Simulation | 7 |
| Difficulty Level | 6 |
| Item Analysis | 6 |
| Reading Tests | 6 |
Source
| Source | Results |
| --- | --- |
| Applied Measurement in… | 45 |
Author
| Author | Results |
| --- | --- |
| Ercikan, Kadriye | 4 |
| Hambleton, Ronald K. | 4 |
| Penfield, Randall D. | 3 |
| Banks, Kathleen | 2 |
| Raju, Nambury S. | 2 |
| Su, Ya-Hui | 2 |
| Wang, Wen-Chung | 2 |
| Wells, Craig S. | 2 |
| Alvarez, Karina | 1 |
| Angoff, William H. | 1 |
Publication Type
| Publication Type | Results |
| --- | --- |
| Journal Articles | 45 |
| Reports - Research | 28 |
| Reports - Evaluative | 17 |
| Information Analyses | 2 |
| Speeches/Meeting Papers | 2 |
Education Level
| Education Level | Results |
| --- | --- |
| Elementary Secondary Education | 6 |
| Elementary Education | 5 |
| Grade 5 | 5 |
| Grade 4 | 4 |
| Grade 3 | 3 |
| Grade 7 | 3 |
| Grade 8 | 3 |
| Higher Education | 3 |
| Postsecondary Education | 3 |
| Secondary Education | 3 |
Showing 1 to 15 of 45 results
DeMars, Christine – Applied Measurement in Education, 2015
In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…
Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory
Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015
This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory (IRT) model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods
Wells, Craig S.; Hambleton, Ronald K.; Kirkpatrick, Robert; Meng, Yu – Applied Measurement in Education, 2014
The purpose of the present study was to develop and evaluate two procedures for flagging consequential item parameter drift (IPD) in an operational testing program. The first procedure was based on flagging items that exhibit a meaningful magnitude of IPD using a critical value that was defined to represent barely tolerable IPD. The second procedure…
Descriptors: Test Items, Test Bias, Equated Scores, Item Response Theory
Solano-Flores, Guillermo – Applied Measurement in Education, 2014
This article addresses validity and fairness in the testing of English language learners (ELLs)--students in the United States who are developing English as a second language. It discusses limitations of current approaches to examining the linguistic features of items and their effect on the performance of ELL students. The article submits that…
Descriptors: English Language Learners, Test Items, Probability, Test Bias
Ercikan, Kadriye; Roth, Wolff-Michael; Simon, Marielle; Sandilands, Debra; Lyons-Thomas, Juliette – Applied Measurement in Education, 2014
Diversity and heterogeneity among language groups have been well documented. Yet most fairness research that focuses on measurement comparability considers linguistic minority students such as English language learners (ELLs) or Francophone students living in minority contexts in Canada as a single group. Our focus in this research is to examine…
Descriptors: Test Bias, Language Minorities, French Canadians, Measurement
Oliveri, María Elena; Ercikan, Kadriye; Zumbo, Bruno D. – Applied Measurement in Education, 2014
Heterogeneity within English language learners (ELLs) groups has been documented. Previous research on differential item functioning (DIF) analyses suggests that accurate DIF detection rates are reduced greatly when groups are heterogeneous. In this simulation study, we investigated the effects of heterogeneity within linguistic (ELL) groups on…
Descriptors: Test Bias, Accuracy, English Language Learners, Simulation
Chia, Magda Y. – Applied Measurement in Education, 2014
The Smarter Balanced Assessment Consortium (Smarter Balanced) serves over 19 million primary, middle, and high school students across 26 states and affiliates (Smarter Balanced, n.d.). As one of the two Race to the Top (RTT)-funded assessment consortia, Smarter Balanced is responsible for developing formative, interim, and summative…
Descriptors: State Standards, Academic Standards, Educational Assessment, English Language Learners
Hickendorff, Marian – Applied Measurement in Education, 2013
The results of an exploratory study into measurement of elementary mathematics ability are presented. The focus is on the abilities involved in solving standard computation problems on the one hand and problems presented in a realistic context on the other. The objectives were to assess to what extent these abilities are shared or distinct, and…
Descriptors: Elementary School Mathematics, Mathematics Tests, Mathematics Skills, Problem Solving
Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013
In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…
Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items
Gattamorta, Karina A.; Penfield, Randall D. – Applied Measurement in Education, 2012
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF). The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations to conduct a DSF analysis: the…
Descriptors: Measurement, Item Response Theory, Test Bias, Test Items
Banks, Kathleen – Applied Measurement in Education, 2012
The purpose of this article is to illustrate a seven-step process for determining whether inferential reading items were more susceptible to cultural bias than literal reading items. The seven-step process was demonstrated using multiple-choice data from the reading portion of a reading/language arts test for fifth and seventh grade Hispanic,…
Descriptors: Reading Tests, Test Items, Standardized Tests, Test Bias
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2012
This was a study of differential item functioning (DIF) for grades 4, 7, and 10 reading and mathematics items from state criterion-referenced tests. The tests were composed of multiple-choice and constructed-response items. Gender DIF was investigated using POLYSIBTEST and a Rasch procedure. The Rasch procedure flagged more items for DIF than did…
Descriptors: Test Bias, Gender Differences, Reading Tests, Mathematics Tests
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third- through eighth-graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Imus, Anna; Schmitt, Neal; Kim, Brian; Oswald, Frederick L.; Merritt, Stephanie; Wrestring, Alyssa Friede – Applied Measurement in Education, 2011
Investigations of differential item functioning (DIF) have been conducted mostly on ability tests and have found little evidence of easily interpretable differences across various demographic subgroups. In this study, we examined the degree to which DIF in biographical data items referencing academically relevant background, experiences, and…
Descriptors: Test Bias, Gender Differences, Racial Differences, Biographical Inventories
Oliveri, Maria E.; Ercikan, Kadriye – Applied Measurement in Education, 2011
In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of…
Descriptors: Foreign Countries, English, French, Tests
