Publication Date
| In 2015 | 2 |
| Since 2014 | 6 |
| Since 2011 (last 5 years) | 8 |
| Since 2006 (last 10 years) | 14 |
| Since 1996 (last 20 years) | 14 |
Descriptor
| Foreign Countries | 11 |
| Secondary School Students | 7 |
| Test Items | 7 |
| Mathematics Tests | 6 |
| Gender Differences | 5 |
| Item Response Theory | 5 |
| Scores | 5 |
| Test Bias | 5 |
| Comparative Analysis | 4 |
| Factor Analysis | 4 |
Source
| International Journal of… | 14 |
Author
| Cui, Ying | 2 |
| Babenko, Oksana | 1 |
| Baumeister, Antonia E. E. | 1 |
| Beaudoin, Isabelle | 1 |
| Berberoglu, Giray | 1 |
| Bobes, Julio | 1 |
| Bulut, Okan | 1 |
| Chu, Man-Wai | 1 |
| D'Agostino, Jerome | 1 |
| Engelhard, George, Jr. | 1 |
Publication Type
| Journal Articles | 14 |
| Reports - Research | 9 |
| Reports - Evaluative | 4 |
| Reports - Descriptive | 1 |
Education Level
| Secondary Education | 14 |
| High Schools | 5 |
| Grade 8 | 2 |
| Grade 9 | 2 |
| Higher Education | 2 |
| Junior High Schools | 2 |
| Middle Schools | 2 |
| Postsecondary Education | 2 |
| Elementary Education | 1 |
| Grade 12 | 1 |
Showing all 14 results
Cui, Ying; Mousavi, Amin – International Journal of Testing, 2015
The current study applied the person-fit statistic, l[subscript z], to data from a Canadian provincial achievement test to explore the usefulness of conducting person-fit analysis on large-scale assessments. Item parameter estimates were compared before and after the misfitting student responses, as identified by l[subscript z], were removed. The…
Descriptors: Measurement, Achievement Tests, Comparative Analysis, Test Items
Rindermann, Heiner; Baumeister, Antonia E. E. – International Journal of Testing, 2015
Scholastic tests regard cognitive abilities to be domain-specific competences. However, high correlations between competences indicate either high task similarity or a dependence on common factors. The present rating study examined the validity of 12 Programme for International Student Assessment (PISA) and Third or Trends in International…
Descriptors: Test Validity, Test Interpretation, Competence, Reading Tests
Kan, Adnan; Bulut, Okan – International Journal of Testing, 2014
This study investigated whether the linguistic complexity of items leads to gender differential item functioning (DIF) on mathematics assessments. Two forms of a mathematics test were developed. The first form consisted of algebra items based on mathematical expressions, terms, and equations. In the second form, the same items were written as word…
Descriptors: Gender Differences, Test Bias, Difficulty Level, Test Items
Chu, Man-Wai; Babenko, Oksana; Cui, Ying; Leighton, Jacqueline P. – International Journal of Testing, 2014
The study examines the role that perceptions or impressions of learning environments and assessments play in students' performance on a large-scale standardized test. Hierarchical linear modeling (HLM) was used to test aspects of the Learning Errors and Formative Feedback model to determine how much variation in students' performance was…
Descriptors: Hierarchical Linear Modeling, Secondary School Students, Student Attitudes, Educational Environment
Lee, HyeSun; Geisinger, Kurt F. – International Journal of Testing, 2014
Differential item functioning (DIF) analysis is important in terms of test fairness. While DIF analyses have mainly been conducted with manifest grouping variables, such as gender or race/ethnicity, it has been recently claimed that not only the grouping variables but also contextual variables pertaining to examinees should be considered in DIF…
Descriptors: Test Bias, Gender Differences, Regression (Statistics), Statistical Analysis
Engelhard, George, Jr.; Kobrin, Jennifer L.; Wind, Stefanie A. – International Journal of Testing, 2014
The purpose of this study is to explore patterns in model-data fit related to subgroups of test takers from a large-scale writing assessment. Using data from the SAT, a calibration group was randomly selected to represent test takers who reported that English was their best language from the total population of test takers (N = 322,011). A…
Descriptors: College Entrance Examinations, Writing Tests, Goodness of Fit, English
D'Agostino, Jerome; Karpinski, Aryn; Welsh, Megan – International Journal of Testing, 2011
After a test is developed, most content validation analyses shift from ascertaining domain definition to studying domain representation and relevance because the domain is assumed to be set once a test exists. We present an approach that allows for the examination of alternative domain structures based on extant test items. In our example based on…
Descriptors: Expertise, Test Items, Mathematics Tests, Factor Analysis
Svetina, Dubravka; Gorin, Joanna S.; Tatsuoka, Kikumi K. – International Journal of Testing, 2011
As a construct definition, the current study develops a cognitive model describing the knowledge, skills, and abilities measured by critical reading test items on a high-stakes assessment used for selection decisions in the United States. Additionally, in order to establish generalizability of the construct meaning to other similarly structured…
Descriptors: Reading Tests, Reading Comprehension, Critical Reading, Test Items
Lee, John Chi-kin; Yin, Hongbiao; Zhang, Zhonghua – International Journal of Testing, 2010
This article reports the adaptation and analysis of Pintrich's Motivated Strategies for Learning Questionnaire (MSLQ) in Hong Kong. First, this study examined the psychometric qualities of the existing Chinese version of MSLQ (MSLQ-CV). Based on this examination, this study developed a revised Chinese version of MSLQ (MSLQ-RCV) for junior…
Descriptors: Foreign Countries, Questionnaires, Psychometrics, Secondary School Students
Fonseca-Pedrero, Eduardo; Wells, Craig; Paino, Mercedes; Lemos-Giraldez, Serafin; Villazon-Garcia, Ursula; Sierra, Susana; Garcia-Portilla Gonzalez, Ma Paz; Bobes, Julio; Muniz, Jose – International Journal of Testing, 2010
The main objective of the present study was to examine measurement invariance of the Reynolds Adolescent Depression Scale (RADS) (Reynolds, 1987) across gender and age in a representative sample of nonclinical adolescents. The sample was composed of 1,659 participants, 801 males (48.3%), with a mean age of 15.9 years (SD = 1.2). Confirmatory…
Descriptors: Measurement Techniques, Measures (Individuals), Factor Analysis, Depression (Psychology)
Sevigny, Serge; Savard, Denis; Beaudoin, Isabelle – International Journal of Testing, 2009
Very few empirically based studies have denied or confirmed the validity of holistic score interpretations and the validity of French-English writing scores comparisons. The present study addresses these important issues. Part I investigates if adjacent holistic scores represent different writing skills. Part II evaluates if variations exposed in…
Descriptors: Writing Evaluation, Holistic Evaluation, Scores, Comparative Analysis
Le, Luc T. – International Journal of Testing, 2009
This study uses PISA cycle 3 field trial data to investigate the relationships between gender differential item functioning (DIF) across countries and test languages for science items and their formats and the four other dimensions defined in PISA framework: focus, context, competency, and scientific knowledge. The data used were collected from 60…
Descriptors: Test Bias, Gender Bias, Science Tests, Test Items
Yildirim, Huseyin Husnu; Berberoglu, Giray – International Journal of Testing, 2009
Comparisons of human characteristics across different language groups and cultures become more important in today's educational assessment practices as evidenced by the increasing interest in international comparative studies. Within this context, the fairness of the results across different language and cultural groups draws the attention of…
Descriptors: Test Bias, Cross Cultural Studies, Comparative Analysis, Factor Analysis
Schechtman, Edna; Yitzhaki, Shlomo – International Journal of Testing, 2009
The huge technological improvement in data processing and the globalization have increased the demand for and the supply of indices that quantify the consequences of a policy. However, there are certain cases in which quantification may be misleading in the sense that it gives the impression of an accurate measurement while in reality it is not.…
Descriptors: Ability, Measurement, Classification, Students

