Showing 1 to 15 of 46 results
Baghaei, Purya; Aryadoust, Vahid – International Journal of Testing, 2015
Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a…
Descriptors: Test Format, Item Response Theory, Models, Test Items
Sinharay, Sandip; Haberman, Shelby J. – International Journal of Testing, 2014
Recently there has been an increasing level of interest in subtest scores, or subscores, for their potential diagnostic value. Haberman (2008) suggested a method to determine if a subscore has added value over the total score. Researchers have often been interested in the performance of subgroups--for example, those based on gender or…
Descriptors: Scores, Achievement Tests, Language Tests, English (Second Language)
Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…
Descriptors: Test Bias, Scores, International Programs, Educational Assessment
Fischer, Sebastian; Freund, Philipp Alexander – International Journal of Testing, 2014
The Adaption-Innovation Inventory (AII), originally developed by Kirton (1976), is a widely used self-report instrument for measuring problem-solving styles at work. The present study investigates how scores on the AII are affected by different response styles. Data are collected from a combined sample (N = 738) of students, employees, and…
Descriptors: Measures (Individuals), Scores, Item Response Theory, Response Style (Tests)
Allalouf, Avi – International Journal of Testing, 2014
The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…
Descriptors: Quality Control, Scoring, Test Theory, Scores
Lee, HyeSun; Geisinger, Kurt F. – International Journal of Testing, 2014
Differential item functioning (DIF) analysis is important in terms of test fairness. While DIF analyses have mainly been conducted with manifest grouping variables, such as gender or race/ethnicity, it has been recently claimed that not only the grouping variables but also contextual variables pertaining to examinees should be considered in DIF…
Descriptors: Test Bias, Gender Differences, Regression (Statistics), Statistical Analysis
Engelhard, George, Jr.; Kobrin, Jennifer L.; Wind, Stefanie A. – International Journal of Testing, 2014
The purpose of this study is to explore patterns in model-data fit related to subgroups of test takers from a large-scale writing assessment. Using data from the SAT, a calibration group was randomly selected to represent test takers who reported that English was their best language from the total population of test takers (N = 322,011). A…
Descriptors: College Entrance Examinations, Writing Tests, Goodness of Fit, English
DeMars, Christine E. – International Journal of Testing, 2013
This tutorial addresses possible sources of confusion in interpreting trait scores from the bifactor model. The bifactor model may be used when subscores are desired, either for formative feedback on an achievement test or for theoretically different constructs on a psychological test. The bifactor model is often chosen because it requires fewer…
Descriptors: Test Interpretation, Scores, Models, Correlation
Talento-Miller, Eileen; Guo, Fanmin; Han, Kyung T. – International Journal of Testing, 2013
When power tests include a time limit, it is important to assess the possibility of speededness for examinees. Past research on differential speededness has examined gender and ethnic subgroups in the United States on paper and pencil tests. When considering the needs of a global audience, research regarding different native language speakers is…
Descriptors: Adaptive Testing, Computer Assisted Testing, English, Scores
Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…
Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
Dodeen, Hamzeh; Abdelfattah, Faisal; Shumrani, Saleh; Hilal, Maher Abu – International Journal of Testing, 2012
This study focused on comparing mathematics teachers' qualifications, practices, and perceptions between Saudi and Taiwanese schools. Data analyzed in this study were the responses of mathematics teachers to the Teacher Background Questionnaire--8th Grade from the Trends in International Mathematics and Science Study (TIMSS) in 2007. The Saudi…
Descriptors: Grade 8, Teacher Background, Mathematics Teachers, Educational Environment
Oliveri, Maria Elena; Olson, Brent F.; Ercikan, Kadriye; Zumbo, Bruno D. – International Journal of Testing, 2012
In this study, the Canadian English and French versions of the Problem-Solving Measure of the Programme for International Student Assessment 2003 were examined to investigate their degree of measurement comparability at the item- and test-levels. Three methods of differential item functioning (DIF) were compared: parametric and nonparametric item…
Descriptors: Foreign Students, Test Bias, Speech Communication, Effect Size
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012
Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…
Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement
Xu, Lihua; Barnes, Laura L. B. – International Journal of Testing, 2011
Measurement invariance of the 8-factor Inventory of School Motivation (McInerney & Sinclair, 1991) between American and Chinese college students was tested using single-group and multi-group confirmatory factor analysis. A Mandarin Chinese version of the ISM was developed for this study. Comparisons of latent means were conducted when warranted by…
Descriptors: College Students, Factor Analysis, Positive Reinforcement, Mandarin Chinese

