NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
No Child Left Behind Act 20011
What Works Clearinghouse Rating
Showing 1 to 15 of 341 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D. – International Journal of Testing, 2018
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Descriptors: Equated Scores, Test Bias, Test Items, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Tsaousis, Ioannis; Sideridis, Georgios; Al-Saawi, Fahad – International Journal of Testing, 2018
The aim of the present study was to examine Differential Distractor Functioning (DDF) as a means of improving the quality of a measure through understanding biased responses across groups. A DDF analysis could shed light on the potential sources of construct-irrelevant variance by examining whether the differential selection of incorrect choices…
Descriptors: Foreign Countries, College Entrance Examinations, Test Bias, Chemistry
Peer reviewed Peer reviewed
Direct linkDirect link
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Cao, Mengyang; Song, Q. Chelsea; Tay, Louis – International Journal of Testing, 2018
There is a growing use of noncognitive assessments around the world, and recent research has posited an ideal point response process underlying such measures. A critical issue is whether the typical use of dominance approaches (e.g., average scores, factor analysis, and the Samejima's graded response model) in scoring such measures is adequate.…
Descriptors: Comparative Analysis, Item Response Theory, Factor Analysis, Models
Peer reviewed Peer reviewed
Direct linkDirect link
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Martin-Raugh, Michelle P.; Anguiano-Carrsaco, Cristina; Jackson, Teresa; Brenneman, Meghan W.; Carney, Lauren; Barnwell, Patrick; Kochert, Jonathan – International Journal of Testing, 2018
Single-response situational judgment tests (SRSJTs) differ from multiple-response SJTs (MRSJTS) in that they present test takers with edited critical incidents and simply ask test takers to read over the action described and evaluate it according to its effectiveness. Research comparing the reliability and validity of SRSJTs and MRSJTs is thus far…
Descriptors: Test Format, Test Reliability, Test Validity, Predictive Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
Peer reviewed Peer reviewed
Direct linkDirect link
Kim, Sohee; Cole, Ki Lynn; Mwavita, Mwarumba – International Journal of Testing, 2018
This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The…
Descriptors: Test Items, Item Response Theory, Test Format, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Aksu Dunya, Beyza – International Journal of Testing, 2018
This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…
Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing
Peer reviewed Peer reviewed
Direct linkDirect link
Holmes, Stephen D.; Meadows, Michelle; Stockford, Ian; He, Qingping – International Journal of Testing, 2018
The relationship of expected and actual difficulty of items on six mathematics question papers designed for 16-year olds in England was investigated through paired comparison using experts and testing with students. A variant of the Rasch model was applied to the comparison data to establish a scale of expected difficulty. In testing, the papers…
Descriptors: Foreign Countries, Secondary School Students, Mathematics Tests, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Finney, Sara J.; Myers, Aaron J.; Mathers, Catherine E. – International Journal of Testing, 2018
Assessment specialists expend a great deal of energy to promote valid inferences from test scores gathered in low-stakes testing contexts. Given the indirect effect of perceived test importance on test performance via examinee effort, assessment practitioners have manipulated test instructions with the goal of increasing perceived test importance.…
Descriptors: Testing, Test Construction, Performance Factors, Guidelines
Peer reviewed Peer reviewed
Direct linkDirect link
Papageorgiou, Spiros; Choi, Ikkyu – International Journal of Testing, 2018
This study examined whether reporting subscores for groups of items within a test section assessing a second-language modality (specifically reading or listening comprehension) added value from a measurement perspective to the information already provided by the section scores. We analyzed the responses of 116,489 test takers to reading and…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Language Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Smith, Patriann; Frazier, Paul; Lee, Jaehoon; Chang, Rong – International Journal of Testing, 2018
Previous research has primarily addressed the effects of language on the Program for International Student Assessment (PISA) mathematics and science assessments. More recent research has focused on the effects of language on PISA reading comprehension and literacy assessments on student populations in specific Organization for Economic Cooperation…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Peer reviewed Peer reviewed
Direct linkDirect link
Wu, Amery D.; Chen, Michelle Y.; Stone, Jake E. – International Journal of Testing, 2018
This article investigates how test-takers change their strategies to handle increased test difficulty. An adult sample reported their test-taking strategies immediately after completing the tasks in a reading test. Data were analyzed using structural equation modeling specifying a measurement-invariant, ability-moderated, latent transition…
Descriptors: Test Wiseness, Reading Tests, Reading Comprehension, Difficulty Level
Previous Page | Next Page ยป
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  23