Showing 1 to 15 of 330 results
Peer reviewed
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D. – International Journal of Testing, 2018
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Descriptors: Equated Scores, Test Bias, Test Items, Difficulty Level
Peer reviewed
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Peer reviewed
Tsaousis, Ioannis; Sideridis, Georgios; Al-Saawi, Fahad – International Journal of Testing, 2018
The aim of the present study was to examine Differential Distractor Functioning (DDF) as a means of improving the quality of a measure through understanding biased responses across groups. A DDF analysis could shed light on the potential sources of construct-irrelevant variance by examining whether the differential selection of incorrect choices…
Descriptors: Foreign Countries, College Entrance Examinations, Test Bias, Chemistry
Peer reviewed
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
Peer reviewed
Pishghadam, Reza; Baghaei, Purya; Seyednozadi, Zahra – International Journal of Testing, 2017
This article attempts to present emotioncy as a potential source of test bias to inform the analysis of test item performance. Emotioncy is defined as a hierarchy, ranging from "exvolvement" (auditory, visual, and kinesthetic) to "involvement" (inner and arch), to emphasize the emotions evoked by the senses. This study…
Descriptors: Test Bias, Item Response Theory, Test Items, Psychological Patterns
Peer reviewed
Kajonius, Petri J.; Dåderman, Anna M. – International Journal of Testing, 2017
Previous research has long advocated that emotional and behavioral disorders are related to general personality traits, such as the Five Factor Model (FFM). The addition of section III in the latest "Diagnostic and Statistical Manual of Mental Disorders" (DSM) recommends that extremity in personality traits together with maladaptive…
Descriptors: Personality Problems, Empathy, Personality Traits, Scores
Peer reviewed
Wiberg, Marie; von Davier, Alina A. – International Journal of Testing, 2017
We propose a comprehensive procedure for the implementation of a quality control process of anchor tests for a college admissions test with multiple consecutive administrations. We propose to examine the anchor tests and their items in connection with covariates to investigate if there was any unusual behavior in the anchor test results over time…
Descriptors: College Entrance Examinations, Test Items, Equated Scores, Quality Control
Peer reviewed
Evers, Arne; McCormick, Carina M.; Hawley, Leslie R.; Muñiz, José; Balboni, Giulia; Bartram, Dave; Boben, Dusica; Egeland, Jens; El-Hassan, Karma; Fernández-Hermida, José R.; Fine, Saul; Frans, Örjan; Gintiliené, Grazina; Hagemeister, Carmen; Halama, Peter; Iliescu, Dragos; Jaworowska, Aleksandra; Jiménez, Paul; Manthouli, Marina; Matesic, Krunoslav; Michaelsen, Lars; Mogaji, Andrew; Morley-Kirk, James; Rózsa, Sándor; Rowlands, Lorraine; Schittekatte, Mark; Sümer, H. Canan; Suwartono, Tono; Urbánek, Tomáš; Wechsler, Solange; Zelenevska, Tamara; Zanev, Svetoslav; Zhang, Jianxin – International Journal of Testing, 2017
On behalf of the International Test Commission and the European Federation of Psychologists' Associations, a world-wide survey on the opinions of professional psychologists on testing practices was carried out. The main objective of this study was to collect data for a better understanding of the state of psychological testing worldwide. These data…
Descriptors: Testing, Attitudes, Surveys, Psychologists
Peer reviewed
Rios, Joseph A.; Guo, Hongwen; Mao, Liyang; Liu, Ou Lydia – International Journal of Testing, 2017
When examinees' test-taking motivation is questionable, practitioners must determine whether careless responding is of practical concern and if so, decide on the best approach to filter such responses. As there has been insufficient research on these topics, the objectives of this study were to: a) evaluate the degree of underestimation in the…
Descriptors: Response Style (Tests), Scores, Motivation, Computation
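The filtering step this abstract alludes to can be illustrated with a toy sketch. All data and the response-time threshold below are hypothetical assumptions for illustration, not the study's actual procedure: examinees whose median response time suggests rapid guessing are dropped before the mean score is recomputed.

```python
# Hedged sketch: filter likely-careless examinees by response time,
# then compare the unfiltered and filtered mean scores.
records = [
    {"score": 0.82, "median_rt_sec": 14.0},
    {"score": 0.35, "median_rt_sec": 1.2},   # implausibly fast: likely rapid guessing
    {"score": 0.78, "median_rt_sec": 11.5},
    {"score": 0.30, "median_rt_sec": 0.9},   # implausibly fast: likely rapid guessing
]

THRESHOLD = 3.0  # seconds; hypothetical cutoff for flagging rapid responding

unfiltered = sum(r["score"] for r in records) / len(records)
motivated = [r for r in records if r["median_rt_sec"] >= THRESHOLD]
filtered = sum(r["score"] for r in motivated) / len(motivated)
```

The gap between `unfiltered` and `filtered` is the kind of score underestimation the study sets out to quantify.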
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
Peer reviewed
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
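The reliability criterion named in this abstract is the ratio of true-score variance to observed-score variance. A minimal sketch of that ratio, assuming simple simulated scores rather than the study's IRT simulation design:

```python
import random

random.seed(42)

# Illustrative assumption: observed score = true score + independent noise.
n = 10000
true_scores = [random.gauss(0, 1) for _ in range(n)]
observed = [t + random.gauss(0, 0.5) for t in true_scores]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Classical definition: reliability = var(true) / var(observed).
reliability = variance(true_scores) / variance(observed)
# With error SD 0.5, this lands near 1 / (1 + 0.25) = 0.8.
```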
Peer reviewed
Arce-Ferrer, Alvaro J.; Bulut, Okan – International Journal of Testing, 2017
This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…
Descriptors: Item Response Theory, Equated Scores, Identification, Computation
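The scale transformation coefficients mentioned here can be illustrated with the classic mean-sigma method. This is a simplification: the study works with three-parameter IRT equating, and the item difficulties below are made-up values.

```python
import statistics

# Hypothetical b-parameters of the common items on two calibrations.
b_old = [-1.2, -0.4, 0.1, 0.8, 1.5]
b_new = [-1.0, -0.2, 0.3, 1.0, 1.7]  # same items, new-form calibration

# Mean-sigma coefficients mapping the new scale onto the old one:
# A rescales the spread, B shifts the location.
A = statistics.stdev(b_old) / statistics.stdev(b_new)
B = statistics.mean(b_old) - A * statistics.mean(b_new)

b_rescaled = [A * b + B for b in b_new]
```

After the transformation, `b_rescaled` sits on the old form's scale; large residuals between `b_rescaled` and `b_old` for individual items are one signal of item parameter drift.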
Peer reviewed
Lievens, Filip – International Journal of Testing, 2017
A common theme running through recent research on Situational Judgment Tests (SJTs) and this special issue is the aim to improve the measurement of constructs via SJTs. Construct-driven SJTs differ from traditional SJTs in that they present a trait activating situation to test-takers and a more unidimensional set of response options that depict…
Descriptors: Research Needs, Agenda Setting, Construct Validity, Measurement Techniques
Peer reviewed
Prasad, Joshua J.; Showler, Morgan B.; Schmitt, Neal; Ryan, Ann Marie; Nye, Christopher D. – International Journal of Testing, 2017
The present research compares the operation of situational judgement and biodata measures between Chinese and U.S. respondents. We describe the development and past research on both measures, followed by hypothesized differences across the two groups of respondents. We base hypotheses on the nature of the Chinese and U.S. educational systems and…
Descriptors: Measures (Individuals), Hypothesis Testing, Cross Cultural Studies, Comparative Analysis
Peer reviewed
Guenole, Nigel; Chernyshenko, Oleksandr S.; Weekley, Jeff – International Journal of Testing, 2017
Situational judgment tests (SJTs) are widely agreed to be a measurement technique. It is also widely agreed that SJTs are a questionable methodological choice for measurement of psychological constructs, such as behavioral competencies, due to a lack of evidence supporting appropriate factor structures and high internal consistencies.…
Descriptors: Situational Tests, Psychological Evaluation, Test Construction, Industrial Psychology