Publication Date
| In 2024 | 16 |
| Since 2023 | 39 |
| Since 2020 (last 5 years) | 144 |
| Since 2015 (last 10 years) | 314 |
| Since 2005 (last 20 years) | 494 |
Descriptor
| Testing Problems | 4806 |
| Elementary Secondary Education | 1257 |
| Test Validity | 996 |
| Test Construction | 797 |
| Standardized Tests | 785 |
| Higher Education | 655 |
| Test Reliability | 593 |
| Student Evaluation | 580 |
| Test Bias | 561 |
| Testing | 557 |
| Achievement Tests | 550 |
Audience
| Practitioners | 247 |
| Researchers | 218 |
| Teachers | 80 |
| Administrators | 35 |
| Policymakers | 33 |
| Parents | 15 |
| Counselors | 13 |
| Students | 5 |
| Community | 3 |
| Support Staff | 2 |
Location
| Canada | 52 |
| California | 44 |
| Australia | 43 |
| United Kingdom | 35 |
| United States | 32 |
| United Kingdom (England) | 31 |
| Netherlands | 26 |
| Florida | 25 |
| New York | 25 |
| United Kingdom (Great Britain) | 24 |
| China | 23 |
What Works Clearinghouse Rating
| Meets WWC Standards with or without Reservations | 1 |
Wind, Stefanie A. – Educational and Psychological Measurement, 2023
Rating scale analysis techniques provide researchers with practical tools for examining the degree to which ordinal rating scales (e.g., Likert-type scales or performance assessment rating scales) function in psychometrically useful ways. When rating scales function as expected, researchers can interpret ratings in the intended direction (i.e.,…
Descriptors: Rating Scales, Testing Problems, Item Response Theory, Models
Carlos Cinelli; Andrew Forney; Judea Pearl – Sociological Methods & Research, 2024
Many students of statistics and econometrics express frustration with the way a problem known as "bad control" is treated in the traditional literature. The issue arises when the addition of a variable to a regression equation produces an unintended discrepancy between the regression coefficient and the effect that the coefficient is…
Descriptors: Regression (Statistics), Robustness (Statistics), Error of Measurement, Testing Problems
Lewis, Jennifer; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2022
This module is designed for educators, educational researchers, and psychometricians who would like to develop an understanding of the basic concepts of validity theory, test validation, and documenting a "validity argument." It also describes how an in-depth understanding of the purposes and uses of educational tests sets the foundation…
Descriptors: Test Validity, Tests, Testing Problems, Faculty Development
Chang, Kuo-Feng – ProQuest LLC, 2022
This dissertation was designed to foster a deeper understanding of population invariance in the context of composite-score equating and provide practitioners with guidelines for addressing score equity concerns at the composite score level. The purpose of this dissertation was threefold. The first was to compare different composite equating…
Descriptors: Test Items, Equated Scores, Methods, Design
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that are appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
Paul T. von Hippel – Annenberg Institute for School Reform at Brown University, 2023
Longitudinal studies can produce biased estimates of learning if children miss tests. In an application to summer learning, we illustrate how missing test scores can create an illusion of large summer learning gaps when true gaps are close to zero. We demonstrate two methods that reduce bias by exploiting the correlations between missing and…
Descriptors: Testing Problems, Scores, Educational Research, Longitudinal Studies
Brunfaut, Tineke – Language Testing, 2023
In this invited Viewpoint on the occasion of the 40th anniversary of the journal "Language Testing," I argue that at the core of future challenges and opportunities for the field--both in scholarly and operational respects--remain basic questions and principles in language testing and assessment. Despite the high levels of sophistication…
Descriptors: Language Tests, Testing, Language Usage, Testing Problems
Pornphan Sureeyatanapas; Panitas Sureeyatanapas; Uthumporn Panitanarak; Jittima Kraisriwattana; Patchanan Sarootyanapat; Daniel O'Connell – Language Testing in Asia, 2024
Ensuring consistent and reliable scoring is paramount in education, especially in performance-based assessments. This study delves into the critical issue of marking consistency, focusing on speaking proficiency tests in English language learning, which often face greater reliability challenges. While existing literature has explored various…
Descriptors: Foreign Countries, Students, English Language Learners, Speech
Linda Borger; Stefan Johansson; Rolf Strietholt – Educational Assessment, Evaluation and Accountability, 2024
PISA aims to serve as a "global yardstick" for educational success, as measured by student performance. For comparisons to be meaningful across countries or over time, PISA samples must be representative of the population of 15-year-old students in each country. Exclusions and non-response can undermine this representativeness and…
Descriptors: Achievement Tests, International Assessment, Foreign Countries, Secondary School Students
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024
In order to ensure content validity by covering a broad range of content domains, the testing times of some educational large-scale assessments last up to a total of two hours or more. Performance decline over the course of taking the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…
Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries
Kalemdaroglu-Wheeler, Elif – ProQuest LLC, 2023
The purpose of this qualitative exploratory case study was to explore teachers' and administrators' perceptions of test score pollution deriving from COVID-19-related issues that may affect students' test scores on state-mandated standardized tests for grades six through 12 in a state along the Atlantic Coast of the United States. Four research…
Descriptors: Testing Problems, Scores, COVID-19, Pandemics
Reddy, Leelakrishna; Letswalo, Machaba Leanyatsa; Sefage, Amanda Percy; Kheswa, Bonginkosi Vincent; Balakrishna, Avula; Changundega, Jesman Moreblessing; Mvelase, Mashinga Johannes; Kheswa, Khayelihle Allen; Majola, Siyabonga Ntokozo Thandoluhle; Mathe, Themba; Seakamela, Teffo; Nemakhavhani, Thendo Emmanuel – Pedagogical Research, 2022
Integrity and quality of assessments on the online platform should be upheld to ensure that it supports student learning as well as the efficacy of teaching because in the end it measures the reputation of an institution. How institutions have traversed such domains remains a grey area. This paper provides anecdotal insights into how staff from a…
Descriptors: Computer Assisted Testing, Cheating, Foreign Countries, College Faculty
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing Examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items

