Showing 1 to 15 of 220 results
Peer reviewed
Visser, Linda; Cartschau, Friederike; von Goldammer, Ariane; Brandenburg, Janin; Timmerman, Marieke; Hasselhorn, Marcus; Mähler, Claudia – Applied Measurement in Education, 2023
The growing number of children in primary schools in Germany who have German as their second language (L2) has raised questions about the fairness of performance assessment. Fair tests are a prerequisite for distinguishing between L2 learning delay and a specific learning disability. We evaluated five commonly used reading and spelling tests for…
Descriptors: Foreign Countries, Error of Measurement, Second Language Learning, German
Peer reviewed
Poe, Mya; Oliveri, Maria Elena; Elliot, Norbert – Applied Measurement in Education, 2023
Since 1952, the "Standards for Educational and Psychological Testing" has provided criteria for developing and evaluating educational and psychological tests and testing practice. Yet, we argue that the foundations, operations, and applications in the "Standards" are no longer sufficient to meet the current U.S. testing demands…
Descriptors: Racism, Social Justice, Standards, Psychological Testing
Peer reviewed
Lederman, Josh – Applied Measurement in Education, 2023
Given its centrality to assessment, until the concept of validity includes concern for racial justice, such matters will be seen as residing outside the "real" work of validation, rendering them powerless to count against the apparent scientific merit of the test. As the definition of validity has evolved, however, it holds great…
Descriptors: Educational Assessment, Validity, Social Justice, Race
Peer reviewed
Russell, Michael – Applied Measurement in Education, 2023
In recent years, issues of race, racism and social justice have garnered increased attention across the nation. Although some aspects of social justice, particularly cultural sensitivity and test bias, have received similar attention within the field of educational measurement, a sharp focus on racism has eluded the field. This manuscript focuses…
Descriptors: Racism, Social Justice, Theories, Race
Peer reviewed
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Peer reviewed
Peabody, Michael R. – Applied Measurement in Education, 2020
The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models will be provided. A brief discussion of the conditions required…
Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory
Peer reviewed
Geisinger, Kurt F. – Applied Measurement in Education, 2019
This brief article introduces the topic of intelligence as highly appropriate for educational measurement professionals. It describes some of the uses of intelligence tests both historically and currently. It argues why knowledge of intelligence theory and intelligence testing is important for educational measurement professionals. The articles…
Descriptors: Intelligence Tests, Intelligence, Models, Educational Assessment
Peer reviewed
Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…
Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests
Peer reviewed
McGill, Ryan J.; Dombrowski, Stefan C. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) model presently serves as a blueprint for both test development and a taxonomy for clinical interpretation of modern tests of cognitive ability. Accordingly, the trend among test publishers has been toward creating tests that provide users with an ever-increasing array of scores that comport with CHC. However, an…
Descriptors: Models, Cognitive Ability, Intelligence Tests, Intelligence
Peer reviewed
Thompson, W. Jake; Clark, Amy K.; Nash, Brooke – Applied Measurement in Education, 2019
As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper is to summarize one simulation-based method for estimating and reporting reliability for an…
Descriptors: Test Reliability, Diagnostic Tests, Classification, Computation
Peer reviewed
Confrey, Jere; Toutkoushian, Emily; Shah, Meetal – Applied Measurement in Education, 2019
Fully articulating validation arguments in the context of classroom assessment requires connecting evidence from multiple sources and addressing multiple types of validity in a coherent chain of reasoning. This type of validation argument is particularly complex for assessments that function in close proximity to instruction, address the fine…
Descriptors: Test Validity, Item Response Theory, Middle School Students, Mathematics Instruction
Peer reviewed
Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation
Peer reviewed
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Peer reviewed
Gotwals, Amelia Wenk – Applied Measurement in Education, 2018
In this commentary, I consider the three empirical studies in this special issue based on two main aspects: (a) the nature of the learning progressions and (b) what formative assessment practice(s) were investigated. Specifically, I describe differences among the learning progressions in terms of scope and grain size. I also identify three…
Descriptors: Skill Development, Behavioral Objectives, Formative Evaluation, Evaluation Methods
Peer reviewed
Shermis, Mark D. – Applied Measurement in Education, 2018
This article employs the Common European Framework Reference for Language Acquisition (CEFR) as a basis for evaluating writing in the context of machine scoring. The CEFR was designed as a framework for evaluating proficiency levels of speaking for the 49 languages comprising the European Union. The intent was to impact language instruction so…
Descriptors: Scoring, Automation, Essays, Language Proficiency