Publication Date
  In 2023: 0
  Since 2022: 14
  Since 2019 (last 5 years): 88
  Since 2014 (last 10 years): 199
  Since 2004 (last 20 years): 377
Source
  Applied Measurement in Education: 694
Audience
  Researchers: 3
  Teachers: 2
  Administrators: 1
Location
  Canada: 13
  California: 7
  Netherlands: 6
  Australia: 5
  Israel: 5
  New York: 5
  North Carolina: 5
  Texas: 5
  United States: 5
  Arizona: 4
  Florida: 4
Laws, Policies, & Programs
  No Child Left Behind Act 2001: 8
  Race to the Top: 1
Han, Yuting; Wilson, Mark – Applied Measurement in Education, 2022
A technology-based problem-solving test can automatically capture every action students take while completing tasks and save those actions as process data. Response sequences are the external manifestations of students' latent intellectual activities, and they contain rich information about students' abilities and different problem-solving…
Descriptors: Technology Uses in Education, Problem Solving, 21st Century Skills, Evaluation Methods
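As an aside to the Han and Wilson entry, here is a minimal sketch of how raw process-data logs might be collapsed into response sequences and simple sequence features. The log format, action labels, and bigram encoding are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# Hypothetical process-data log: one (student, action) event per row, in time order.
log = [
    ("s1", "start"), ("s1", "open_tab"), ("s1", "drag"), ("s1", "submit"),
    ("s2", "start"), ("s2", "drag"), ("s2", "drag"), ("s2", "submit"),
]

# 1) Collapse the event stream into one action sequence per student.
sequences = {}
for student, action in log:
    sequences.setdefault(student, []).append(action)

# 2) Encode each sequence as action-bigram counts, a simple feature
#    representation that keeps some of the ordering information.
features = {student: Counter(pairwise(seq)) for student, seq in sequences.items()}

print(sequences["s1"])                    # ['start', 'open_tab', 'drag', 'submit']
print(features["s2"][("drag", "drag")])   # 1
```

Features of this kind are only one possible input; sequence models or latent-variable models could be fed the same sequences directly.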
Katz, Daniel; Huggins-Manley, Anne Corinne; Leite, Walter – Applied Measurement in Education, 2022
According to the "Standards for Educational and Psychological Testing" (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student…
Descriptors: High Stakes Tests, Test Bias, Testing Problems, Prior Learning
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to assess students' cognitive understanding efficiently. One promising approach involves easily adapted and administered item types that yield quantitative scores interpretable in terms of whether students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
van Alphen, Thijmen; Jak, Suzanne; Jansen in de Wal, Joost; Schuitema, Jaap; Peetsma, Thea – Applied Measurement in Education, 2022
Intensive longitudinal data are increasingly used to study state-like processes such as changes in daily stress. Measures aimed at collecting such data require the same level of scrutiny regarding scale reliability as traditional questionnaires. The most prevalent methods used to assess the reliability of intensive longitudinal measures are based on…
Descriptors: Test Reliability, Measures (Individuals), Anxiety, Data Collection
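As background to the van Alphen et al. entry, reliability indices for intensive longitudinal measures commonly start from a multilevel variance decomposition; for example, the intraclass correlation

\[
\mathrm{ICC} = \frac{\sigma^{2}_{\text{between}}}{\sigma^{2}_{\text{between}} + \sigma^{2}_{\text{within}}}
\]

gives the share of observed score variance attributable to stable between-person differences. Which specific reliability methods the article evaluates is not stated in the excerpt above; this decomposition is offered only as a point of reference.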
Clark, Amy K.; Nash, Brooke; Karvonen, Meagan – Applied Measurement in Education, 2022
Assessments scored with diagnostic models are increasingly popular because they provide fine-grained information about student achievement. Because of differences in how diagnostic assessments are scored and how results are used, the information teachers must know to interpret and use results may differ from concepts traditionally included in…
Descriptors: Elementary School Teachers, Secondary School Teachers, Assessment Literacy, Diagnostic Tests
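For readers unfamiliar with how diagnostic models score responses, one widely used example (not necessarily the model underlying the assessment Clark, Nash, and Karvonen discuss) is the DINA model:

\[
P(X_{ij}=1 \mid \boldsymbol{\alpha}_{i}) = (1 - s_{j})^{\eta_{ij}}\, g_{j}^{\,1-\eta_{ij}},
\qquad
\eta_{ij} = \prod_{k} \alpha_{ik}^{\,q_{jk}},
\]

where \(s_{j}\) and \(g_{j}\) are the item's slip and guessing parameters and \(q_{jk}\) indicates whether item \(j\) requires attribute \(k\). Results take the form of attribute mastery profiles rather than a single scale score, which is part of why interpreting them calls for knowledge beyond traditional assessment literacy content.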
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article examines the performance of item response theory (IRT) models when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
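For context on the Song and Lee entry, the generalized partial credit model gives the probability that an examinee with proficiency \(\theta\) earns score \(k\) on item \(j\) as

\[
P(X_{j}=k \mid \theta)
= \frac{\exp\!\left(\sum_{v=1}^{k} a_{j}\,(\theta - b_{jv})\right)}
       {\sum_{c=0}^{m_{j}} \exp\!\left(\sum_{v=1}^{c} a_{j}\,(\theta - b_{jv})\right)},
\qquad k = 0, 1, \ldots, m_{j},
\]

with the empty sum for \(k=0\) defined as zero, \(a_{j}\) the item discrimination, and \(b_{jv}\) the step difficulties. How single versus double ratings enter such models as item scores is the design question the two studies examine.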
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
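A minimal sketch of how rapid guesses are often flagged in practice, using an item-level response-time threshold. The threshold rule (10% of each item's mean response time) and the simulated data are illustrative assumptions, not the procedure used in the Abulela and Rios simulation study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical response times (seconds), examinees x items.
rt = rng.lognormal(mean=3.0, sigma=0.5, size=(500, 20))
rt[:25] *= 0.05  # a small group responding implausibly fast

# Normative-threshold-style rule: a response is a rapid guess if it is
# faster than 10% of the item's mean response time (one common heuristic).
thresholds = 0.10 * rt.mean(axis=0)   # one threshold per item
rapid_guess = rt < thresholds         # boolean flags, same shape as rt

print("flagged responses:", rapid_guess.sum())
print("RG rate per examinee (first 5):", rapid_guess.mean(axis=1)[:5])
```

Subgroup differences in these flags are exactly the kind of differential rapid guessing whose effect on item-level invariance the study manipulates.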
Silva Diaz, John Alexander; Köhler, Carmen; Hartig, Johannes – Applied Measurement in Education, 2022
Testing item fit is central in item response theory (IRT) modeling, since good fit is necessary to draw valid inferences from estimated model parameters. "Infit" and "outfit" fit statistics, widely used indices for detecting deviations from the Rasch model, are affected by data factors, such as sample size. Consequently, the…
Descriptors: Intervals, Item Response Theory, Item Analysis, Inferences
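For reference, with standardized residuals \(z_{ni} = (x_{ni} - E[x_{ni}]) / \sqrt{\operatorname{Var}(x_{ni})}\) for person \(n\) on item \(i\) under the Rasch model, the two statistics named in the Silva Diaz, Köhler, and Hartig entry are

\[
\text{Outfit}_i = \frac{1}{N}\sum_{n=1}^{N} z_{ni}^{2},
\qquad
\text{Infit}_i = \frac{\sum_{n=1}^{N} \operatorname{Var}(x_{ni})\, z_{ni}^{2}}{\sum_{n=1}^{N} \operatorname{Var}(x_{ni})},
\]

that is, an unweighted and an information-weighted mean of squared standardized residuals, each with expected value near 1 under good fit. Their sampling variability shrinks as \(N\) grows, which is one reason fixed cutoff values behave differently across data conditions.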
Jonson, Jessica L. – Applied Measurement in Education, 2022
This article describes a grant project that generated a technical guide for PK-12 educators who are utilizing social and emotional learning (SEL) assessments for educational improvement purposes. The guide was developed over a two-year period with funding from the Spencer Foundation. The result was the collective contribution of a widely…
Descriptors: Measurement Techniques, Tests, Preschool Teachers, Kindergarten
Lions, Séverin; Monsalve, Carlos; Dartnell, Pablo; Blanco, María Paz; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2022
Multiple-choice tests are widely used in education, often for high-stakes assessment purposes. Consequently, these tests should be constructed following the highest standards. Many efforts have been undertaken to advance item-writing guidelines intended to improve tests. One important issue is the unwanted effects of the options' position on test…
Descriptors: Multiple Choice Tests, High Stakes Tests, Test Construction, Guidelines
Xu, Jiajun; Dadey, Nathan – Applied Measurement in Education, 2022
This paper explores how Bayesian networks can be used to summarize student performance across the full set of modular assessments of individual standards, which we refer to as mini-assessments, from a large-scale, operational interim assessment program. We follow a completely data-driven approach in which no constraints are imposed to…
Descriptors: Bayesian Statistics, Learning Analytics, Scores, Academic Achievement
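A minimal sketch, under the assumption of a data frame with one discrete mini-assessment score per column, of the kind of unconstrained, data-driven structure learning the Xu and Dadey entry describes. The use of pgmpy, hill climbing, and a BIC score here is an illustrative choice, not necessarily the authors' implementation.

```python
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore
from pgmpy.models import BayesianNetwork

# Hypothetical discrete scores (0/1/2) on four mini-assessments.
data = pd.DataFrame({
    "MA_1": [0, 1, 2, 2, 1, 0, 2, 1],
    "MA_2": [0, 1, 2, 1, 1, 0, 2, 2],
    "MA_3": [1, 1, 2, 2, 0, 0, 2, 1],
    "MA_4": [0, 2, 2, 1, 1, 0, 1, 1],
})

# Learn the network structure directly from the data, with no constraints
# on which mini-assessments may be connected.
structure = HillClimbSearch(data).estimate(scoring_method=BicScore(data))

# Fit conditional probability tables for the learned structure.
model = BayesianNetwork(structure.edges())
model.fit(data)

print(sorted(structure.edges()))
```

With real interim-assessment data, the learned edges summarize which mini-assessments carry information about performance on the others.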
Ferrara, Steve; Steedle, Jeffrey T.; Frantz, Roger S. – Applied Measurement in Education, 2022
Item difficulty modeling studies involve (a) hypothesizing item features, or item response demands, that are likely to predict item difficulty with some degree of accuracy; and (b) entering the features as independent variables into a regression equation or other statistical model to predict difficulty. In this review, we report findings from 13…
Descriptors: Reading Comprehension, Reading Tests, Test Items, Item Response Theory
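A minimal sketch of step (b) in the Ferrara, Steedle, and Frantz entry: regressing empirical item difficulty on coded item features. The features and values below are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical coded features for six reading-comprehension items:
# [passage length (hundreds of words), inference required (0/1), vocabulary load (1-3)]
X = np.array([
    [2.0, 0, 1],
    [3.5, 1, 2],
    [1.5, 0, 1],
    [4.0, 1, 3],
    [2.5, 0, 2],
    [3.0, 1, 2],
])
# IRT difficulty (b) estimates for the same items, from a prior calibration.
b = np.array([-0.8, 0.6, -1.1, 1.3, -0.2, 0.4])

model = LinearRegression().fit(X, b)
print("feature weights:", model.coef_)
print("R^2:", model.score(X, b))  # proportion of difficulty variance explained
```

A fit statistic such as R^2 then summarizes how accurately the hypothesized features predict difficulty, which is the degree of accuracy the review compares across studies.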
Pools, Elodie – Applied Measurement in Education, 2022
Many low-stakes assessments, such as international large-scale surveys, are administered during time-limited testing sessions, and some test-takers are unable to respond to the last items of the test, resulting in not-reached (NR) items. However, because the test has no consequences for the respondents, these NR items can also stem from quitting the…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
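A minimal sketch of how not-reached items are typically identified (a trailing run of missing responses) and of two common scoring treatments. The response matrix is invented, and the treatments shown are generic options rather than the approach evaluated in the Pools article.

```python
import numpy as np

# Hypothetical 0/1 responses with NaN for missing; each row is one test-taker.
resp = np.array([
    [1, 0, 1, 1, 0, 1, np.nan, np.nan],   # last two items not reached
    [1, 1, np.nan, 1, 0, 0, 1, 1],        # internal omission only, nothing not reached
    [0, 1, 1, np.nan, np.nan, np.nan, np.nan, np.nan],
])

def not_reached_mask(row):
    """Flag the trailing run of missing responses as not reached."""
    mask = np.zeros(row.size, dtype=bool)
    for j in range(row.size - 1, -1, -1):
        if np.isnan(row[j]):
            mask[j] = True
        else:
            break
    return mask

nr = np.array([not_reached_mask(r) for r in resp])

# Treatment 1: score NR (and other missing) responses as incorrect.
score_as_wrong = np.where(np.isnan(resp), 0, resp).sum(axis=1)
# Treatment 2: treat NR items as not administered (any remaining missing
# responses are also ignored here, for simplicity).
score_ignoring_nr = np.nansum(np.where(nr, np.nan, resp), axis=1)

print(nr.sum(axis=1))                       # NR items per test-taker: [2 0 5]
print(score_as_wrong, score_ignoring_nr)
```

Distinguishing genuine time pressure from quitting, which this simple trailing-missing rule cannot do, is precisely the problem the article addresses.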
Rios, Joseph A. – Applied Measurement in Education, 2022
Testing programs are confronted with the decision of whether to report individual scores for examinees that have engaged in rapid guessing (RG). As noted by the "Standards for Educational and Psychological Testing," this decision should be based on a documented criterion that determines score exclusion. To this end, a number of heuristic…
Descriptors: Testing, Guessing (Tests), Academic Ability, Scores
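Continuing the response-time flags from the earlier sketch, a minimal illustration of a documented exclusion criterion of the kind the Rios entry refers to: report a score only if an examinee's rapid-guessing rate stays below a pre-specified cutoff. The 10% cutoff is an arbitrary illustration, not a recommendation from the article.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical item-level rapid-guessing flags (examinees x items),
# e.g., produced by a response-time threshold rule.
rapid_guess = rng.random((1000, 40)) < 0.03
rapid_guess[:30, :20] = True                # a disengaged subgroup

rg_rate = rapid_guess.mean(axis=1)          # proportion of RG responses per examinee

CUTOFF = 0.10                               # documented criterion (illustrative only)
report_score = rg_rate < CUTOFF

print(f"scores suppressed: {(~report_score).sum()} of {report_score.size}")
```

Where to set such a cutoff, and on what evidence, is exactly the question the heuristic approaches reviewed in the article try to answer.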
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, interest in measuring growth in student ability in various subjects between school grades has increased, so good precision in the estimated growth is important. This paper compares estimation methods and test designs with respect to the precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
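As a point of reference for "precision and bias of the estimated growth" in the Bjermo and Miller entry: if the true growth in mean ability between two grades is \(\Delta = \mu_{2} - \mu_{1}\) and \(\hat{\Delta}_{r}\) is its estimate in simulation replication \(r\), the criteria typically reported in such comparisons are

\[
\mathrm{Bias}(\hat{\Delta}) = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\Delta}_{r} - \Delta\bigr),
\qquad
\mathrm{RMSE}(\hat{\Delta}) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\Delta}_{r} - \Delta\bigr)^{2}},
\]

with smaller RMSE indicating higher precision. Whether the article reports exactly these criteria is an assumption; they are the standard choices when comparing estimators and test designs by simulation.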