Publication Date
| In 2015 | 5 |
| Since 2014 | 25 |
| Since 2011 (last 5 years) | 71 |
| Since 2006 (last 10 years) | 170 |
| Since 1996 (last 20 years) | 359 |
Source
| Applied Measurement in… | 520 |
Author
| Hambleton, Ronald K. | 15 |
| Plake, Barbara S. | 9 |
| Shavelson, Richard J. | 9 |
| Sireci, Stephen G. | 9 |
| Ercikan, Kadriye | 8 |
| Engelhard, George, Jr. | 7 |
| Feldt, Leonard S. | 7 |
| Linn, Robert L. | 7 |
| Pomplun, Mark | 7 |
| Wise, Steven L. | 7 |
Education Level
| Elementary Secondary Education | 30 |
| Grade 8 | 21 |
| High Schools | 21 |
| Higher Education | 21 |
| Secondary Education | 19 |
| Elementary Education | 17 |
| Grade 5 | 16 |
| Middle Schools | 14 |
| Grade 4 | 13 |
| Grade 3 | 12 |
Audience
| Researchers | 3 |
| Teachers | 2 |
| Administrators | 1 |
Showing 16 to 30 of 520 results
Chia, Magda Y. – Applied Measurement in Education, 2014
The Smarter Balanced Assessment Consortium (Smarter Balanced) serves over 19 million primary, middle, and high school students from across 26 states and affiliates (Smarter Balanced, n.d.). As one of the two Race to the Top (RTT)-funded assessment consortia, Smarter Balanced is responsible for developing formative, interim, and summative…
Descriptors: State Standards, Academic Standards, Educational Assessment, English Language Learners
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Humphry, Stephen; Heldsinger, Sandra; Andrich, David – Applied Measurement in Education, 2014
One of the best-known methods for setting a benchmark standard on a test is that of Angoff and its modifications. When scored dichotomously, judges estimate the probability that a benchmark student has of answering each item correctly. As in most methods of standard setting, it is assumed implicitly that the unit of the latent scale of the…
Descriptors: Foreign Countries, Standard Setting (Scoring), Judges, Item Response Theory
Welsh, Megan E.; Eastwood, Melissa; D'Agostino, Jerome V. – Applied Measurement in Education, 2014
Teacher and school accountability systems based on high-stakes tests are ubiquitous throughout the United States and appear to be growing as a catalyst for reform. As a result, educators have increased the proportion of instructional time devoted to test preparation. Although guidelines for what constitutes appropriate and inappropriate test…
Descriptors: High Stakes Tests, Instruction, Test Preparation, Grade 3
Eklöf, Hanna; Pavešic, Barbara Japelj; Grønmo, Liv Sissel – Applied Measurement in Education, 2014
The purpose of the study was to measure students' reported test-taking effort and the relationship between reported effort and performance on the Trends in International Mathematics and Science Study (TIMSS) Advanced mathematics test. This was done in three countries participating in TIMSS Advanced 2008 (Sweden, Norway, and Slovenia), and the…
Descriptors: Mathematics Tests, Cross Cultural Studies, Foreign Countries, Correlation
Deunk, Marjolein I.; van Kuijk, Mechteld F.; Bosker, Roel J. – Applied Measurement in Education, 2014
Standard setting methods, like the Bookmark procedure, are used to assist education experts in formulating performance standards. Small group discussion is meant to help these experts in setting more reliable and valid cutoff scores. This study is an analysis of 15 small group discussions during two standards setting trajectories and their effect…
Descriptors: Cutting Scores, Standard Setting, Group Discussion, Reading Tests
Schweig, Jonathan David – Applied Measurement in Education, 2014
Developing indicators that reflect important aspects of school and classroom environments has become central in a nationwide effort to develop comprehensive programs that measure teacher quality and effectiveness. Formulating teacher evaluation policy necessitates accurate and reliable methods for measuring these environmental variables. This…
Descriptors: Error of Measurement, Educational Environment, Classroom Environment, Surveys
Rutkowski, Leslie – Applied Measurement in Education, 2014
Large-scale assessment programs such as the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA) use a sophisticated assessment administration design called matrix sampling that minimizes the testing burden on individual…
Descriptors: Measurement, Testing, Item Sampling, Computation
Steedle, Jeffrey T. – Applied Measurement in Education, 2014
Possible lack of motivation is a perpetual concern when tests have no stakes attached to performance. Specifically, the validity of test score interpretations may be compromised when examinees are unmotivated to exert their best efforts. Motivation filtering, a procedure that filters out apparently unmotivated examinees, was applied to the…
Descriptors: College Outcomes Assessment, Student Motivation, Sampling, Validity
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Hickendorff, Marian – Applied Measurement in Education, 2013
The results of an exploratory study into measurement of elementary mathematics ability are presented. The focus is on the abilities involved in solving standard computation problems on the one hand and problems presented in a realistic context on the other. The objectives were to assess to what extent these abilities are shared or distinct, and…
Descriptors: Elementary School Mathematics, Mathematics Tests, Mathematics Skills, Problem Solving
Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi – Applied Measurement in Education, 2013
Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…
Descriptors: Alternative Assessment, Science Tests, Standard Setting, Test Validity
Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013
In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…
Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items
Sawyer, Richard – Applied Measurement in Education, 2013
Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…
Descriptors: High Schools, Grade Point Average, College Entrance Examinations, College Admission

