Publication Date
| In 2015 | 2 |
| Since 2014 | 2 |
| Since 2011 (last 5 years) | 6 |
| Since 2006 (last 10 years) | 22 |
| Since 1996 (last 20 years) | 39 |
Descriptor
| Evaluation Methods | 71 |
| Performance Based Assessment | 17 |
| Item Response Theory | 16 |
| Student Evaluation | 16 |
| Educational Assessment | 15 |
| Test Items | 15 |
| Test Construction | 14 |
| Measurement Techniques | 12 |
| Simulation | 11 |
| Scores | 10 |
Source
| Applied Measurement in… | 71 |
Author
| Penfield, Randall D. | 3 |
| Plake, Barbara S. | 3 |
| Byrne, Barbara M. | 2 |
| Finch, Holmes | 2 |
| Hambleton, Ronald K. | 2 |
| Livingston, Samuel A. | 2 |
| Monahan, Patrick | 2 |
| Popham, W. James | 2 |
| Sireci, Stephen G. | 2 |
| Su, Ya-Hui | 2 |
Publication Type
| Journal Articles | 71 |
| Reports - Evaluative | 34 |
| Reports - Research | 29 |
| Information Analyses | 8 |
| Reports - Descriptive | 5 |
| Speeches/Meeting Papers | 2 |
| Guides - Non-Classroom | 1 |
| Opinion Papers | 1 |
| Reports - General | 1 |
Education Level
| Elementary Secondary Education | 3 |
| High Schools | 3 |
| Higher Education | 3 |
| Grade 10 | 1 |
| Grade 12 | 1 |
| Middle Schools | 1 |
| Postsecondary Education | 1 |
Audience
| Researchers | 2 |
Showing 1 to 15 of 71 results
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
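As a rough illustration of why an ignored design effect understates sampling error, the sketch below uses Kish's approximation for equal-sized clusters, deff = 1 + (m - 1) * rho. This is an assumption for illustration only: the abstract does not say which design-effect formula the article uses, and the function names, cluster size, and intraclass correlation here are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Kish approximation for equal-sized clusters (an assumed model,
    not taken from the article): deff = 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * icc


def adjusted_standard_error(srs_se: float, deff: float) -> float:
    """Inflate a simple-random-sampling standard error by sqrt(deff)."""
    return srs_se * math.sqrt(deff)


def effective_sample_size(n: float, deff: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n / deff


if __name__ == "__main__":
    # Hypothetical values: 26 students per school, intraclass correlation 0.2.
    deff = design_effect(cluster_size=26, icc=0.2)
    print("design effect:", deff)
    print("adjusted SE:", adjusted_standard_error(srs_se=1.5, deff=deff))
    print("effective n:", effective_sample_size(n=5200, deff=deff))
```

Under these assumed values the naive standard error would be too small by a factor of sqrt(deff), which is the kind of underestimation the abstract describes.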
McClintock, Joseph Clair – Applied Measurement in Education, 2015
Erasure analysis is the study of the pattern or quantity of erasures on multiple-choice paper-and-pencil examinations, to determine whether erasures were made post-testing for the purpose of unfairly increasing students' scores. This study examined the erasure data from over 1.4 million exams, taken by more than 600,000 students. Three…
Descriptors: Multiple Choice Tests, Cheating, Methods, Computation
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
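The synthetic function described in this abstract can be sketched as a weighted average of the identity and an equating function. In this minimal sketch the weight is placed on the identity, which is an assumption (the abstract does not say which component carries the weight), and `linear_equate` with its coefficients is a hypothetical stand-in for a traditional equating method.

```python
from typing import Callable


def synthetic_function(
    equate: Callable[[float], float], weight: float
) -> Callable[[float], float]:
    """Return x -> weight * x + (1 - weight) * equate(x).

    `weight` is the share given to the identity function (an assumption);
    weight=1 reproduces the identity, weight=0 the equating function.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def linked(x: float) -> float:
        return weight * x + (1.0 - weight) * equate(x)

    return linked


def linear_equate(x: float) -> float:
    """Hypothetical linear equating function, for illustration only."""
    return 1.1 * x - 2.0


if __name__ == "__main__":
    syn = synthetic_function(linear_equate, weight=0.5)
    for raw_score in (10.0, 20.0, 30.0):
        print(raw_score, syn(raw_score))
```

With a weight near 1 the result stays close to the raw score, and with a weight near 0 it approaches the equated score.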
Sinha, Ruchi; Oswald, Frederick; Imus, Anna; Schmitt, Neal – Applied Measurement in Education, 2011
The current study examines how using a multidimensional battery of predictors (high-school grade point average (GPA), SAT/ACT, and biodata), and weighting the predictors based on the different values institutions place on various student performance dimensions (college GPA, organizational citizenship behaviors (OCBs), and behaviorally anchored…
Descriptors: Grade Point Average, Interrater Reliability, Rating Scales, College Admission
Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C. – Applied Measurement in Education, 2011
This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…
Descriptors: Knowledge Level, Construct Validity, Validity, Scaffolding (Teaching Technique)
Swerdzewski, Peter J.; Harmes, J. Christine; Finney, Sara J. – Applied Measurement in Education, 2011
Many universities rely on data gathered from tests that are low stakes for examinees but high stakes for the various programs being assessed. Given the lack of consequences associated with many collegiate assessments, the construct-irrelevant variance introduced by unmotivated students is potentially a serious threat to the validity of the…
Descriptors: Computer Assisted Testing, Student Motivation, Inferences, Universities
Livingston, Samuel A.; Antal, Judit – Applied Measurement in Education, 2010
A simultaneous equating of four new test forms to each other and to one previous form was accomplished through a complex design incorporating seven separate equating links. Each new form was linked to the reference form by four different paths, and each path produced a different score conversion. The procedure used to resolve these inconsistencies…
Descriptors: Measurement Techniques, Measurement, Educational Assessment, Educational Testing
Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010
Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…
Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques
Davis, Susan L.; Buckendahl, Chad W. – Applied Measurement in Education, 2009
In response to a Congressional mandate, an evaluation of the National Assessment of Educational Progress (NAEP) was undertaken beginning in 2004. The evaluation design included a series of studies that encompassed the breadth and selected areas of depth of the NAEP program. Studies were identified with input from key stakeholders and were…
Descriptors: National Competency Tests, Evaluation Methods, Evaluation Criteria, Test Results
Sireci, Stephen G.; Hauger, Jeffrey B.; Wells, Craig S.; Shea, Christine; Zenisky, April L. – Applied Measurement in Education, 2009
The National Assessment Governing Board used a new method to set achievement level standards on the 2005 Grade 12 NAEP Math test. In this article, we summarize our independent evaluation of the process used to set these standards. The evaluation data included observations of the standard-setting meeting, observations of advisory committee meetings…
Descriptors: Advisory Committees, Mathematics Tests, Standard Setting, National Competency Tests
Zhang, Bo; Ohland, Matthew W. – Applied Measurement in Education, 2009
One major challenge in using group projects to assess student learning is accounting for the differences of contribution among group members so that the mark assigned to each individual actually reflects their performance. This research addresses the validity of grading group projects by evaluating different methods that derive individualized…
Descriptors: Monte Carlo Methods, Validity, Student Evaluation, Evaluation Methods
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008
In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…
Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory
Finch, Holmes; Monahan, Patrick – Applied Measurement in Education, 2008
This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA…
Descriptors: Monte Carlo Methods, Factor Analysis, Generalization, Methods
Briggs, Derek C. – Applied Measurement in Education, 2008
This article illustrates the use of an explanatory item response modeling (EIRM) approach in the context of measuring group differences in science achievement. The distinction between item response models and EIRMs, recently elaborated by De Boeck and Wilson (2004), is presented within the statistical framework of generalized linear mixed models.…
Descriptors: Science Achievement, Science Tests, Measurement, Error of Measurement

