Publication Date
| In 2015 | 1 |
| Since 2014 | 4 |
| Since 2011 (last 5 years) | 9 |
| Since 2006 (last 10 years) | 16 |
| Since 1996 (last 20 years) | 23 |
Descriptor
| Equated Scores | 40 |
| Item Response Theory | 18 |
| Sampling | 10 |
| Statistical Analysis | 10 |
| Evaluation Methods | 7 |
| Test Items | 7 |
| Simulation | 6 |
| Test Construction | 6 |
| Comparative Analysis | 5 |
| Difficulty Level | 5 |
| More ▼ | |
Source
| Applied Measurement in… | 40 |
Author
| Kim, Sooyeon | 4 |
| Dorans, Neil J. | 3 |
| Hambleton, Ronald K. | 3 |
| Kolen, Michael J. | 3 |
| Linn, Robert L. | 3 |
| Livingston, Samuel A. | 3 |
| Antal, Judit | 2 |
| Brennan, Robert L. | 2 |
| Hanson, Bradley A. | 2 |
| Lawrence, Ida M. | 2 |
| More ▼ | |
Publication Type
| Journal Articles | 40 |
| Reports - Evaluative | 21 |
| Reports - Research | 17 |
| Speeches/Meeting Papers | 3 |
| Information Analyses | 2 |
| Reports - Descriptive | 2 |
Education Level
| Secondary Education | 3 |
| Grade 8 | 2 |
| High Schools | 2 |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Elementary Secondary Education | 1 |
| Grade 10 | 1 |
| Grade 11 | 1 |
| Grade 4 | 1 |
| Grade 7 | 1 |
| More ▼ | |
Audience
Showing 1 to 15 of 40 results
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Wells, Craig S.; Hambleton, Ronald K.; Kirkpatrick, Robert; Meng, Yu – Applied Measurement in Education, 2014
The purpose of the present study was to develop and evaluate two procedures flagging consequential item parameter drift (IPD) in an operational testing program. The first procedure was based on flagging items that exhibit a meaningful magnitude of IPD using a critical value that was defined to represent barely tolerable IPD. The second procedure…
Descriptors: Test Items, Test Bias, Equated Scores, Item Response Theory
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Wyse, Adam E.; Dean, Vincent J.; Viger, Steven G.; Vansickle, Timothy R. – Applied Measurement in Education, 2013
The development of alternate assessments for students with disabilities plays a pivotal role in state and national accountability systems. An important assumption in the use of alternate assessments in these accountability systems is that scores are comparable on different test forms across diverse groups of students over time. The use of test…
Descriptors: Equated Scores, Alternative Assessment, Disabilities, Case Studies
Kim, Sooyeon; Walker, Michael – Applied Measurement in Education, 2012
This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b)…
Descriptors: Multiple Choice Tests, Test Format, Test Items, Equated Scores
Kim, Sooyeon; Walker, Michael E. – Applied Measurement in Education, 2012
This study investigated the impact of repeat takers of a licensure test on the equating functions in the context of a nonequivalent groups with anchor test (NEAT) design. Examinees who had taken a new, to-be-equated form of the test were divided into three subgroups according to their previous testing experience: (a) repeaters who previously took…
Descriptors: Equated Scores, Licensing Examinations (Professions), Repetition, Regression (Statistics)
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
Kim, Sooyeon; Livingston, Samuel A.; Lewis, Charles – Applied Measurement in Education, 2011
This article describes a preliminary investigation of an empirical Bayes (EB) procedure for using collateral information to improve equating of scores on test forms taken by small numbers of examinees. Resampling studies were done on two different forms of the same test. In each study, EB and non-EB versions of two equating methods--chained linear…
Descriptors: Sample Size, Equated Scores, Bayesian Statistics, Accuracy
Kim, HeeKyoung; Kolen, Michael J. – Applied Measurement in Education, 2010
Test equating might be affected by including in the equating analyses examinees who have taken the test previously. This study evaluated the effect of including such repeaters on Medical College Admission Test (MCAT) equating using a population invariance approach. Three-parameter logistic (3-PL) item response theory (IRT) true score and…
Descriptors: Repetition, Equated Scores, College Entrance Examinations, Medical Schools
Brennan, Robert L. – Applied Measurement in Education, 2010
This paper provides an overview of evidence-centered assessment design (ECD) and some general information about of the Advanced Placement (AP[R]) Program. Then the papers in this special issue are discussed, as they relate to the use of ECD in the revision of various AP tests. This paper concludes with some observations about the need to validate…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction
Livingston, Samuel A.; Antal, Judit – Applied Measurement in Education, 2010
A simultaneous equating of four new test forms to each other and to one previous form was accomplished through a complex design incorporating seven separate equating links. Each new form was linked to the reference form by four different paths, and each path produced a different score conversion. The procedure used to resolve these inconsistencies…
Descriptors: Measurement Techniques, Measurement, Educational Assessment, Educational Testing
Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010
Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…
Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010
Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…
Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis
Wells, Craig S.; Baldwin, Su; Hambleton, Ronald K.; Sireci, Stephen G.; Karatonis, Ana; Jirka, Stephen – Applied Measurement in Education, 2009
Score equity assessment is an important analysis to ensure inferences drawn from test scores are comparable across subgroups of examinees. The purpose of the present evaluation was to assess the extent to which the Grade 8 NAEP Math and Reading assessments for 2005 were equivalent across selected states. More specifically, the present study…
Descriptors: National Competency Tests, Test Bias, Equated Scores, Grade 8

Peer reviewed
Direct link
