ERIC Number: ED514293
Record Type: Non-Journal
Publication Date: 2010
Pages: 120
Abstractor: As Provided
ISBN: 978-1-1096-9219-8
Comparability of Examinee Proficiency Scores on Computer Adaptive Tests Using Real and Simulated Data
Evans, Josiah Jeremiah
ProQuest LLC, Ph.D. Dissertation, Rutgers The State University of New Jersey - New Brunswick
In measurement research, data simulations are a commonly used analytical technique. While simulation designs have many benefits, it is unclear whether these artificially generated datasets accurately capture real examinee item response behaviors. This potential lack of comparability may have important implications for the administration of computer adaptive tests (CAT), which display proficiency-targeted items to examinees. To address this problem, this study compared results from real testing data with those from simulated data to determine the extent to which simulated data accurately represent real-world testing data. Specifically, this study matched real examination data from multiple administrations of the Law School Admission Test to create a single large dataset with 534 items and 5,000 synthetic examinees. From this dataset, examinee proficiency estimates and item parameters were obtained and then used to create 100 simulated item response datasets. Both real and simulated data were used in two post hoc testing formats: CAT and linear format examinations. The CAT administrations used the item-level adaptive method; the linear tests were constructed by selecting items using stratified random sampling. In addition to the two data types and two test administration formats, the impact of three test lengths (25, 35, and 50 items) on proficiency estimation was examined. For linear tests, results demonstrated that replication of original proficiency estimates from simulated data was variable, depending on test length, items selected, and examinee proficiency levels. Randomly constructed linear tests with extreme item parameter values resulted in test instability, which yielded less accurate proficiency recovery. For most datasets, CAT format tests yielded improved true proficiency recovery compared with their linear test counterparts.
Generally, the longest tests, the 50-item CATs built from simulated data, yielded the best replication of the original real-data proficiency estimates. CAT format tests performed well with either real or simulated data, whereas linear tests displayed more performance variation than their CAT counterparts. The tails of the distributions showed the greatest variation between data types and conditions. The results of this dissertation support the use of simulated data when the items used to construct the tests reflect non-extreme item parameter values. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone 1-800-521-0600. Web page:]
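The simulation design described in the abstract can be illustrated with a minimal sketch. The abstract does not name the response model, the stratification variable, or the CAT item-selection rule, so everything below is an assumption made for illustration: a three-parameter logistic (3PL) model with scaling constant 1.7, strata formed on item difficulty, and maximum-information item selection. The item pool is randomly generated and hypothetical; it is not Law School Admission Test data, and the function names are not taken from the dissertation.

```python
import math
import random

D = 1.7  # conventional logistic scaling constant (assumption)

def p_correct(theta, a, b, c):
    """Probability of a correct response under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def simulate_responses(thetas, items, rng):
    """Return a 0/1 matrix: one row per examinee, one column per item."""
    return [[1 if rng.random() < p_correct(t, *item) else 0 for item in items]
            for t in thetas]

def stratified_linear_form(items, length, n_strata, rng):
    """Pick item indices for a fixed form by sampling equally from
    difficulty strata (stratifying on b is an assumption).
    Assumes length is divisible by n_strata."""
    order = sorted(range(len(items)), key=lambda i: items[i][1])
    size = len(order) // n_strata
    strata = [order[k * size:(k + 1) * size] for k in range(n_strata - 1)]
    strata.append(order[(n_strata - 1) * size:])  # last stratum takes the remainder
    chosen = []
    for stratum in strata:
        chosen.extend(rng.sample(stratum, length // n_strata))
    return chosen

def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at proficiency theta."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def next_item(theta_hat, items, used):
    """Item-level adaptive step: the unused item that is most informative
    at the current proficiency estimate (selection rule is an assumption)."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta_hat, *items[i]))

rng = random.Random(2010)
# Hypothetical pool of 534 items as (a, b, c) tuples.
pool = [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0), rng.uniform(0.1, 0.25))
        for _ in range(534)]
thetas = [rng.gauss(0.0, 1.0) for _ in range(100)]  # hypothetical examinees
responses = simulate_responses(thetas, pool, rng)
form = stratified_linear_form(pool, length=25, n_strata=5, rng=rng)
first_cat_item = next_item(0.0, pool, used=set())
```

Repeating `simulate_responses` with fresh random draws would produce replicate datasets of the kind the abstract describes; a full CAT would also re-estimate proficiency after each response, which this sketch omits.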
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Higher Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Law School Admission Test