Peer reviewed
ERIC Number: ED524806
Record Type: Non-Journal
Publication Date: 2011-Oct
Pages: 114
Abstractor: ERIC
Reference Count: 50
Estimating the Impacts of Educational Interventions Using State Tests or Study-Administered Tests. NCEE 2012-4016
Olsen, Robert B.; Unlu, Fatih; Price, Cristofer; Jaciw, Andrew P.
National Center for Education Evaluation and Regional Assistance
This report examines the differences in impact estimates and standard errors that arise when these are derived using state achievement tests only (as pre-tests and post-tests), study-administered tests only, or some combination of state- and study-administered tests. State tests may yield different evaluation results than a test selected and administered by the research team for several reasons. For instance, (1) because state tests vary in content and emphasis, they also can vary in their coverage of the types of knowledge and skills targeted by any given intervention, whereas a study-administered test can be selected to correspond to the intervention being evaluated. In addition to differences in alignment with treatment, state tests may yield divergent evaluation results due to differences in (2) the stakes associated with the test, (3) missing data, (4) the timing of the tests, (5) reliability or measurement error, and (6) alignment between pre-test and post-test. The report discusses how these six factors may differ between state- and study-administered tests and thereby influence the findings from an impact evaluation. Specifically, the authors use data from three single-state, small-scale evaluations of reading interventions that collected outcome data using both study-administered and state achievement tests to examine these issues. They found that (1) impact estimates based on study-administered tests had smaller standard errors than impact estimates based on state tests, (2) impact estimates from models with "mismatched" pre-tests (e.g., a state pre-test used in combination with a study-administered post-test) had larger standard errors than impact estimates from models with matched pre-tests, and (3) impact estimates from models that included a second pre-test covariate had smaller standard errors than impact estimates from models that included a single pre-test covariate.
Study authors caution that their results may not generalize to evaluations conducted in other states, with different study-administered tests, or with other student samples. Appended are: (1) Description of the Three Experiments; (2) Scatter Plots of Student Test Scores; (3) Quartiles of the Test Score Distribution; (4) Estimates from Other Evaluations; (5) Estimates from the Full Sample; (6) Hypothesis Tests and Minimum Detectable Differences; (7) Conceptual Approach to Generating Correlated Residuals for the Parametric Bootstrap; (8) Results from Bootstrapping and Hypothesis Testing; (9) Differences in Sample Size Requirements; (10) Correlations between Scores on State and Study-Administered Tests; and (11) Estimates of Key Statistical Power Parameters. (Contains 37 tables, 3 figures, and 45 footnotes.)
National Center for Education Evaluation and Regional Assistance. Available from: ED Pubs. P.O. Box 1398, Jessup, MD 20794-1398. Tel: 877-433-7827; Web site:
Publication Type: Numerical/Quantitative Data; Reports - Research
Education Level: Elementary Secondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: National Center for Education Evaluation and Regional Assistance (ED)
Identifiers - Location: Arizona; California; Missouri
IES Funded: Yes