ERIC Number: ED513853
Record Type: Non-Journal
Publication Date: 2009
Pages: 163
Abstractor: As Provided
Reference Count: 0
ISBN: 978-1-1096-2239-3
Issues in Estimating Program Effects and Studying Implementation in Large-Scale Educational Experiments: The Case of a Connected Classroom Technology Program
Shin, Hye Sook
ProQuest LLC, Ph.D. Dissertation, University of California, Los Angeles
Using data from a nationwide, large-scale experimental study of the effects of a connected classroom technology on student learning in algebra (Owens et al., 2004), this dissertation focuses on challenges that can arise in estimating treatment effects in educational field experiments when samples are highly heterogeneous in terms of various factors related to outcomes of interest (e.g., class mean prior achievement, class grade level), and on sensible ways of proceeding in estimating treatment effects in such situations. In addition, I focus on problems that arise in measuring implementation, and how heterogeneous samples can complicate interpretations of results from analyses involving implementation data. In general, estimates of overall treatment effects can mask important interactions that may be at work, and this is especially so when samples are highly heterogeneous. In addition, if classes are not stratified with respect to key class and school characteristics prior to random assignment (e.g., grade-level, type of geographic locale), imbalances can arise that can bias estimates of treatment effects. Furthermore, analyses based on a highly heterogeneous sample can potentially yield results that are artifacts of working with a sample that consists of very disparate subsamples. To help overcome these challenges, I employed a strategy that entails estimating treatment effects for various subgroups of classes defined by grade and by certain grade/geographic location combinations. The results help highlight how estimates based on highly heterogeneous samples can be misleading. Based on the whole sample of 80 classes, which consists of 8th, 9th and 10th grade classes, I obtained a treatment effect estimate of 1.32 (SE=0.64), which was statistically significant, though not necessarily of practical significance given the standard deviation of posttest scores was a little over 7 points. 
Focusing on the sample of 9th grade classes, I obtained a treatment effect estimate of only 0.47 points (SE=1.51), which was clearly not statistically significant. It is important to note that the sample of 9th grade classes, like the entire sample of classes, is highly heterogeneous in terms of class-mean pretest scores, location and various school-level demographic characteristics. I then focused on the subsample of classes in suburban areas and small towns. These classes tend to be more advantaged in terms of SES and pretest scores compared with the other 9th grade classes, and also more stable (i.e., classes containing higher proportions of students with both pretest and posttest scores). The estimate of the treatment effect for this subgroup was fairly substantial (3.44 points, SE=1.47) and was statistically significant. I then attempted to estimate the effect of TI for 8th grade classes. In doing this I was constrained by the fact that there were 11 treatment and 4 control group 8th grade classes. As such I formed 4 matched pairs of treatment and control classes, and in an HLM analysis obtained an overall treatment effect estimate of 4.74 (SE=1.64), which is both statistically significant and of practical significance. Note that the analysis also showed that the effect of TI varied substantially across pairs. With respect to implementation, Shin et al. (2008) used data from the Owens et al. study to examine how differences in implementation of TI Navigator relate to differences in algebra posttest scores. However, Shin et al. did not use available data on which specific component(s) of TI each treatment teacher used. In my dissertation I employed data on component use as well as the measures in Shin et al.
The results showed that class-mean pretest scores and certain grade/location categories were strongly related to whether or not certain components were used, making it impossible to disentangle the effects of component use from the effects of key classroom compositional characteristics on outcomes. This points to the need for more fine-grained measures that capture the quality of component use and how they are integrated with other instructional activities, as well as larger numbers of teachers in subgroups of interest. Implications based on the analyses presented in this dissertation for the design of educational experiments and measurement of implementation are discussed in the final chapter, and include the need to target key subgroups of interest in the planning phase of a study, recruit sufficient numbers of teachers and classes within each subgroup, and randomly assign teachers within each subgroup to treatment or control conditions. Furthermore, detailed, in-depth data on program use need to be collected through class observation and teacher logs. In addition, rather than having a sample of teachers located in many different states and districts, it is more sensible to focus on a relatively small number of well-chosen sites (i.e., districts). (Abstract shortened by UMI.) [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page:]
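The significance statements in the abstract can be sanity-checked from the reported estimates and standard errors alone. The sketch below is an illustration, not the dissertation's analysis: it assumes a simple two-sided test using the normal approximation, whereas the original HLM analyses presumably used model-based tests with limited degrees of freedom. The names `ESTIMATES`, `z_ratio`, and `two_sided_p_normal` are hypothetical helpers introduced here.

```python
from math import erf, sqrt

# (estimate, standard error) pairs quoted in the abstract.
ESTIMATES = {
    "all 80 classes": (1.32, 0.64),
    "9th grade classes": (0.47, 1.51),
    "9th grade, suburban/small town": (3.44, 1.47),
    "8th grade matched pairs": (4.74, 1.64),
}


def z_ratio(estimate, se):
    """Ratio of an estimate to its standard error."""
    return estimate / se


def two_sided_p_normal(z):
    """Two-sided p-value under a standard normal reference distribution.

    Assumption: the normal approximation stands in for whatever
    t-based test the original analysis used.
    """
    phi = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))  # standard normal CDF at |z|
    return 2.0 * (1.0 - phi)


if __name__ == "__main__":
    for label, (est, se) in ESTIMATES.items():
        z = z_ratio(est, se)
        print(f"{label}: z = {z:.2f}, approx. p = {two_sided_p_normal(z):.3f}")

    # Rough standardized effect for the whole sample, taking the posttest
    # SD as 7 points (the abstract says "a little over 7").
    print(f"whole-sample effect in SD units: about {1.32 / 7.0:.2f}")
```

Under these assumptions the ratios are roughly 2.06, 0.31, 2.34, and 2.89, consistent with the abstract's description of which estimates were statistically significant. The whole-sample effect of about 0.19 posttest standard deviations, or slightly less since the SD exceeds 7, fits the author's caution about its practical significance. With only 4 matched pairs in the 8th grade analysis, the normal approximation is especially rough for that estimate.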
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Grade 8; Grade 9
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A