ERIC Number: EJ1247317
Record Type: Journal
Publication Date: 2020-Apr
Pages: 24
Abstractor: As Provided
ISBN: N/A
ISSN: 1076-9986
EISSN: N/A
Variable Selection for Causal Effect Estimation: Nonparametric Conditional Independence Testing with Random Forests
Keller, Bryan
Journal of Educational and Behavioral Statistics, v45 n2 p119-142 Apr 2020
Widespread availability of rich educational databases facilitates the use of conditioning strategies to estimate causal effects with nonexperimental data. With dozens, hundreds, or more potential predictors, variable selection can be useful for practical reasons related to communicating results and for statistical reasons related to improving the efficiency of estimators. Background knowledge should take precedence in deciding which variables to retain. However, with many potential predictors, theory may be weak, such that functional form relationships are likely to be unknown. In this article, I propose a nonparametric method for data-driven variable selection based on permutation testing with conditional random forest variable importance. The algorithm automatically handles nonlinear relationships and interactions in its naive implementation. Through a series of Monte Carlo simulation studies and a case study with Early Childhood Longitudinal Study–K data, I find that the method performs well across a variety of scenarios where other methods fail.
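The general idea of a permutation test on forest-based variable importance can be illustrated with a minimal sketch. This is not the article's algorithm: it assumes a small hand-rolled regression forest with ordinary split-gain importance (not conditional importance), builds the null distribution by shuffling the candidate column (which also breaks its link to the other covariates, so it does not test conditional independence), and uses simulated data. All function names, parameters, and the data-generating model are hypothetical.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((v - mean) ** 2 for v in ys)

def grow(rows, ys, depth, gains, rng, min_leaf=5):
    """Greedily grow one regression tree; add each split's SSE reduction to gains[feature]."""
    if depth == 0 or len(ys) < 2 * min_leaf:
        return
    n_features = len(rows[0])
    parent = sse(ys)
    best = None
    # Random feature subset at each node, as in a random forest.
    for j in rng.sample(range(n_features), max(1, n_features - 1)):
        values = [r[j] for r in rows]
        # Try a handful of observed values as candidate thresholds.
        for t in rng.sample(values, min(8, len(values))):
            left = [v for r, v in zip(rows, ys) if r[j] <= t]
            right = [v for r, v in zip(rows, ys) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, j, t)
    if best is None:
        return
    gain, j, t = best
    gains[j] += gain
    for keep_left in (True, False):
        side = [(r, v) for r, v in zip(rows, ys) if (r[j] <= t) == keep_left]
        grow([r for r, _ in side], [v for _, v in side], depth - 1, gains, rng, min_leaf)

def forest_importance(X, y, n_trees=10, depth=3, seed=0):
    """Average per-tree split gain for each feature over bootstrap samples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    gains = [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        grow([X[i] for i in idx], [y[i] for i in idx], depth, gains, rng)
    return [g / n_trees for g in gains]

def permutation_p_value(X, y, j, n_perm=40, seed=0):
    """Compare feature j's importance with its importance after shuffling column j."""
    rng = random.Random(seed)
    observed = forest_importance(X, y, seed=seed)[j]
    exceed = 0
    for b in range(n_perm):
        col = [r[j] for r in X]
        rng.shuffle(col)
        X_perm = [r[:j] + [c] + r[j + 1:] for r, c in zip(X, col)]
        if forest_importance(X_perm, y, seed=seed + 1 + b)[j] >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)  # add-one correction keeps p > 0

def make_data(n=120, seed=1):
    """Simulated data (assumption): y depends nonlinearly on x0; x1 and x2 are noise."""
    rng = random.Random(seed)
    X = [[rng.uniform(-2, 2) for _ in range(3)] for _ in range(n)]
    y = [3 * r[0] ** 2 + rng.gauss(0, 0.5) for r in X]
    return X, y

X, y = make_data()
p_signal = permutation_p_value(X, y, j=0)
print("permutation p-value for x0:", p_signal)
```

A variable would be retained when its p-value falls below a chosen threshold. Because the forest splits on thresholds rather than fitting a functional form, the quadratic dependence on x0 is picked up without being specified in advance.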
Descriptors: Nonparametric Statistics, Computation, Testing, Causal Models, Statistical Inference, Statistical Analysis, Monte Carlo Methods, Children, Longitudinal Studies, Surveys
SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: http://sagepub.com
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Early Childhood Longitudinal Survey
Grant or Contract Numbers: N/A