ERIC Number: EJ1236566
Record Type: Journal
Publication Date: 2019
Pages: 22
Abstractor: As Provided
ISBN: N/A
ISSN: 1934-5747
EISSN: N/A
Gather-Narrow-Extract: A Framework for Studying Local Policy Variation Using Web-Scraping and Natural Language Processing
Journal of Research on Educational Effectiveness, v12 n4 p685-706 2019
Education researchers have traditionally faced severe data limitations in studying local policy variation; administrative data sets capture only a fraction of districts' policy decisions, and it can be expensive to collect more nuanced implementation data from teachers and leaders. Natural language processing and web-scraping techniques can help address these challenges by assisting researchers in locating and processing policy documents available online. School district policies and practices are commonly documented in student and staff manuals, school improvement plans, and meeting minutes that are posted for the public. This article introduces an end-to-end framework for collecting these sorts of policy documents and extracting structured policy data: The researcher gathers all potentially relevant documents from district websites, narrows the text corpus to spans of interest using a text classifier, and then extracts specific policy data using additional natural language processing techniques. Through this framework, a researcher can describe variation in policy implementation at the local level, aggregated across state- or nationwide populations even as policies evolve over time.
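Illustrative sketch (not taken from the article): the three stages named in the abstract, gather, narrow, and extract, can be pictured as a small pipeline. Everything in the example below is a hypothetical stand-in chosen for illustration: the in-memory PAGES dictionary replaces a real web scraper, the keyword-count filter in narrow() replaces the trained text classifier the abstract mentions, and the single regular expression in extract() replaces the article's "additional natural language processing techniques". The district names, example.org URLs, and the absence-limit policy are invented sample data.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    """One paragraph of text plus where it came from."""
    district: str
    url: str
    text: str


def gather(pages):
    """Gather: turn every collected page into paragraph-level spans.

    `pages` maps (district, url) to raw text. A real study would fill it
    with a crawler; here it is a hypothetical in-memory stand-in.
    """
    spans = []
    for (district, url), text in pages.items():
        for para in re.split(r"\n\s*\n", text):
            para = " ".join(para.split())  # collapse stray whitespace
            if para:
                spans.append(Span(district, url, para))
    return spans


def narrow(spans, keywords, min_hits=2):
    """Narrow: keep spans that mention at least `min_hits` keywords.

    Assumption: this keyword count is a toy substitute for a trained
    text classifier.
    """
    kept = []
    for span in spans:
        lowered = span.text.lower()
        hits = sum(1 for kw in keywords if kw in lowered)
        if hits >= min_hits:
            kept.append(span)
    return kept


# Hypothetical policy of interest: the number of unexcused absences
# that triggers some district response.
ABSENCE_LIMIT = re.compile(r"(\d+)\s+unexcused\s+absences", re.IGNORECASE)


def extract(spans):
    """Extract: pull one structured value out of each relevant span."""
    records = []
    for span in spans:
        match = ABSENCE_LIMIT.search(span.text)
        if match:
            records.append({
                "district": span.district,
                "url": span.url,
                "absence_limit": int(match.group(1)),
            })
    return records


def run_pipeline(pages, keywords):
    """Chain the three stages: gather, then narrow, then extract."""
    return extract(narrow(gather(pages), keywords))


# Invented sample data standing in for scraped district handbooks.
PAGES = {
    ("District A", "https://example.org/a/handbook"): (
        "Welcome to our schools.\n\n"
        "Attendance: a student with 10 unexcused absences is referred "
        "to the truancy officer."
    ),
    ("District B", "https://example.org/b/handbook"): (
        "Lunch menus are posted weekly.\n\n"
        "Attendance policy: after 7 unexcused absences, the school "
        "contacts the family."
    ),
}

KEYWORDS = ["attendance", "unexcused", "absences"]

if __name__ == "__main__":
    for record in run_pipeline(PAGES, KEYWORDS):
        print(record)
```

Keeping the stages as separate functions mirrors the framing in the abstract: the narrowing step could be swapped for a trained classifier, or the extraction step for a richer parser, without touching the other two stages.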
Descriptors: Natural Language Processing, Educational Policy, Web Sites, Decision Making, School Districts, Guides, Educational Improvement, Policy Analysis, Computational Linguistics, Classification, Information Retrieval, Dictionaries, Elementary Secondary Education
Routledge. Available from: Taylor & Francis, Ltd. 530 Walnut Street Suite 850, Philadelphia, PA 19106. Tel: 800-354-1420; Tel: 215-625-8900; Fax: 215-207-0050; Web site: http://www.tandf.co.uk/journals
Publication Type: Journal Articles; Reports - Descriptive
Education Level: Elementary Secondary Education
Audience: N/A
Language: English
Sponsor: Institute of Education Sciences (ED)
Authoring Institution: N/A
IES Funded: Yes
Grant or Contract Numbers: R305B140026