ERIC Number: ED537867
Record Type: Non-Journal
Publication Date: 2011
Pages: 146
Abstractor: As Provided
Reference Count: 0
ISBN: 978-1-2671-4081-4
A CCG-Based Method for Training a Semantic Role Labeler in the Absence of Explicit Syntactic Training Data
Boxwell, Stephen A.
ProQuest LLC, Ph.D. Dissertation, The Ohio State University
Treebanks are a prerequisite for many NLP tasks, including, but not limited to, semantic role labeling (SRL). For many languages, however, treebanks are either nonexistent or too small to be useful. Time-critical applications may require rapid deployment of natural language software for a new critical language--much faster than the development time of a traditional treebank. This dissertation describes a method for generating a treebank and training syntactic and semantic models using only semantic training information--that is, no human-annotated syntactic training data whatsoever. This will greatly increase the speed of development of natural language tools for new critical languages in exchange for a modest drop in overall accuracy. Using Combinatory Categorial Grammar (CCG) in concert with Propbank semantic role annotations and a partially hidden Markov model allows us to accurately predict lexical categories. By training the Berkeley parser on our generated syntactic data, we can achieve SRL performance of 65.5% without using a treebank, as opposed to 74% using the same feature set with gold-standard data. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A