ERIC Number: ED541947
Record Type: Non-Journal
Publication Date: 2012
Pages: 96
Abstractor: As Provided
Reference Count: 0
ISBN: 978-1-2672-8278-1
Automatic Domain Adaptation of Word Sense Disambiguation Based on Sublanguage Semantic Schemata Applied to Clinical Narrative
Patterson, Olga
ProQuest LLC, Ph.D. Dissertation, The University of Utah
Domain adaptation of natural language processing systems is challenging because it requires human expertise. While manual effort is effective in creating a high-quality knowledge base, it is expensive and time-consuming. Clinical text adds another layer of complexity to the task due to privacy and confidentiality restrictions that hinder the ability to share training corpora among different research groups. Semantic ambiguity is a major barrier to effective and accurate concept recognition by natural language processing systems. In my research, I propose an automated domain adaptation method that utilizes a sublanguage semantic schema for all-word word sense disambiguation of clinical narrative. According to the sublanguage theory developed by Zellig Harris, domain-specific language is characterized by a relatively small set of semantic classes that combine into a small number of sentence types. Previous research relied on manual analysis to create language models that could be used for more effective natural language processing. Building on previous semantic type disambiguation research, I propose a method of resolving semantic ambiguity utilizing automatically acquired semantic type disambiguation rules applied to clinical text ambiguously mapped to a standard set of concepts. This research aims to provide an automatic method to acquire a Sublanguage Semantic Schema (S3) and apply this model to disambiguate terms that map to more than one concept with different semantic types. The research is conducted using unmodified MetaMap version 2009, a concept recognition system provided by the National Library of Medicine, applied to a large set of clinical text. The project includes creating and comparing models based on unambiguous concept mappings found in seventeen clinical note types. The effectiveness of the final application was validated through a manual review of a subset of processed clinical notes using recall, precision, and F-score metrics.
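For readers unfamiliar with the evaluation metrics named at the end of the abstract, the following is a minimal sketch of how recall, precision, and F-score are conventionally computed when system output is compared against a manual review. It is not taken from the dissertation: the function name precision_recall_f1, the (note, term, semantic type) tuple layout, and the sample decisions are all hypothetical and chosen only for illustration, and the balanced F1 form of the F-score is assumed.

```python
def precision_recall_f1(predicted, gold):
    """Compare system decisions with manually reviewed (gold) decisions.

    Both arguments are iterables of hashable items. Here each item is a
    hypothetical (note_id, term, semantic_type) tuple; this layout is an
    assumption for illustration, not the dissertation's data format.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: share of system decisions that the reviewers agreed with.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of reviewer decisions that the system reproduced.
    recall = true_positives / len(gold) if gold else 0.0
    # F-score (balanced F1): harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example data: four reviewer decisions, three system decisions.
gold = [
    ("note1", "cold", "dsyn"),
    ("note1", "discharge", "hlca"),
    ("note2", "culture", "lbpr"),
    ("note2", "bp", "clna"),
]
predicted = [
    ("note1", "cold", "dsyn"),
    ("note1", "discharge", "hlca"),
    ("note2", "culture", "idcn"),  # disagrees with the reviewers
]

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this made-up example, two of the three system decisions match the review and two of the four reviewed decisions are recovered, so precision is 2/3, recall is 1/2, and the F-score is 4/7.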
[The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A