ERIC Number: ED546564
Record Type: Non-Journal
Publication Date: 2012
Pages: 121
Abstractor: As Provided
Reference Count: N/A
ISBN: 978-1-2676-0221-3
Learning Pronunciations from Unlabeled Evidence
Reddy, Sravana
ProQuest LLC, Ph.D. Dissertation, University of Chicago
The pronunciation of a word represented in an alphabetic writing system (such as this one) is relatively transparent--but a language's sounds change over time and vary across space, while its spellings remain relatively static, resulting in some amount of divergence between the written and spoken forms. The introduction of loanwords and proper names from other languages with different phonologies and scripts further complicates the relationship between orthography and pronunciation. However, there are sources of information about the sound of a word in addition to spelling. This dissertation presents methods for learning pronunciations using three forms of evidence: speech, word origins, and rhymes. Speech is the most natural source of evidence: a word's pronunciation is usually clarified upon hearing it in a spoken utterance. In the case of proper names, knowing the linguistic or ethnic origin of the name is often instrumental in determining how it should be pronounced. Rhymes in poems or songs also provide a cue to pronunciation, particularly for inferring the sound of a word at an earlier point in history. Using extra-orthographic evidence in a computational model necessitates access to data that provides this information in sufficiently large quantities. Collecting annotated data--speech labeled with the words that it contains, names with their ethnic origins, or poetry with rhyming patterns--can be extremely difficult and expensive. On the other hand, raw data--speech recordings, lists of names, archives of poetry--is plentiful. The focus of this dissertation is, therefore, on using "unlabeled" data for pronunciation learning. There are two facets of "pronunciation learning" in this work. One is that of converting the written form of a word to its phonemic representation, known as grapheme-to-phoneme conversion. The other is a less fine-grained objective: rather than learning the exact phonemic forms, we aim to discover clusters of words with similar pronunciations. The first contribution of this dissertation is a model to incorporate untranscribed spoken data, in addition to an existing lexicon of words and their pronunciations, into a grapheme-to-phoneme learner. This approach is found to improve over models that are trained only on a lexicon. The second contribution is an algorithm that models the latent linguistic origins of names as part of grapheme-to-phoneme conversion. Not only does this method do better than a model without word origin awareness, it also outperforms existing approaches that first classify orthographic words by origin in a supervised manner, and then train language-specific pronunciation models. The final contribution is an unsupervised learner that discovers the rhyme schemes of stanzas of poetry, as well as rhyming relations between words and clusters of words with a common rhyming sound.
[The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600.]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A