Peer reviewed
ERIC Number: EJ845340
Record Type: Journal
Publication Date: 2006
Pages: 34
Abstractor: As Provided
Reference Count: 51
ISBN: N/A
ISSN: ISSN-0023-8309
Spontaneous Speech Events in Two Speech Databases of Human-Computer and Human-Human Dialogs in Spanish
Rodriguez, Luis J.; Torres, M. Ines
Language and Speech, v49 n3 p333-366 2006
Previous work in English has revealed that disfluencies follow regular patterns and that incorporating them into the language model of a speech recognizer leads to lower perplexities and sometimes to better performance. Although work on disfluency modeling has been applied outside the English community (e.g., in Japanese), as far as we know there is no specific work dealing with disfluencies in Spanish. In this paper, we follow a data-driven approach in exploring the potential benefit of modeling disfluencies in a speech recognizer in Spanish. Two databases of human-computer and human-human dialogs are considered, which allow the absolute and relative frequencies of disfluencies in the two situations to be compared. The rate of disfluencies in human-human dialogs is found to be very close to that found for similar databases in English. Due to setup factors, the rate of disfluencies found in human-computer dialogs was remarkably higher than that reported for similar databases in English. In any case, from the point of view of speech recognition, the high frequencies of disfluencies and the distinct features of the acoustic events related to them support the need for explicit acoustic models. The regularities observed in the distribution of filled pauses and speech repairs reveal that including them in the language model of the speech recognizer may also be helpful. The extent to which the number of events depends on utterance length and on the speaker is also explored. Statistics are shown that follow previous studies for English, and a sizeable space is devoted to comparing our results with them. Finally, various possible cues for the automatic detection of speech repairs--a key issue from the point of view of speech understanding--are explored: silent pauses, filled pauses, lengthenings, cut-off words and discourse markers. As previously observed for English, none of them was found to be reliable by itself. More information, especially at the acoustic-prosodic level, is no doubt needed to reliably detect speech repairs. (Contains 13 tables and 5 figures.)
SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: http://sagepub.com
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A