Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks.

Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya

Peer reviewed

ERIC Number: ED572207

Record Type: Non-Journal

Publication Date: 2016-Aug

Pages: 7

Abstractor: As Provided

ISBN: N/A

ISSN: N/A

EISSN: N/A

Research-publishing.net, Paper presented at the EUROCALL 2016 Conference (23rd, Limassol, Cyprus, Aug 24-27, 2016)

This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in the perception of TED talks. ASR-generated transcripts of videos often contain recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root causes of the ASR errors and compare them with L2 listeners' transcription mistakes. Our analysis of the ASR errors revealed several categories, such as minimal pairs, homophones, negative cases, and boundary misrecognition, which are assumed to denote the challenging nature of the respective speech segments for L2 listeners. To confirm the usefulness of these categories, we asked L2 learners to watch and transcribe a short segment of TED videos, including the above-mentioned categories of errors. Results revealed that learners' transcription mistakes substantially increase when they transcribe segments of the audio in which ASR made errors. This finding confirmed the potential of using ASR errors as a predictor of L2 learners' difficulties in listening to a particular audio. Furthermore, this study provided us with valuable data to enrich the Partial and Synchronized Caption (PSC) system we proposed earlier to facilitate and promote L2 listening skills. [For the complete volume of short papers, see ED572005.]

Descriptors: Automation, Computer Software, Listening Skills, Error Patterns, Video Technology, Second Language Learning, Comprehension, Foreign Countries, English (Second Language), College Students

Research-publishing.net. La Grange des Noyes, 25110 Voillans, France. e-mail: info@research-publishing.net; Web site: http://research-publishing.net

Publication Type: Speeches/Meeting Papers; Reports - Research

Education Level: Higher Education; Postsecondary Education

Audience: N/A

Language: English

Sponsor: N/A

Authoring Institution: N/A

Identifiers - Location: Japan; China

Grant or Contract Numbers: N/A
