ERIC Number: ED345295
Record Type: Non-Journal
Publication Date: 1992-Mar-26
Reference Count: N/A
Neural Network Classifier Architectures for Phoneme Recognition. CRC Technical Note No. CRC-TN-92-001.
A study applied artificial neural networks, trained with the back-propagation learning algorithm, to modelling phonemes extracted from the DARPA TIMIT multi-speaker, continuous speech database. A number of proposed network architectures were applied to the phoneme classification task, ranging from the simple feedforward multilayer network to more complex modular architectures which attempt to assign classifier modules to different regions of the input space. Results showed that, in general, modular architectures that attempt to identify regions of the input space performed no better than a single network trained to handle the whole input space. Two network structures learned, to some degree, to classify the total phoneme space: a single multilayer network, and a novel architecture trained to discriminate among 38 classes. The novel architecture employs a number of time-delay neural network (TDNN) modules to map the input space to a different representation of reduced dimensionality, which a multilayer network then uses to discriminate among the classes. The network using this new representation recognized silence intervals better, and performed almost as well as the simple multilayer network classifier at recognizing tokens from the other classes. Because the two networks have similar abilities to identify phoneme tokens, the choice of network to use depends on other factors. (Seven figures and 12 tables of data are included; 36 references are attached.) (Author/SR)
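The data flow of the novel architecture named in the abstract (several TDNN modules producing a reduced representation that a multilayer network classifies into 38 classes) might be sketched roughly as follows. This is only an illustration of the forward pass, not the report's implementation: the abstract gives no layer sizes, so the feature count, delay window, number of modules, module output size, and hidden-layer size are all hypothetical, the weights are random rather than trained with back-propagation, and averaging each module's output over time is an assumed way of obtaining a fixed-size vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, biases):
    """Fully connected sigmoid layer: one output per weight row."""
    return [sigmoid(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def time_delay_layer(frames, weights, biases, delay):
    """Apply the same weights to every window of `delay` consecutive frames."""
    outputs = []
    for t in range(len(frames) - delay + 1):
        window = [x for frame in frames[t:t + delay] for x in frame]
        outputs.append(dense(window, weights, biases))
    return outputs


def tdnn_module(frames, params):
    """Map a frame sequence to a small vector (assumed: average over time)."""
    hidden = time_delay_layer(frames, params["w"], params["b"], params["delay"])
    n_out = len(params["b"])
    return [sum(h[i] for h in hidden) / len(hidden) for i in range(n_out)]


def random_layer(rng, n_out, n_in):
    """Untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_model(rng, n_features, delay, n_modules, module_out, n_hidden, n_classes):
    modules = []
    for _ in range(n_modules):
        w, b = random_layer(rng, module_out, n_features * delay)
        modules.append({"w": w, "b": b, "delay": delay})
    hidden = random_layer(rng, n_hidden, n_modules * module_out)
    output = random_layer(rng, n_classes, n_hidden)
    return {"modules": modules, "hidden": hidden, "output": output}


def classify(model, frames):
    # Stage 1: every TDNN module sees the same input and emits a few values;
    # concatenating them gives the reduced-dimensionality representation.
    reduced = [x for m in model["modules"] for x in tdnn_module(frames, m)]
    # Stage 2: an ordinary multilayer network scores the classes.
    hidden = dense(reduced, *model["hidden"])
    scores = dense(hidden, *model["output"])
    return reduced, scores


# All sizes below are hypothetical except the 38 output classes.
rng = random.Random(0)
model = build_model(rng, n_features=16, delay=3, n_modules=4,
                    module_out=3, n_hidden=20, n_classes=38)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(15)]
reduced, scores = classify(model, frames)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

With these illustrative sizes, a 15-frame by 16-feature input (240 values) is reduced to 12 values before the final classifier sees it, which is the sense in which the intermediate representation has lower dimensionality than the input.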
Communications Research Centre, 3701 Carling Ave., P.O. Box 11490, Station H, Ottawa, Ontario K2H 8S2, Canada.
Publication Type: Reports - Research
Education Level: N/A
Authoring Institution: Department of Communications, Ottawa (Ontario). Communications Research Centre.