Title :
A hierarchical structure for modeling inter and intra phonetic information for phoneme recognition
Author :
Vasquez, Daniel ; Aradilla, Guillermo ; Gruhn, Rainer ; Minker, Wolfgang
Author_Institution :
Inst. of Inf. Technol., Univ. of Ulm, Ulm, Germany
fDate :
Nov. 13 2009-Dec. 17 2009
Abstract :
In this paper, we present a two-layer hierarchical structure based on neural networks for phoneme recognition. The proposed structure attempts to model only the characteristics within a phoneme, i.e., intra-phonetic information. This differs from other state-of-the-art hierarchical structures, where the first layer typically models the intra-phonetic information while the second layer focuses on modeling the contextual (inter-phonetic) information. An advantage of the proposed model is that it can be combined with another layer that focuses on the inter-phonetic information. We also show that the distinction between intra- and inter-phonetic information makes it possible to extend other state-of-the-art hierarchical approaches. A phoneme accuracy of 77.89% is achieved on the TIMIT database, which compares favorably to the best results reported on this database.
Keywords :
neural nets; speech recognition; TIMIT database; contextual information; interphonetic information modeling; intraphonetic information modeling; neural networks; phoneme recognition; two-layer hierarchical structure; Automatic speech recognition; Cepstral analysis; Context modeling; Databases; Decoding; Frequency; Hidden Markov models; Information technology; Speech recognition; Viterbi algorithm
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373272