Title :
Neural networks for text-to-speech phoneme recognition
Author :
Embrechts, Mark J. ; Arciniegas, Fabio
Author_Institution :
Dept. of Decision Sci. & Eng. Syst., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
Presents two different artificial neural network (ANN) approaches to phoneme recognition for text-to-speech applications: staged backpropagation neural networks and self-organizing maps. Several current commercial systems rely on an exhaustive dictionary for text-to-phoneme conversion. Applying neural networks to phoneme mapping for text-to-speech conversion yields a fast, distributed recognition engine. This engine not only maps words that are missing from the database but can also mitigate contradictions arising from different pronunciations of the same word. The ANNs presented in this work were trained on the 2,000 most common words in American English. Performance metrics for the 5,000, 7,000 and 10,000 most common words in English were also estimated to test the robustness of these neural networks.
Keywords :
backpropagation; dictionaries; feedforward neural nets; pattern recognition; performance index; self-organising feature maps; speech synthesis; text analysis; American English; distributed recognition engine; exhaustive dictionary approach; missing words; performance metrics; phoneme mapping; pronunciation contradictions; robustness; self-organizing maps; staged backpropagation neural networks; text-to-speech phoneme recognition; Artificial neural networks; Backpropagation; Databases; Dictionaries; Engines; Measurement; Neural networks; Self organizing feature maps; Speech synthesis; Testing
Conference_Title :
Systems, Man, and Cybernetics, 2000 IEEE International Conference on
Conference_Location :
Nashville, TN
Print_ISBN :
0-7803-6583-6
DOI :
10.1109/ICSMC.2000.886565