Title :
Heterogeneous lexical units for automatic speech recognition: preliminary investigations
Author :
Bazzi, Issam ; Glass, James
Author_Institution :
Laboratory for Computer Science, MIT, Cambridge, MA, USA
Abstract :
This paper explores the use of the phone and syllable as primary units of representation in the first stage of a two-stage recognizer. A finite-state transducer speech recognizer is utilized to configure the recognition as a two-stage process, where either phone or syllable graphs are computed in the first stage, and passed to the second stage to determine the most likely word hypotheses. Preliminary experiments in a weather information speech understanding domain show that a syllable representation with either bigram or trigram language models provides more constraint than a phonetic representation with a higher-order n-gram language model (up to a 6-gram), and approaches the performance of a more conventional single-stage word-based configuration.
Keywords :
graph theory; speech recognition; automatic speech recognition; bigram language model; finite-state transducer speech recognizer; graphs; heterogeneous lexical units; performance; phone; representation; syllable; trigram language model; two-stage process; two-stage recognizer; weather information speech understanding domain; word hypotheses; Automatic speech recognition; Computer science; Glass; Information systems; Laboratories; Natural languages; Speech processing; Speech recognition; Transducers; Vocabulary
Conference_Title :
Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00)
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.861804