DocumentCode :
2307203
Title :
Directory name retrieval over the telephone in the Picasso project
Author :
Neubert, F. ; Gravier, Guillaume ; Yvon, F. ; Chollet, G.
Author_Institution :
École Nationale Supérieure des Télécommunications (ENST), Paris, France
fYear :
1998
fDate :
29-30 Sep 1998
Firstpage :
31
Lastpage :
36
Abstract :
The European project Picasso aims to develop and test several telematics transaction services accessible via the worldwide telephone network. Within this framework, ENST is developing an automatic speech recognition system for pronounced and spelled names in telephone-quality French speech. The recognizer is based on hidden Markov modeling of speech units, using word models for spelled letters and phone models for name pronunciation. Bigram probabilities over phonemes and letters are introduced at this stage to improve decoding quality. The directory was built automatically from the list of names contained in the database, using a grapheme-to-phoneme converter for the names and rules for the spellings; each directory entry consists of several pronunciation and spelling variants. After the acoustic recognition phase, the corresponding directory entry is found by dynamic alignment of symbol sequences, with insertion, deletion and substitution costs determined from the training data to account for acoustic confusability. Because this lexical search is very time-consuming for large directories, we present a faster method that uses pre-selection in a tree-based representation of the lexicon. A rescoring strategy on the 10 best outputs is also evaluated.
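Illustration (not from the paper): the lexical search described in the abstract can be sketched as a weighted edit distance between the recognized symbol sequence and each directory variant. The function names, the cost tables and the default cost of 1.0 are assumptions made for this sketch; the paper estimates its costs from training data and does not give them here. The tree-based pre-selection and the N-best rescoring are not shown.

```python
from typing import Dict, Optional, Sequence, Tuple

DEFAULT_COST = 1.0  # assumed fallback cost, not a value from the paper


def alignment_cost(
    hyp: Sequence[str],
    ref: Sequence[str],
    sub_cost: Optional[Dict[Tuple[str, str], float]] = None,
    ins_cost: Optional[Dict[str, float]] = None,
    del_cost: Optional[Dict[str, float]] = None,
) -> float:
    """Minimum cost of aligning a recognized sequence `hyp` with a
    directory variant `ref`, using per-symbol edit costs."""
    sub_cost = sub_cost or {}
    ins_cost = ins_cost or {}
    del_cost = del_cost or {}
    n, m = len(hyp), len(ref)
    # d[i][j] = cheapest alignment of hyp[:i] with ref[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):  # extra recognized symbols are insertions
        d[i][0] = d[i - 1][0] + ins_cost.get(hyp[i - 1], DEFAULT_COST)
    for j in range(1, m + 1):  # missing reference symbols are deletions
        d[0][j] = d[0][j - 1] + del_cost.get(ref[j - 1], DEFAULT_COST)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            h, r = hyp[i - 1], ref[j - 1]
            # Substitution cost is keyed by (reference symbol, recognized symbol).
            sub = 0.0 if h == r else sub_cost.get((r, h), DEFAULT_COST)
            d[i][j] = min(
                d[i - 1][j] + ins_cost.get(h, DEFAULT_COST),
                d[i][j - 1] + del_cost.get(r, DEFAULT_COST),
                d[i - 1][j - 1] + sub,
            )
    return d[n][m]


def best_entry(
    hyp: Sequence[str],
    directory: Dict[str, Sequence[Sequence[str]]],
    **costs,
) -> Tuple[Optional[str], float]:
    """Exhaustive search: return the entry whose cheapest variant aligns best."""
    best_name, best_cost = None, float("inf")
    for name, variants in directory.items():
        for variant in variants:
            cost = alignment_cost(hyp, variant, **costs)
            if cost < best_cost:
                best_name, best_cost = name, cost
    return best_name, best_cost


# Toy spelled-name directory; the low M/N substitution cost is a made-up
# stand-in for an acoustically confusable letter pair.
directory = {"MARTIN": ["MARTIN"], "BARTIN": ["BARTIN"]}
result = best_entry("NARTIN", directory, sub_cost={("M", "N"): 0.2})
```

This exhaustive loop scores every variant of every entry, which is the cost that motivates a faster pre-selection step for large directories.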
Keywords :
acoustic signal processing; automatic telephone systems; decoding; grammars; hidden Markov models; probability; speech intelligibility; speech recognition; telephone networks; ENST; European project; French; HMM; Hidden Markov modeling; Picasso project; acoustic confusability; acoustic recognition phase; automated speech recognition system; bigram probabilities; database; decoding quality; deletion; directory name retrieval; dynamic alignment; grapheme to phoneme converter; insertion; letters; lexical search; name pronunciation; phone models; phonemes; pre-selection; rescoring strategy; speech units; spelled letters; spelled names; spelling variants; substitution costs; symbol sequences; telematics transaction services; telephone quality speech; training data; tree-based representation; word models; worldwide telephone network; Automatic speech recognition; Costs; Databases; Decoding; Hidden Markov models; Speech recognition; Telematics; Telephony; Testing; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Interactive Voice Technology for Telecommunications Applications, 1998. IVTTA '98. Proceedings. 1998 IEEE 4th Workshop
Conference_Location :
Torino
Print_ISBN :
0-7803-5028-6
Type :
conf
DOI :
10.1109/IVTTA.1998.727689
Filename :
727689