DocumentCode :
2358050
Title :
Learning name pronunciations in automatic speech recognition systems
Author :
Beaufays, Françoise ; Sankar, Ananth ; Williams, Shaun ; Weintraub, Mitch
Author_Institution :
Nuance Commun., Menlo Park, CA, USA
fYear :
2003
fDate :
3-5 Nov. 2003
Firstpage :
233
Lastpage :
240
Abstract :
Many speech recognition systems that provide over-the-phone services, e.g. name dialers, stock quote providers, location finders, rely on the accurate recognition of proper names. For this to happen, the systems need to know how their users will pronounce these words. However, predicting the pronunciation of a proper name is a notoriously difficult problem, as it depends on the origin of the name, the linguistic background of the speaker, and other cultural and sociological factors, in addition, of course, to the word spelling. In this paper, we describe a data-driven method that learns proper name pronunciations from audio samples of these words. The algorithm relies on the machinery of a general purpose speech recognizer to find the phone sequence that best matches the sample speech waveforms. In addition, it incorporates linguistic knowledge automatically acquired from a pronunciation dictionary to ensure that the learned pronunciations are "reasonable" from a linguistic viewpoint. We show on a corporate name dialing database that the proposed algorithm reduces the call routing error rate by 40% compared to a reference letter-to-phone pronunciation engine.
Keywords :
error analysis; learning (artificial intelligence); linguistics; speech recognition; automatic speech recognition; call routing error; corporate name dialing database; data-driven method; letter-to-phone pronunciation engine; linguistic knowledge; location finder; name dialer; name pronunciation; over-the-phone services; phone sequence; pronunciation dictionary; proper name pronunciation; proper name recognition; speech waveform; stock quote provider; word spelling; Automatic speech recognition; Cultural differences; Databases; Dictionaries; Engines; Error analysis; Hidden Markov models; Machinery; Routing; Speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 2003. Proceedings. 15th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2038-3
Type :
conf
DOI :
10.1109/TAI.2003.1250196
Filename :
1250196