DocumentCode :
323800
Title :
Speech recognition performance on a voicemail transcription task
Author :
Padmanabhan, M. ; Eide, E. ; Ramabhadran, Bhuvana ; Ramaswamy, Ganesh ; Bahl, L.R.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
2
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
913
Abstract :
We describe a new testbed for developing speech recognition algorithms: the ARPA-sponsored voicemail transcription task, analogous to other tasks such as the Switchboard, CallHome and Hub 4 tasks. The task involves the transcription of voicemail conversations. Voicemail represents a very large volume of real-world speech data that is nevertheless not well represented in existing databases. For instance, the Switchboard and CallHome databases contain telephone conversations between two humans, representing telephone-bandwidth spontaneous speech; the Hub 4 database contains radio broadcasts, which represent different kinds of speech data such as spontaneous speech from a well-trained speaker, conversations between two humans possibly over the telephone, etc. The voicemail database also represents telephone-bandwidth spontaneous speech; the difference with respect to the Switchboard and CallHome tasks is that the interaction is not between two humans but between a human and a machine. Consequently, the speech is expected to be somewhat more formal in nature, without the problems of crosstalk, barge-in, etc. This eliminates some of the variables and provides more controlled conditions, enabling one to concentrate on the aspects of spontaneous speech and the effects of the telephone channel. We describe the modality of collection of the speech data and some algorithmic techniques that were devised based on this data. We also describe the initial results of the transcription performance on this task.
Keywords :
acoustic signal processing; natural languages; speech recognition; speech synthesis; voice mail; ARPA; CallHome database; Hub 4 task; Switchboard database; acoustic models; algorithmic techniques; language model; radio broadcasts; real-world speech data; speech data collection; speech recognition algorithms; speech recognition performance; telephone channel; telephone-bandwidth spontaneous speech; transcription performance; voicemail database; voicemail transcription; Bandwidth; Data privacy; Databases; Humans; Radio broadcasting; Speech recognition; Telephony; Testing; Voice mail
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.675414
Filename :
675414