Regional Information Center for Science and Technology (RICeST) - Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech

DocumentCode :

417249

Title :

Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech

Author :

Tranter, S.E. ; Yu, K. ; Evermann, G. ; Woodland, P.C.

Author_Institution :

Dept. of Eng., Cambridge Univ., UK

Volume :

fYear :

2004

fDate :

17-21 May 2004

Lastpage :

753

Abstract :

Speech recognition systems for conversational telephone speech require the audio data to be automatically divided into regions of speech and non-speech. The quality of this audio segmentation affects the recognition accuracy. This paper describes several approaches to segmentation and compares the resulting recogniser performance. It is shown that using Gaussian mixture models outperforms an energy-detection method, and that using the output from the speech recogniser itself increases performance further. An upper bound on possible performance was obtained by deriving a segmentation from a forced alignment of the reference words, and this outperformed using manually marked word times. Finally, the correlation between an appropriately defined segmentation score and WER is shown to be over 0.95 across three data sets, suggesting that segmentations can be evaluated directly without the need for full decoding runs.

Keywords :

Gaussian distribution; error statistics; speech recognition; Gaussian mixture models; WER; audio segmentation; automatic speech recognition; conversational telephone speech; recogniser performance; recognition accuracy; upper bound; Automatic speech recognition; Data engineering; Decoding; Error analysis; Intrusion detection; Speech analysis; Speech recognition; Telephony; Timing; Upper bound;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

Conference_Location :

Montreal, Que.

ISSN :

1520-6149

Print_ISBN :

0-7803-8484-9

Type :

conf

DOI :

10.1109/ICASSP.2004.1326095

Filename :

1326095

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=417249