Regional Information Center for Science and Technology - A comparative study on system combination schemes for LVCSR

DocumentCode :

2800155

Title :

A comparative study on system combination schemes for LVCSR

Author :

Ma, Chengyuan ; Kuo, Hong-Kwang Jeff ; Soltau, Hagen ; Cui, Xiaodong ; Chaudhari, Upendra ; Mangu, Lidia ; Lee, Chin-Hui

Author_Institution :

Sch. of ECE, Georgia Inst. of Technol., Atlanta, GA, USA

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

4394

Lastpage :

4397

Abstract :

We present a comparative study on combination schemes for large vocabulary continuous speech recognition (LVCSR) by incorporating long-span class posterior probability features into conventional short-time cepstral features. System combination can improve overall speech recognition performance when multiple systems exhibit different error patterns and multiple knowledge sources encode complementary information. A variety of combination approaches are investigated in this paper, e.g., a feature-concatenation single-stream system, a model-combination multi-stream system, lattice rescoring, and ROVER. These techniques work at different levels of an LVCSR system and have different computational costs. We compared their performance and analyzed their advantages and disadvantages on large vocabulary English broadcast news transcription tasks. Experimental results showed that model combination with independent tree consistently outperforms ROVER, feature concatenation, and lattice rescoring. In addition, the phoneme posterior probability features do provide complementary information to short-time cepstral features.
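
As a rough illustration of two of the schemes named in the abstract, the sketch below contrasts frame-level feature concatenation with a weighted log-likelihood combination of streams. It is a generic textbook formulation, not the authors' implementation: the function names, the stream weights, and the toy values are all assumptions made for illustration.

```python
from typing import List, Sequence


def concatenate_features(
    cepstral: Sequence[Sequence[float]],
    posterior: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Feature concatenation: join the two feature vectors frame by frame,
    so a single-stream model sees one longer vector per frame."""
    if len(cepstral) != len(posterior):
        raise ValueError("both feature streams must have the same number of frames")
    return [list(c) + list(p) for c, p in zip(cepstral, posterior)]


def combine_stream_scores(
    log_likelihoods: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Multi-stream combination: weighted sum of per-stream log-likelihoods
    for one frame and one state (weights are hypothetical tuning parameters)."""
    if len(log_likelihoods) != len(weights):
        raise ValueError("one weight is required per stream")
    return sum(w * ll for w, ll in zip(weights, log_likelihoods))


# Toy usage: one frame with 2 cepstral and 2 posterior dimensions.
joined = concatenate_features([[1.0, 2.0]], [[0.3, 0.7]])
score = combine_stream_scores([-10.0, -20.0], [0.5, 0.5])
```

In the first function the combination happens before modeling; in the second, each stream keeps its own model and only the scores are merged, which is why the two schemes operate at different levels of the recognizer.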

Keywords :

feature extraction; probability; speech recognition; vocabulary; LVCSR; ROVER; cepstral features; error patterns; feature concatenation; independent tree; large vocabulary continuous speech recognition; lattice rescoring; long-span class posterior probability; multiple knowledge sources encode; system combination schemes; Automatic speech recognition; Broadcasting; Cepstral analysis; Hidden Markov models; Lattices; Linear discriminant analysis; Mel frequency cepstral coefficient; Speech recognition; Vectors; Vocabulary; ROVER; feature concatenation; lattice rescoring; model combination; multi-stream; system combination;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495627

Filename :

5495627

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2800155