Regional Information Center for Science and Technology - A study on cross-language knowledge integration in Mandarin LVCSR

DocumentCode :

3125164

Title :

A study on cross-language knowledge integration in Mandarin LVCSR

Author :

Chen-Yu Chiang ; Siniscalchi, Sabato Marco ; Yih-Ru Wang ; Sin-Horng Chen ; Chin-Hui Lee

Author_Institution :

Dept. of Electr. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan

fYear :

2012

fDate :

5-8 Dec. 2012

Firstpage :

315

Lastpage :

319

Abstract :

We present a cross-language knowledge integration framework to improve performance in large vocabulary continuous speech recognition (LVCSR). Two knowledge sources, manner-of-articulation attributes and prosodic structure, are incorporated. For manner of articulation, cross-lingual attribute detectors trained on an American English corpus (WSJ0) are used to verify and rescore hypothesized Mandarin syllables in word lattices obtained with state-of-the-art systems. For prosodic structure, models trained on a Mandarin corpus (TCC300) with an unsupervised joint prosody labeling and modeling technique are used in lattice rescoring. Experimental results on Mandarin syllable, character and word recognition with the TCC300 corpus show that the proposed approach significantly outperforms a baseline system that does not use articulatory and prosodic information. The results also demonstrate the potential of using cross-lingual attribute detectors as a language-universal front end for automatic speech recognition.
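
The rescoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple log-linear combination over a short hypothesis list instead of a full word lattice, and the class name, field names, weights and scores are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One candidate syllable sequence with per-source log scores (all hypothetical)."""
    syllables: List[str]
    baseline_logp: float   # first-pass acoustic + language model log score
    attribute_logp: float  # log score from manner-attribute verification
    prosody_logp: float    # log score from a prosodic model


def rescore(hyps: List[Hypothesis], w_attr: float = 0.5, w_pros: float = 0.5) -> List[Hypothesis]:
    """Rank hypotheses by a weighted sum of log scores (assumed log-linear combination)."""
    def combined(h: Hypothesis) -> float:
        return h.baseline_logp + w_attr * h.attribute_logp + w_pros * h.prosody_logp
    return sorted(hyps, key=combined, reverse=True)


# Toy scores: the baseline alone prefers the second hypothesis,
# but the extra knowledge sources favor the first one.
hyps = [
    Hypothesis(["ni3", "hao3"], baseline_logp=-10.0, attribute_logp=-4.0, prosody_logp=-3.0),
    Hypothesis(["li3", "hao3"], baseline_logp=-9.5, attribute_logp=-8.0, prosody_logp=-3.5),
]
best = rescore(hyps)[0]
print(best.syllables)
```

In this toy case the attribute and prosody scores overturn the first-pass ranking, which is the kind of effect lattice rescoring with additional knowledge sources aims for. The weights would normally be tuned on held-out data.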

Keywords :

character recognition; speech; speech recognition; American English corpus; Mandarin LVCSR; Mandarin corpus; TCC300 corpus; WSJ0 corpus; articulation manner; automatic speech recognition; cross-language knowledge integration framework; cross-lingual attribute detectors; hypothesized Mandarin syllables; knowledge sources; language-universal frontend; large vocabulary continuous speech recognition; lattice rescoring; manner attribute; prosodic structure; unsupervised joint prosody labeling and modeling technique; word lattices; word recognition; Acoustics; Detectors; Hidden Markov models; Lattices; Pragmatics; LVCSR; attribute detector; knowledge integration; prosody modeling

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Conference_Location :

Kowloon

Print_ISBN :

978-1-4673-2506-6

Electronic_ISBN :

978-1-4673-2505-9

Type :

conf

DOI :

10.1109/ISCSLP.2012.6423528

Filename :

6423528

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3125164