Regional Information Center for Science and Technology (RICEST) - Phrase-level transduction model with reordering for spoken to written language transformation

DocumentCode :

3166494

Title :

Phrase-level transduction model with reordering for spoken to written language transformation

Author :

Xu, Ping ; Fung, Pascale ; Chan, Ricky

Author_Institution :

Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4965

Lastpage :

4968

Abstract :

This paper proposes a first-ever phrase-level transduction model with reordering to transform colloquial speech directly into written-style transcription. The model is capable of performing n-m transductions and is trained from a parallel corpus of verbatim and written-style transcriptions. Deletions, substitutions, and insertions are well represented by this model, and inversion transduction cases can also be identified and represented. We implement the transduction model using weighted finite-state transducers (WFSTs) and integrate it into a WFST-based speech recognition search space to give both verbatim speaking-style and written-style transcriptions. Evaluations of our model on Cantonese speech to standard written Chinese show an 11.59% relative Word Error Rate (WER) reduction over an interpolated language model between Cantonese and standard Chinese speech, and a 5.72% relative WER reduction and a 14.82% relative Bilingual Evaluation Understudy (BLEU) improvement over the word-level transduction model.

Keywords :

natural language processing; speech recognition; BLEU; Cantonese speech; Chinese speech; WER reduction; WFST-based speech recognition search space; bilingual evaluation understudy; colloquial speech transform; first-ever phrase-level transduction model; inversion transduction; n-m transductions; verbatim speaking-style transcriptions; verbatim transcription parallel corpus; weighted finite-state transducers; word error rate reduction; word-level transduction model; written language transformation; written-style transcription; Computational modeling; Decoding; Hidden Markov models; Speech; Speech recognition; Standards; Transducers; WFST; phrase-level transduction; reordering; spoken to written language transformation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6289034

Filename :

6289034

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3166494