DocumentCode :
3166494
Title :
Phrase-level transduction model with reordering for spoken to written language transformation
Author :
Xu, Ping ; Fung, Pascale ; Chan, Ricky
Author_Institution :
Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4965
Lastpage :
4968
Abstract :
This paper proposes the first phrase-level transduction model with reordering to transform colloquial speech directly into written-style transcription. The model can perform n-m transductions and is trained on a parallel corpus of verbatim and written-style transcriptions. It represents deletions, substitutions, and insertions well, and it can also identify and represent inversion transduction cases. We implement the transduction model using weighted finite-state transducers (WFSTs) and integrate it into a WFST-based speech recognition search space to give both verbatim speaking-style and written-style transcriptions. Evaluations of our model on Cantonese speech to standard written Chinese show an 11.59% relative Word Error Rate (WER) reduction over an interpolated language model between Cantonese and standard Chinese speech, and a 5.72% relative WER reduction and a 14.82% relative Bilingual Evaluation Understudy (BLEU) improvement over the word-level transduction model.
Keywords :
natural language processing; speech recognition; BLEU; Cantonese speech; Chinese speech; WER reduction; WFST-based speech recognition search space; bilingual evaluation understudy; colloquial speech transform; first-ever phrase-level transduction model; inversion transduction; n-m transductions; verbatim speaking-style transcriptions; verbatim transcription parallel corpus; weighted finite-state transducers; word error rate reduction; word-level transduction model; written language transformation; written-style transcription; Computational modeling; Decoding; Hidden Markov models; Speech; Speech recognition; Standards; Transducers; WFST; phrase-level transduction; reordering; spoken to written language transformation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6289034
Filename :
6289034