DocumentCode :
527796
Title :
Robust character based tagging with domain lexical features for Chinese spoken language understanding
Author :
Bao, Changchun ; Li, Yali ; Li, Ta ; Pan, Jielin ; Yan, Yonghong
Author_Institution :
ThinkIT Lab., Chinese Acad. of Sci., Beijing, China
Volume :
7
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
3410
Lastpage :
3414
Abstract :
Word information is useful in natural language understanding (NLU), but in Chinese language processing it is not given naturally. While word segmentation works well for text in NLU, it degrades Chinese spoken language understanding (SLU) because of the flexibility and distortion of spoken utterances, compounded by ASR errors. This paper proposes a novel approach, sub-word features, that makes use of word information to help understand spoken utterances while retaining the robustness of character-wise processing. With this approach, a named entity list can also be used effectively to improve SLU performance. Experiments show that the sub-word features give an average improvement of 0.7 for ASR, and that the use of the named entity list gives an average improvement of 4.7.
Keywords :
natural language processing; text analysis; word processing; ASR error; Chinese spoken language; character wise processing; domain lexical feature; robust character based tagging; spoken utterance understanding; sub word feature; word information; word segmentation; Noise; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Computation (ICNC), 2010 Sixth International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5958-2
Type :
conf
DOI :
10.1109/ICNC.2010.5584353
Filename :
5584353