Title :
Robust character based tagging with domain lexical features for Chinese spoken language understanding
Author :
Bao, Changchun ; Li, Yali ; Li, Ta ; Pan, Jielin ; Yan, Yonghong
Author_Institution :
ThinkIT Lab., Chinese Acad. of Sci., Beijing, China
Abstract :
Word information is useful in natural language understanding (NLU). In Chinese language processing, however, word information is not given naturally. While word segmentation works well for text in NLU, it degrades Chinese spoken language understanding (SLU) because of the flexibility and distortion of spoken utterances, plus ASR errors. This paper proposes a novel approach, sub-word features, to make use of word information and help understand spoken utterances while retaining the robustness of character-wise processing. With this approach, a named entity list can also be used effectively to improve SLU performance. Experiments show that the sub-word features give an average improvement of 0.7 for ASR, and the use of the named entity list gives an average improvement of 4.7.
Keywords :
natural language processing; text analysis; word processing; ASR error; Chinese spoken language; character wise processing; domain lexical feature; robust character based tagging; spoken utterance understanding; sub word feature; word information; word segmentation; Noise; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data;
Conference_Title :
Natural Computation (ICNC), 2010 Sixth International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5958-2
DOI :
10.1109/ICNC.2010.5584353