DocumentCode :
1796921
Title :
Sentence boundary detection in chinese broadcast news using conditional random fields and prosodic features
Author :
Chenglin Xu ; Lei Xie ; Zhonghua Fu
Author_Institution :
Shaanxi Provincial Key Lab. of Speech & Image Inf. Process., Northwestern Polytech. Univ., Xi'an, China
fYear :
2014
fDate :
9-13 July 2014
Firstpage :
37
Lastpage :
41
Abstract :
This paper studies the use of conditional random fields (CRF) and prosodic features for sentence boundary detection in Chinese broadcast news. Previous approaches mostly use first-order CRF and ignore important contextual and sequential information. In this paper, we explore high-order CRF models to make full use of the contextual and sequential information. Moreover, we show the effectiveness of CRF in sentence boundary detection by comparing it with various competitive models. In previous approaches, the prosodic feature set is usually designed to be as exhaustive as possible. As a result, features may be highly correlated and some of them may not be effective. In this paper, we use a correlation-based feature selection method to select a subset of the most useful features. Finally, the use of prosodic features, e.g., pitch, in Chinese sentence segmentation deserves further investigation because the tonal aspect of Chinese may complicate the expression of pitch features. In this paper, we study the effectiveness of the prosodic features and rank their importance by an analysis of feature usage.
Keywords :
natural language processing; speech recognition; CRF; Chinese broadcast news; Chinese sentence segmentation; conditional random fields; context information; correlation-based feature selection method; feature usage analysis; prosodic features; sentence boundary detection; sequential information; Context; Correlation; Feature extraction; Hidden Markov models; Niobium; Speech; Support vector machines; conditional random field; feature selection; sentence boundary detection; sentence segmentation; speech prosody;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing (ChinaSIP), 2014 IEEE China Summit & International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-4799-5401-8
Type :
conf
DOI :
10.1109/ChinaSIP.2014.6889197
Filename :
6889197