DocumentCode
686514
Title
Auto-clustering of conversation corpus based on syntactic, semantic and pragmatic features
Author
Baojian Chen ; Minghu Jiang
Author_Institution
Sch. of Humanities, Tsinghua Univ., Beijing, China
fYear
2013
fDate
22-25 Nov. 2013
Firstpage
295
Lastpage
300
Abstract
To understand natural language accurately, we need not only morphological and syntactic analysis but also semantic knowledge and pragmatic information tied to a specific context. Because conversation corpora offer limited knowledge and lack pragmatics-related background information, computers are still a long way from fully understanding natural language. In this paper, pragmatic features are added to the text vector space model of spoken conversation, and hierarchical clustering is performed. Our experimental results show that clustering with pragmatic features outperforms clustering without them: precision, recall and F-value increase by 6.67%, 6.34% and 6.6%, respectively. This indicates that pragmatic information plays an important role in improving text clustering.
Keywords
natural language processing; pattern clustering; programming language semantics; text analysis; conversation corpus auto-clustering; hierarchical clustering; language spoken conversation; natural language morphology; pragmatic feature; pragmatic information; semantic feature; semantic knowledge; syntactic analysis; syntactic feature; text clustering effect enhancement; text vector space model
fLanguage
English
Publisher
IET
Conference_Titel
Wireless, Mobile and Multimedia Networks (ICWMMN 2013), 5th IET International Conference on
Conference_Location
Beijing
Electronic_ISBN
978-1-84919-726-7
Type
conf
DOI
10.1049/cp.2013.2428
Filename
6827845