DocumentCode :
173470
Title :
Improving relation descriptor extraction with word embeddings and cluster features
Author :
Tao Liu ; Minghui Li
Author_Institution :
Sch. of Inf., Renmin Univ. of China, Beijing, China
fYear :
2014
fDate :
5-8 Oct. 2014
Firstpage :
1271
Lastpage :
1275
Abstract :
A relation descriptor is the text string that best describes a pre-defined relation between two entities. Relation descriptors help readers understand the specific semantics linking two entities, which is valuable for knowledge base construction. Traditional relation descriptor extraction methods use nominal features, whose expressive power is limited. Word embeddings learned with deep learning techniques capture more syntactic and semantic information about words. In this paper, we apply word embeddings to the relation descriptor extraction problem. To obtain semantic word classes, we cluster words by their embeddings and use the resulting word cluster as a feature. Experimental results show that both the word embedding feature and the word cluster feature clearly improve relation descriptor extraction performance. Furthermore, the word cluster feature is more robust than the word embedding feature on this task. The best method needs 44% and 33% less training data than the basic method to reach the same performance on the two datasets.
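The word cluster feature mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm, embedding dimensionality, or number of clusters the authors used, so everything here is an assumption made for illustration: plain k-means, hand-made 2-dimensional vectors, `k=2`, and the hypothetical helper names `kmeans` and `cluster_features`.

```python
# Toy sketch: cluster words by their embedding vectors and expose the cluster
# id as a discrete feature. The vectors, k, and the use of k-means are all
# assumptions for illustration, not details taken from the paper.
from math import dist

def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic init (first k vectors as centroids)."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [min(range(k), key=lambda c: dist(v, centroids[c]))
                  for v in vectors]
        # Update step: mean of members; keep the old centroid if empty.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign

# Hypothetical 2-d "embeddings"; real ones would have many more dimensions.
embeddings = {
    "professor": (0.9, 0.1),
    "founder":   (0.2, 0.9),
    "lecturer":  (0.8, 0.2),
    "chairman":  (0.1, 0.8),
}
words = list(embeddings)
labels = kmeans([embeddings[w] for w in words], k=2)
word_cluster = dict(zip(words, labels))

def cluster_features(tokens):
    """Map each token to a 'cluster=<id>' string, or 'cluster=UNK' if unseen."""
    return [f"cluster={word_cluster[t]}" if t in word_cluster else "cluster=UNK"
            for t in tokens]
```

Replacing a dense vector with a single cluster id gives a sequence labeller a nominal feature of the same kind as its other inputs. That is one plausible reading of why the abstract reports the cluster feature as the more robust of the two.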
Keywords :
feature extraction; information retrieval; learning (artificial intelligence); pattern clustering; text analysis; vocabulary; word processing; deep learning technology; relation descriptor extraction method; semantic information; syntactic information; text string; word cluster features; word embeddings; Electronic publishing; Encyclopedias; Feature extraction; Internet; Semantics; Training data; deep learning; relation descriptor extraction; word cluster; word embeddings;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on
Conference_Location :
San Diego, CA
Type :
conf
DOI :
10.1109/SMC.2014.6974089
Filename :
6974089