DocumentCode :
3718116
Title :
A new text representation model enriched with semantic relations
Author :
Aliya Nugumanova;Yerzhan Baiburin;Kurmash Apaev
Author_Institution :
Department of Information Technologies, East Kazakhstan State Technical University, Ust-Kamenogorsk, Kazakhstan
fYear :
2015
Firstpage :
619
Lastpage :
622
Abstract :
In this paper we present a novel text representation approach that employs semantic relations between words. We apply singular value decomposition to the word co-occurrence matrix to reduce its noise and sparseness. This yields a refined co-occurrence matrix in which relations between words can be measured as distances. We use these distances as correction factors for the bag-of-words text representation; in other words, we transform the text representation vectors by incorporating relations between words. To validate our representation model, we apply it to a binary classification task: we study how the model improves the classification of documents that are relevant to a given domain (topic). For this purpose, we implement a Support Vector Machine classifier and classify documents from the Reuters-21578 collection. The results of our experiments demonstrate the superiority of our model.
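The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the document-level co-occurrence counts, the rank k = 2, the use of cosine similarity as the word-relation measure, and the correction rule (multiplying the bag-of-words vector by the similarity matrix) are all assumptions made for illustration. All function names are hypothetical.

```python
import numpy as np

def cooccurrence_matrix(docs, vocab):
    """Count how often two distinct words appear in the same document (assumed window)."""
    index = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        present = sorted({index[w] for w in doc.split() if w in index})
        for i in present:
            for j in present:
                if i != j:
                    C[i, j] += 1.0
    return C

def refine_with_svd(C, k):
    """Rank-k reconstruction of C via truncated SVD, used here to smooth noise and sparseness."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def word_similarity(M):
    """Cosine similarity between rows of M (an assumed stand-in for the paper's distances)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0            # avoid division by zero for empty rows
    unit = M / norms
    S = np.clip(unit @ unit.T, 0.0, 1.0)  # keep only non-negative relatedness
    np.fill_diagonal(S, 1.0)              # every word is fully related to itself
    return S

def bag_of_words(doc, vocab):
    """Plain term-count vector over the fixed vocabulary."""
    index = {w: i for i, w in enumerate(vocab)}
    v = np.zeros(len(vocab))
    for w in doc.split():
        if w in index:
            v[index[w]] += 1.0
    return v

def enrich(bow, S):
    """Assumed correction rule: spread each term's count onto related terms."""
    return bow @ S

# Toy corpus (hypothetical data, two loose topics).
docs = [
    "ship port cargo",
    "ship vessel port",
    "sugar cane crop",
    "sugar crop export",
]
vocab = sorted({w for d in docs for w in d.split()})

C = cooccurrence_matrix(docs, vocab)
S = word_similarity(refine_with_svd(C, k=2))

bow = bag_of_words("ship port", vocab)
enriched = enrich(bow, S)

for word, raw, new in zip(vocab, bow, enriched):
    print(f"{word:8s} raw={raw:.1f} enriched={new:.3f}")
```

In this sketch, words that never occur in a document can still receive weight if they are related to words that do, which is the general effect the abstract attributes to the enriched representation. The enriched vectors could then be passed to any SVM implementation in place of the raw counts.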
Keywords :
"Matrix decomposition","Marine vehicles","Sugar","Support vector machines","Data models"
Publisher :
IEEE
Conference_Titel :
Control, Automation and Systems (ICCAS), 2015 15th International Conference on
ISSN :
2093-7121
Type :
conf
DOI :
10.1109/ICCAS.2015.7364992
Filename :
7364992