DocumentCode
3718116
Title
A new text representation model enriched with semantic relations
Author
Aliya Nugumanova;Yerzhan Baiburin;Kurmash Apaev
Author_Institution
Department of Information Technologies, East Kazakhstan State Technical University, Ust-Kamenogorsk, Kazakhstan
fYear
2015
Firstpage
619
Lastpage
622
Abstract
In this paper we present a novel text representation that incorporates semantic relations between words. We apply singular value decomposition to the word co-occurrence matrix to reduce its noise and sparseness. This yields a refined co-occurrence matrix in which relations between words can be measured as distances. We use these distances as correction factors for the Bag-of-words text representation; in other words, we transform the text representation vectors by incorporating relations between words. To validate the representation model, we apply it to a binary classification task and study how it improves the classification of documents that are relevant to a given domain (topic). For this purpose, we use a Support Vector Machine classifier and classify documents from the Reuters-21578 collection. The results of our experiments demonstrate the superiority of our model.
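The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: document-level co-occurrence counts, a rank-k SVD reconstruction as the "refined" matrix, cosine similarity between rows as a stand-in for the word distances, and a matrix product as the correction rule. The function names (cooccurrence, refine, relatedness, bag_of_words, enrich) are hypothetical and not taken from the paper.

```python
import numpy as np


def cooccurrence(docs, vocab):
    """Count how often two vocabulary words appear in the same document."""
    index = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        present = sorted({index[w] for w in doc.split() if w in index})
        for i in present:
            for j in present:
                if i != j:
                    C[i, j] += 1.0
    return C


def refine(C, k):
    """Rank-k reconstruction of C via truncated SVD (assumed smoothing step)."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def relatedness(R):
    """Cosine similarity between rows of R, clipped to [0, 1].

    Assumption: similarity is used here in place of the paper's distances.
    """
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = R / norms
    S = np.clip(unit @ unit.T, 0.0, 1.0)
    np.fill_diagonal(S, 1.0)  # every word is fully related to itself
    return S


def bag_of_words(doc, vocab):
    """Plain term-count vector over the given vocabulary."""
    index = {w: i for i, w in enumerate(vocab)}
    v = np.zeros(len(vocab))
    for w in doc.split():
        if w in index:
            v[index[w]] += 1.0
    return v


def enrich(bow, S):
    """Spread each term's weight onto related terms (hypothetical correction rule)."""
    return S @ bow


# Toy corpus, invented for illustration only.
docs = [
    "ship port cargo",
    "ship vessel port",
    "sugar cane price",
    "sugar price export",
]
vocab = sorted({w for d in docs for w in d.split()})
S = relatedness(refine(cooccurrence(docs, vocab), k=2))
enriched = enrich(bag_of_words("ship port", vocab), S)
```

In this sketch the enriched vector gives nonzero weight to words that never occur in the document but are related to words that do. Vectors of this kind could then be passed to any SVM implementation for the binary relevant/irrelevant classification the abstract describes.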
Keywords
"Matrix decomposition","Marine vehicles","Sugar","Support vector machines","Data models"
Publisher
IEEE
Conference_Titel
Control, Automation and Systems (ICCAS), 2015 15th International Conference on
ISSN
2093-7121
Type
conf
DOI
10.1109/ICCAS.2015.7364992
Filename
7364992