• DocumentCode
    3718116
  • Title
    A new text representation model enriched with semantic relations
  • Author
    Aliya Nugumanova; Yerzhan Baiburin; Kurmash Apaev
  • Author_Institution
    Department of Information Technologies, East Kazakhstan State Technical University, Ust-Kamenogorsk, Kazakhstan
  • fYear
    2015
  • Firstpage
    619
  • Lastpage
    622
  • Abstract
    In this paper we present a novel text representation approach that exploits semantic relations between words. We apply singular value decomposition to the word co-occurrence matrix to reduce its noise and sparseness. This yields a refined co-occurrence matrix in which relations between words can be measured as distances. We use these distances as correction factors for the Bag-of-words text representation; in other words, we transform the text representation vectors by incorporating relations between words. To validate our representation model, we apply it to a binary classification task and study how it improves the classification of documents that are relevant to a given domain (topic). For this purpose, we implement a Support Vector Machine classifier and classify documents from the Reuters-21578 collection. The results of our experiments demonstrate the superiority of our model.
  • Keywords
    "Matrix decomposition","Marine vehicles","Sugar","Support vector machines","Data models"
  • Publisher
    IEEE
  • Conference_Titel
    Control, Automation and Systems (ICCAS), 2015 15th International Conference on
  • ISSN
    2093-7121
  • Type
    conf
  • DOI
    10.1109/ICCAS.2015.7364992
  • Filename
    7364992
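
The pipeline outlined in the abstract (co-occurrence matrix, singular value decomposition, word relations used as correction factors for Bag-of-words vectors) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy corpus, the document-level co-occurrence counts, the rank of 2, the use of cosine similarity between rows of the refined matrix as the relation measure, and the function names `cooccurrence_matrix`, `refined_similarity`, `bag_of_words`, and `enrich`. The classification step is omitted; the enriched vectors would be passed to a Support Vector Machine.

```python
import numpy as np


def cooccurrence_matrix(docs, vocab):
    """Count, for each word pair, the documents in which both words occur."""
    index = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        present = sorted({index[w] for w in doc if w in index})
        for i in present:
            for j in present:
                if i != j:
                    C[i, j] += 1.0
    return C


def refined_similarity(C, rank):
    """Rank-k SVD reconstruction of C, then cosine similarity between its rows.

    Assumption: cosine similarity stands in for the paper's distance measure.
    """
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    C_k = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    norms = np.linalg.norm(C_k, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = C_k / norms
    S = unit @ unit.T
    np.fill_diagonal(S, 1.0)  # each word is fully related to itself
    return np.clip(S, 0.0, 1.0)


def bag_of_words(doc, vocab):
    """Plain term-count vector over the vocabulary."""
    index = {w: i for i, w in enumerate(vocab)}
    x = np.zeros(len(vocab))
    for w in doc:
        if w in index:
            x[index[w]] += 1.0
    return x


def enrich(bow, S):
    """Spread each term's weight onto related terms: x' = x S."""
    return bow @ S


# Hypothetical toy corpus with two topics (shipping and sugar trade).
docs = [
    ["ship", "port", "cargo"],
    ["ship", "vessel", "port"],
    ["vessel", "cargo", "port"],
    ["sugar", "cane", "export"],
    ["sugar", "export", "price"],
]
vocab = sorted({w for doc in docs for w in doc})

C = cooccurrence_matrix(docs, vocab)
S = refined_similarity(C, rank=2)

plain = bag_of_words(["ship"], vocab)
enriched = enrich(plain, S)
for word, before, after in zip(vocab, plain, enriched):
    print(f"{word:8s} {before:.2f} -> {after:.2f}")
```

In this sketch a document containing only "ship" receives a positive weight for "vessel" after enrichment, while terms from the unrelated topic stay near zero, which is the effect the correction factors are meant to achieve.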