• DocumentCode
    2984641
  • Title
    A Novel Semantic Smoothing Method Based on Higher Order Paths for Text Classification
  • Author
    Poyraz, M. ; Kilimci, Z.H. ; Ganiz, Murat Can
  • Author_Institution
    Comput. Eng. Dept., Dogu Univ., Istanbul, Turkey
  • fYear
    2012
  • fDate
    10-13 Dec. 2012
  • Firstpage
    615
  • Lastpage
    624
  • Abstract
    It has been shown that Latent Semantic Indexing (LSI) takes advantage of implicit higher-order (or latent) structure in the association of terms and documents. Higher-order relations in LSI capture "latent semantics". Inspired by this, a novel Bayesian framework for classification named Higher Order Naïve Bayes (HONB), which can explicitly make use of these higher-order relations, has been introduced previously. We present a novel semantic smoothing method named Higher Order Smoothing (HOS) for the Naïve Bayes algorithm. HOS is built on a graph-based data representation similar to that of HONB, which allows semantics in higher-order paths to be exploited. Additionally, we take the concept one step further in HOS and exploit the relationships between instances of different classes in order to improve parameter estimation when labeled data is insufficient. As a result, we are able to move beyond not only instance boundaries but also class boundaries to exploit the latent information in higher-order paths. The results of our extensive experiments demonstrate the value of HOS on several benchmark datasets.
  • Keywords
    Bayes methods; indexing; pattern classification; smoothing methods; text analysis; Bayesian framework; HONB; HOS; LSI; higher order naïve Bayes; higher order paths; higher order smoothing; implicit higher-order structure; latent semantic indexing; semantic smoothing method; text classification; Classification algorithms; Niobium; Semantics; Smoothing methods; Support vector machines; Text categorization; Training; Higher Order Naive Bayes; Higher Order Smoothing; Naive Bayes; Semantic Smoothing; Text Classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2012 IEEE 12th International Conference on
  • Conference_Location
    Brussels
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4673-4649-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2012.109
  • Filename
    6413865