DocumentCode
2984641
Title
A Novel Semantic Smoothing Method Based on Higher Order Paths for Text Classification
Author
Poyraz, M. ; Kilimci, Z.H. ; Ganiz, Murat Can
Author_Institution
Comput. Eng. Dept., Dogu Univ., Istanbul, Turkey
fYear
2012
fDate
10-13 Dec. 2012
Firstpage
615
Lastpage
624
Abstract
It has been shown that Latent Semantic Indexing (LSI) takes advantage of implicit higher-order (or latent) structure in the association of terms and documents. Higher-order relations in LSI capture "latent semantics". Inspired by this, a novel Bayesian framework for classification named Higher Order Naïve Bayes (HONB), which can explicitly make use of these higher-order relations, has been introduced previously. We present a novel semantic smoothing method named Higher Order Smoothing (HOS) for the Naïve Bayes algorithm. HOS is built on a graph-based data representation similar to that of HONB, which allows semantics in higher-order paths to be exploited. Additionally, we take the concept one step further in HOS and exploit the relationships between instances of different classes in order to improve parameter estimation when labeled data is insufficient. As a result, we are able to move beyond not only instance boundaries but also class boundaries to exploit the latent information in higher-order paths. The results of our extensive experiments demonstrate the value of HOS on several benchmark datasets.
Keywords
Bayes methods; indexing; pattern classification; smoothing methods; text analysis; Bayesian framework; HONB; HOS; LSI; higher order naïve Bayes; higher order paths; higher order smoothing; implicit higher-order structure; latent semantic indexing; semantic smoothing method; text classification; Classification algorithms; Niobium; Semantics; Smoothing methods; Support vector machines; Text categorization; Training; Higher Order Naive Bayes; Higher Order Smoothing; Naive Bayes; Semantic Smoothing; Text Classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2012 IEEE 12th International Conference on
Conference_Location
Brussels
ISSN
1550-4786
Print_ISBN
978-1-4673-4649-8
Type
conf
DOI
10.1109/ICDM.2012.109
Filename
6413865