Title :
Exploring Neighborhood Influence in Text Classification
Author :
Le, Nam Do-Hoang ; Tran, Thai-Son ; Tran, Minh-Triet
Author_Institution :
Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City, Vietnam
Abstract :
Standard supervised learning approaches have been widely applied to the text classification problem. These approaches exploit only the local content of each document. However, the relationships between the items carry additional information that can be used to improve the overall accuracy of the classification process. To make use of this information, the authors propose a statistical model that captures both the contents and the labels from each link in the neighborhood. This link model is then combined with a Markov Random Field model to form the soft labeling model for text classification. The new approach thus combines the local content with the influence of the neighborhood. The results of the soft labeling model on standard data sets are promising. Moreover, the new model can be applied not only to the text classification problem but also to many kinds of richly structured data sets.
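The idea of combining local content with neighborhood influence can be illustrated with a minimal sketch. This is not the authors' link model or their Markov Random Field formulation, neither of which is specified in this abstract; it is a simple relaxation-style averaging, swapped in for illustration. The function name `soft_label`, the `weight` and `iterations` parameters, and the example documents are all hypothetical.

```python
def soft_label(local_probs, neighbors, weight=0.5, iterations=10):
    """Blend each document's local class probabilities with the mean
    soft labels of its linked neighbors, for a fixed number of rounds.

    Assumption: a plain weighted average stands in for the paper's
    link model and MRF inference, which the abstract does not describe.
    """
    # Start from the content-only (local) predictions.
    labels = {doc: dict(probs) for doc, probs in local_probs.items()}
    for _ in range(iterations):
        updated = {}
        for doc, local in local_probs.items():
            linked = neighbors.get(doc, [])
            if not linked:
                # No links: keep the content-only prediction.
                updated[doc] = dict(local)
                continue
            mixed = {}
            for cls, p_local in local.items():
                # Mean of the neighbors' current soft labels for this class.
                p_nbr = sum(labels[n][cls] for n in linked) / len(linked)
                mixed[cls] = (1 - weight) * p_local + weight * p_nbr
            total = sum(mixed.values())
            updated[doc] = {cls: v / total for cls, v in mixed.items()}
        labels = updated
    return labels


# Hypothetical example: document "c" is ambiguous on content alone,
# but it is linked to two documents that are confidently labeled "ml".
local_probs = {
    "a": {"ml": 0.9, "db": 0.1},
    "b": {"ml": 0.8, "db": 0.2},
    "c": {"ml": 0.45, "db": 0.55},
}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
result = soft_label(local_probs, neighbors)
```

In this toy setup, the content-only prediction for "c" leans toward "db", while the blended soft label leans toward "ml" because both of its neighbors do. The `weight` parameter controls how strongly the neighborhood overrides local content.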
Keywords :
Markov processes; learning (artificial intelligence); pattern classification; random processes; set theory; statistical analysis; text analysis; Markov random field model; accuracy improvement; document local contents; link model; soft-labeling model; standard data sets; standard supervised learning approaches; statistical model; text classification problem; Accuracy; Computational modeling; Correlation; Labeling; Logistics; Support vector machines; Text categorization; bibliological networks; document categorization; graphical model;
Conference_Title :
Knowledge and Systems Engineering (KSE), 2012 Fourth International Conference on
Conference_Location :
Danang
Print_ISBN :
978-1-4673-2171-6
DOI :
10.1109/KSE.2012.35