Title :
Graph based Partially Supervised Learning of documents
Author :
Sheng, Lingyan ; Ortega, Antonio
Author_Institution :
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
We propose a novel graph-based algorithm, Graph-Partially Supervised Learning (Graph-PSL), to solve the problem of document classification with positive and unlabeled documents. The key characteristic of the problem is that no labeled negative documents are available. We present a graph-based method to identify reliable negative documents and explain it theoretically using a lazy information transfer network. The documents are then classified by a Transductive Support Vector Machine (TSVM), which can exploit the information contained in unlabeled data. We explain how the similarity matrix of the graph and the kernel matrix in the TSVM are calculated. We apply Graph-PSL to the 20 Newsgroups dataset. The experimental results demonstrate that Graph-PSL identifies negative documents accurately and classifies the unlabeled ones more effectively and more robustly than Bayesian-based algorithms.
Keywords :
classification; graph theory; learning (artificial intelligence); matrix algebra; support vector machines; text analysis; Graph-PSL; SVM; document classification; graph based partially supervised learning; kernel matrix; lazy information transfer network; similarity matrix; transductive support vector machine; Accuracy; Kernel; Niobium; Reliability; Support vector machines; Training; Vectors; Partially Supervised Learning; Spectral Graph Theory; Text Classification; Transductive Support Vector Machine;
Conference_Title :
Machine Learning for Signal Processing (MLSP), 2011 IEEE International Workshop on
Conference_Location :
Santander
Print_ISBN :
978-1-4577-1621-8
Electronic_ISBN :
1551-2541
DOI :
10.1109/MLSP.2011.6064566