Title :
Combining active learning and relevance vector machines for text classification
Author :
Silva, C. ; Ribeiro, B.
Author_Institution :
Polytech. Inst. of Leiria, Leiria
Abstract :
Relevance vector machines (RVM) have proven successful in many learning tasks. However, they scale poorly in large applications. In many settings, a large amount of unlabeled data is available, from which a learner can actively choose examples to integrate into the learning procedure. The idea is to improve performance while reducing the cost of data categorization. In this paper we propose an active learning RVM method based on the kernel trick. The underlying idea is to define a working space between the relevance vectors (RV) initially obtained from a small labeled data set and the new unlabeled examples, in which the most informative instances are chosen. Kernel distance metrics make it possible to define such a space and to add the more informative examples to the training set, which improves performance without significantly increasing the problem dimension. We detail the proposed method and give illustrative examples on the Reuters-21578 benchmark. The results show improved performance and scalability.
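The selection step described in the abstract can be illustrated with a minimal sketch. It relies on the standard identity for distance in a kernel-induced feature space, d(x, z)^2 = k(x, x) - 2 k(x, z) + k(z, z). Everything else is an assumption made for illustration and is not taken from the paper: the RBF kernel and its gamma value, the function names, and above all the ranking rule, which treats the unlabeled examples farthest from every relevance vector as the most informative. The abstract does not state the actual selection criterion.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel. The kernel and gamma are illustrative choices."""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq)


def kernel_distance(x, z, kernel=rbf_kernel):
    """Distance in the kernel-induced feature space:
    d(x, z)^2 = k(x, x) - 2 k(x, z) + k(z, z)."""
    sq = kernel(x, x) - 2.0 * kernel(x, z) + kernel(z, z)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors


def select_informative(relevance_vectors, unlabeled, n_select, kernel=rbf_kernel):
    """Return indices of the n_select unlabeled examples that lie farthest
    from their nearest relevance vector.

    ASSUMPTION: 'farthest from every RV' stands in for 'most informative';
    the paper's own criterion may differ.
    """
    scores = []
    for i, u in enumerate(unlabeled):
        nearest = min(kernel_distance(u, rv, kernel) for rv in relevance_vectors)
        scores.append((nearest, i))
    scores.sort(reverse=True)  # largest nearest-RV distance first
    return [i for _, i in scores[:n_select]]


# Hypothetical toy data: two relevance vectors, three unlabeled points.
rvs = [(0.0, 0.0), (1.0, 1.0)]
pool = [(0.1, 0.0), (3.0, 3.0), (1.0, 0.9)]
chosen = select_informative(rvs, pool, n_select=1)
```

In an active learning loop, the chosen examples would be labeled, added to the training set, and the RVM retrained. Because only distances to the current relevance vectors are computed, the cost of each selection round grows with the number of RVs rather than with the full training set.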
Keywords :
learning (artificial intelligence); pattern classification; support vector machines; text analysis; Reuters-21578 benchmark; active learning; data categorization; kernel distance metrics; kernel trick; relevance vector machines; text classification; Conference management; Costs; Extraterrestrial measurements; Kernel; Machine learning; Scalability; Support vector machine classification; Support vector machines; Technology management; Text categorization
Conference_Title :
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location :
Cincinnati, OH
Print_ISBN :
978-0-7695-3069-7
DOI :
10.1109/ICMLA.2007.72