DocumentCode :
2774098
Title :
Towards a Universal Text Classifier: Transfer Learning Using Encyclopedic Knowledge
Author :
Wang, Pu ; Domeniconi, Carlotta
Author_Institution :
Dept. of Comput. Sci., George Mason Univ., Fairfax, VA, USA
fYear :
2009
fDate :
6-6 Dec. 2009
Firstpage :
435
Lastpage :
440
Abstract :
Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifiers. Unfortunately, labeled data are seldom available. In this work, we propose a universal text classifier, which does not require any labeled documents. Our approach simulates the capability of people to classify documents based on background knowledge. As such, we build a classifier that can effectively group documents based on their content, under the guidance of a few words describing the classes of interest. Background knowledge is modeled using encyclopedic knowledge, namely Wikipedia. The universal text classifier can also be used to perform document retrieval. In our experiments with real data, we test the feasibility of our approach for both the classification and retrieval tasks.
Keywords :
data mining; information retrieval; text analysis; Wikipedia; background knowledge; document classification; document retrieval; encyclopedic knowledge; learning transfer; text mining; universal text classifier; Application software; Computer science; Conferences; Data mining; Testing; Text categorization; Text mining; Training data; USA Councils; Wikipedia;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5384-9
Electronic_ISBN :
978-0-7695-3902-7
Type :
conf
DOI :
10.1109/ICDMW.2009.101
Filename :
5360444