DocumentCode :
530247
Title :
On semi-supervised learning of Dirichlet Mixture Models for Web content classification
Author :
Bai, JingHua ; Li, Xiaoping ; Zhang, Xiaoxian
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
Volume :
2
fYear :
2010
fDate :
17-19 Sept. 2010
Abstract :
This paper presents a method for designing a semi-supervised classifier trained on labeled and unlabeled instances. We explore the trade-off between maximizing a discriminative likelihood of the labeled data and a generative likelihood of the labeled and unlabeled data. Mixture models are a flexible model family whose uses include generative modeling and density estimation. This paper investigates semi-supervised learning of mixture models using a unified objective function that takes both labeled and unlabeled data into account. We conducted experiments on the WebKB and 20NEWSGROUPS datasets. The results show that using unlabeled data improves classification accuracy over the supervised model.
Keywords :
Internet; data mining; learning (artificial intelligence); maximum likelihood estimation; pattern classification; Dirichlet mixture model; Web content; density estimation; discriminative likelihood maximization; generative model; hybrid classifier; semisupervised learning; unified objective function; Argon; Training; World Wide Web; mixture model; semi-supervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Educational and Information Technology (ICEIT), 2010 International Conference on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-8033-3
Electronic_ISBN :
978-1-4244-8035-7
Type :
conf
DOI :
10.1109/ICEIT.2010.5607590
Filename :
5607590