Title :
On semi-supervised learning of Dirichlet Mixture Models for Web content classification
Author :
Bai, JingHua ; Li, Xiaoping ; Zhang, Xiaoxian
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
Abstract :
This paper presents a method for designing a semi-supervised classifier trained on both labeled and unlabeled instances. We explore the trade-off between maximizing a discriminative likelihood of the labeled data and a generative likelihood of the labeled and unlabeled data. Mixture models are a flexible model family whose uses include generative modeling and density estimation. We investigate semi-supervised learning of mixture models using a unified objective function that takes both labeled and unlabeled data into account. We conducted experiments on the WebKB and 20 Newsgroups data sets. The results show that adding unlabeled data improves classification accuracy over the supervised model.
Keywords :
Internet; data mining; learning (artificial intelligence); maximum likelihood estimation; pattern classification; Dirichlet mixture model; Web content; density estimation; discriminative likelihood maximization; generative model; hybrid classifier; semisupervised learning; unified objective function; Argon; Training; World Wide Web; mixture model; semi-supervised learning
Conference_Title :
Educational and Information Technology (ICEIT), 2010 International Conference on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-8033-3
Electronic_ISBN :
978-1-4244-8035-7
DOI :
10.1109/ICEIT.2010.5607590