DocumentCode :
1642241
Title :
Text classification with enhanced semi-supervised fuzzy clustering
Author :
Keswani, Girish ; Hall, Lawrence O.
Author_Institution :
Dept. of Comput. Sci., Univ. of South Florida, Tampa, FL, USA
Volume :
1
fYear :
2002
fDate :
2002
Firstpage :
621
Lastpage :
626
Abstract :
Given the increasing volume of information available on the Web, it is important to organize online documents meaningfully. Hence, the design of efficient and accurate text classification systems is of interest. In this paper, we explore a framework in which we improve the performance of a base classifier by clustering unlabeled data together with labeled data using probabilistic and fuzzy approaches. We use expectation maximization and semi-supervised fuzzy c-means to cluster the unlabeled data with the labeled data. The base classifier is a naive Bayes classifier, which uses both the original labeled data and the additional data labeled through clustering. Utilizing unlabeled data from semi-supervised fuzzy clustering results in an improved classifier.
Keywords :
Bayes methods; Internet; classification; fuzzy set theory; information resources; pattern clustering; probability; text analysis; World Wide Web; enhanced semi-supervised fuzzy clustering; expectation maximization; fuzzy approach; labeled data; naive Bayes classifier; online document organization; probabilistic approach; semi-supervised fuzzy c-means; text classification; unlabeled data clustering; Classification algorithms; Clustering algorithms; Computer science; Databases; Electronic mail; Feeds; Machine learning algorithms; Semisupervised learning; Text categorization; Web sites;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7803-7280-8
Type :
conf
DOI :
10.1109/FUZZ.2002.1005064
Filename :
1005064