DocumentCode
465979
Title
Web Search with Text Categorization Using Probabilistic Framework of SVM
Author
Lim, B.P.C. ; Tsui, M.H. ; Charastrakul, V. ; Shi, D.
Author_Institution
Nanyang Technol. Univ., Singapore
Volume
4
fYear
2006
fDate
8-11 Oct. 2006
Firstpage
2950
Lastpage
2955
Abstract
Text categorization algorithms help manage the ever-increasing number of documents, both online and offline. Their ability to organize large numbers of documents into predefined categories significantly increases efficiency and reduces the human effort required. Recently, the support vector machine (SVM) has gained popularity due to its excellent generalization ability and fast training speed on large datasets. However, the performance of an SVM relies heavily on the penalty coefficient parameter and the kernel parameters. In this paper, we implement a probabilistic framework for the support vector machine (PSVM) that allows automatic tuning of the penalty coefficient parameter and the kernel parameters via a Markov chain Monte Carlo (MCMC) method, and we apply it to Web searching via text categorization. The framework was tested on a well-known benchmark text categorization dataset. The results from PSVM were compared against the conventional SVM, K-nearest neighbor with P-tree (KNN-Ptree), and KNN. The proposed methodology is applied to develop a Web search engine.
Keywords
Internet; Markov processes; Monte Carlo methods; classification; probability; support vector machines; text analysis; Markov Chain Monte Carlo method; Web search; kernel parameter; penalty coefficient parameter; probabilistic framework; support vector machine; text categorization; Cybernetics; Data compression; Humans; Kernel; Monte Carlo methods; Support vector machine classification; Support vector machines; Testing; Text categorization; Web search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
Conference_Location
Taipei
Print_ISBN
1-4244-0099-6
Electronic_ISBN
1-4244-0100-3
Type
conf
DOI
10.1109/ICSMC.2006.384566
Filename
4274330