• DocumentCode
    465979
  • Title

    Web Search with Text Categorization Using Probabilistic Framework of SVM

  • Author

    Lim, B.P.C. ; Tsui, M.H. ; Charastrakul, V. ; Shi, D.

  • Author_Institution
    Nanyang Technol. Univ., Singapore
  • Volume
    4
  • fYear
    2006
  • fDate
    8-11 Oct. 2006
  • Firstpage
    2950
  • Lastpage
    2955
  • Abstract
    Text categorization algorithms help manage the ever-increasing number of documents available online and offline. Their ability to organize large document collections into predefined categories significantly improves efficiency and reduces the human effort required. Recently, the support vector machine (SVM) has gained popularity due to its excellent generalization ability and fast training speed on large datasets. However, SVM performance relies heavily on the penalty coefficient and the kernel parameters. In this paper, we implement a probabilistic framework for the support vector machine (PSVM) that automatically tunes the penalty coefficient and the kernel parameters via the Markov chain Monte Carlo (MCMC) method, and we apply it to Web search via text categorization. The framework was tested on a well-known benchmark text categorization dataset. The results from PSVM were compared against those of the conventional SVM, K-nearest neighbor with P-tree (KNN-Ptree), and KNN. The proposed methodology is applied to develop a Web search engine.
  • Keywords
    Internet; Markov processes; Monte Carlo methods; classification; probability; support vector machines; text analysis; Markov Chain Monte Carlo method; Web search; kernel parameter; penalty coefficient parameter; probabilistic framework; support vector machine; text categorization; Cybernetics; Data compression; Humans; Kernel; Support vector machine classification; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    1-4244-0099-6
  • Electronic_ISBN
    1-4244-0100-3
  • Type

    conf

  • DOI
    10.1109/ICSMC.2006.384566
  • Filename
    4274330