• DocumentCode
    3230701
  • Title
    Rapid Synthesis of Domain-Specific Web Search Engines Based on Semi-Automatic Training-Example Generation
  • Author
    Nabeshima, Hidetomo ; Miyagawa, Reiko ; Suzuki, Yuki ; Iwanuma, Koji
  • Author_Institution
    Yamanashi Univ., Kofu
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    769
  • Lastpage
    772
  • Abstract
    In this paper, we propose two semi-automatic training-example generation algorithms for rapidly synthesizing a domain-specific Web search engine. As a basic framework, we use the keyword spice model, an excellent approach for building a domain-specific search engine with high precision and high recall. The keyword spice model, however, requires a huge number of training examples that must be classified by hand. To overcome this problem, we propose two refinement algorithms based on semi-automatic training-example generation: (i) the sample decision tree based approach, and (ii) the similarity based approach. These approaches make it possible to build a highly accurate domain-specific search engine with little time and effort. The experimental results show that our approaches are very effective and practical for personalizing a general-purpose search engine.
  • Keywords
    decision trees; information retrieval; learning (artificial intelligence); search engines; domain-specific Web search engine personalization; keyword spice model; refinement algorithm; sample decision tree learning algorithm; semiautomatic training-example generation algorithm; similarity based approach; Classification tree analysis; Decision trees; Impedance matching; Information retrieval; Internet; Search engines; Web pages; Web search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2747-7
  • Type
    conf
  • DOI
    10.1109/WI.2006.143
  • Filename
    4061470