• DocumentCode
    3036208
  • Title

    An Evolutionary Model for Measuring Document Relevance in a Focused Web Spider

  • Author

    Lopez, Israel ; Alvarez-Carrillo, Pavel A. ; Fernandez-Gonzalez, Eduardo R.

  • Author_Institution
    Sch. of Inf., Autonomous Univ. of Sinaloa, Culiacan
  • fYear
    2008
  • fDate
    Sept. 30 2008-Oct. 3 2008
  • Firstpage
    177
  • Lastpage
    182
  • Abstract
    Exploring the Web in search of relevant information is a difficult task due to the vast number of documents it stores and to the heterogeneity of such documents. Automated systems such as search engines help users cope with the size of the Web. However, the results produced by these systems usually contain documents from a large variety of topics with little or no relevance to the end user. In this work, we propose a model that can be used by a Web spider to selectively explore the Web for relevant documents. In this model, two criteria are used for assessing document relevance: content and structure. These two criteria are integrated in a fuzzy predicate that indicates the degree of relevance of a document with respect to a user-defined topic. The parameters of the proposed model are generated by a genetic algorithm that solves a bi-criteria optimization problem.
  • Keywords
    Internet; fuzzy set theory; genetic algorithms; information retrieval; search engines; bi-criteria optimization problem; document relevance measurement; evolutionary model; focused Web spider; fuzzy predicate; genetic algorithm; user-defined topic; Automotive engineering; Databases; Evolutionary computation; Informatics; Mechanical variables measurement; Robots; Web pages; Evolutionary Algorithms; MCDA; Web Spider
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electronics, Robotics and Automotive Mechanics Conference, 2008. CERMA '08
  • Conference_Location
    Morelos
  • Print_ISBN
    978-0-7695-3320-9
  • Type

    conf

  • DOI
    10.1109/CERMA.2008.28
  • Filename
    4641067