• DocumentCode
    480747
  • Title
    Integrating Structure in the Probabilistic Model for Information Retrieval
  • Author
    Gery, Mathias ; Largeron, Christine ; Thollard, Franck
  • Author_Institution
    Univ. de Lyon, Lyon
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    763
  • Lastpage
    769
  • Abstract
    In databases and on the World Wide Web, many documents are stored in a structured format (e.g., XML). In this article, we propose extending the classical probabilistic IR model to take document structure into account by weighting tags. Our approach includes a learning step in which a weight is computed for each tag. This weight estimates the probability that the tag distinguishes the most relevant terms. The model was evaluated on a large collection during the INEX IR evaluation campaigns.
  • Keywords
    information retrieval; learning (artificial intelligence); probability; World Wide Web; classical information retrieval probabilistic model; document retrieval; integration structure; learning step; weight estimation; Deductive databases; HTML; Indexing; Information retrieval; Intelligent agent; Intelligent structures; Internet; Markup languages; Web sites; XML; probabilistic model; structure; tags
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.346
  • Filename
    4740545