• DocumentCode
    1023613
  • Title
    Truth Discovery with Multiple Conflicting Information Providers on the Web
  • Author
    Yin, Xiaoxin ; Han, Jiawei ; Yu, Philip S.
  • Author_Institution
    Microsoft Res., Microsoft Corp., Redmond, WA
  • Volume
    20
  • Issue
    6
  • fYear
    2008
  • fDate
    6/1/2008
  • Firstpage
    796
  • Lastpage
    808
  • Abstract
    The World Wide Web has become the most important information source for most of us. Unfortunately, there is no guarantee for the correctness of information on the Web. Moreover, different websites often provide conflicting information on a subject, such as different specifications for the same product. In this paper, we propose a new problem, called Veracity, i.e., conformity to truth, which studies how to find true facts from a large amount of conflicting information on many subjects that is provided by various websites. We design a general framework for the Veracity problem and invent an algorithm, called TRUTHFINDER, which utilizes the relationships between websites and their information, i.e., a website is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy websites. An iterative method is used to infer the trustworthiness of websites and the correctness of information from each other. Our experiments show that TRUTHFINDER successfully finds true facts among conflicting information and identifies trustworthy websites better than the popular search engines.
  • Keywords
    Web sites; search engines; TRUTHFINDER; Veracity; Websites; World Wide Web; information source; multiple conflicting information providers; truth discovery; Data mining; Web mining
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2007.190745
  • Filename
    4415269
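
The abstract describes a mutual-reinforcement idea: a website is trustworthy if it provides many true facts, and a fact is likely true if many trustworthy websites provide it. The sketch below illustrates that iteration in a simplified form. It is not the paper's TRUTHFINDER implementation; the update formulas (site trust as the mean confidence of its facts, fact confidence as one minus the product of the providers' distrust), the function name `iterate_trust`, the initial trust of 0.8, and the example site names are all assumptions made for illustration.

```python
from math import prod


def iterate_trust(claims, initial_trust=0.8, max_iters=50, tol=1e-6):
    """Alternate between fact confidence and site trust until trust stabilizes.

    claims: dict mapping a website name to the set of facts it provides.
    Returns (trust, confidence) dictionaries.
    The update rules are illustrative assumptions, not the published method.
    """
    trust = {site: initial_trust for site in claims}

    # Invert the mapping: for each fact, which sites provide it.
    providers = {}
    for site, facts in claims.items():
        for fact in facts:
            providers.setdefault(fact, set()).add(site)

    confidence = {}
    for _ in range(max_iters):
        # A fact is wrong only if every site providing it is wrong
        # (assumes sites err independently).
        confidence = {
            fact: 1.0 - prod(1.0 - trust[site] for site in sites)
            for fact, sites in providers.items()
        }
        # A site's trust is the mean confidence of the facts it provides;
        # a site with no facts keeps its current trust.
        new_trust = {
            site: sum(confidence[f] for f in facts) / len(facts) if facts else trust[site]
            for site, facts in claims.items()
        }
        change = max((abs(new_trust[s] - trust[s]) for s in claims), default=0.0)
        trust = new_trust
        if change < tol:
            break
    return trust, confidence


# Hypothetical example: two sites agree on one value, a third disagrees.
example_claims = {
    "a.example": {"author=X"},
    "b.example": {"author=X"},
    "c.example": {"author=Y"},
}
trust, confidence = iterate_trust(example_claims)
print(trust)
print(confidence)
```

In this toy input the fact backed by two sites ends up with higher confidence than the fact backed by one, and the two agreeing sites end up with higher trust than the dissenting one. This simple rule does not model relationships between conflicting facts or correlated sources, so it should be read only as a picture of the iteration the abstract describes.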