• DocumentCode
    2034250
  • Title
    P-top-k queries in probabilistic framework from information extraction models
  • Author
    He, Ming ; Du, Yong-ping
  • Author_Institution
    Coll. of Comput. Sci., Beijing Univ. of Technol., Beijing, China
  • Volume
    5
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    2376
  • Lastpage
    2379
  • Abstract
    Many applications today, such as information extraction (IE), data integration, sensor RFID networks, and scientific experiments, need to manage uncertain data. Top-k queries are often natural and useful in analyzing uncertain data in those applications. In this paper, we study the problem of answering top-k queries in a probabilistic framework from a state-of-the-art statistical IE model, semi-Conditional Random Fields (CRFs), in the setting of probabilistic databases that treat statistical models as first-class data objects. We investigate the problem of ranking the answers to probabilistic database queries. We present an efficient algorithm for finding the best approximating parameters in such a framework to efficiently retrieve the top-k ranked results. An empirical study using real data sets demonstrates the effectiveness of probabilistic top-k queries and the efficiency of our method.
  • Keywords
    probability; query processing; conditional random fields; data integration; information extraction; p-top-k queries; probabilistic databases query; probabilistic framework; scientific experiments; sensor RFID networks; Computational modeling; Data mining; Data models; Databases; Probabilistic logic; Training; Uncertainty; conditional random fields; information extraction; probabilistic databases; uncertain data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type
    conf
  • DOI
    10.1109/FSKD.2010.5569526
  • Filename
    5569526