• DocumentCode
    1350757
  • Title

    Approximate Aggregations in Structured P2P Networks

  • Author

    Sun, Dalie ; Wu, Sai ; Jiang, Shouxu ; Li, Jianzhong

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • Volume
    23
  • Issue
    11
  • fYear
    2011
  • Firstpage
    1748
  • Lastpage
    1752
  • Abstract
    In corporate networks, business data are generated daily in gigabytes or even terabytes, which makes aggregate queries costly to process. In this paper, we propose PACA, a probably approximately correct aggregate query processing scheme for answering aggregate queries in structured Peer-to-Peer (P2P) networks. PACA retrieves random samples from peers' databases and uses the samples to process queries. Instead of scanning the entire database of each peer, PACA accesses only a small number of randomly sampled data items. Moreover, based on the query distribution, PACA publishes a precomputed synopsis and uses it to answer future queries. Most queries are expected to be answered partially or fully by the precomputed synopsis, which is adaptively tuned to follow the query distribution. Experiments on PlanetLab show the effectiveness of the approach.
  • Keywords
    business data processing; distributed databases; peer-to-peer computing; query processing; question answering (information retrieval); P2P network; PACA; PlanetLab; aggregate query answering; business data; corporate networks; peer database; probably approximately correct aggregate query processing; query distribution; sample retrieval; structured peer-to-peer network; Aggregates; Estimation; Indexes; Peer to peer computing; Query processing; Servers; BATON; Peer-to-Peer; approximate query processing
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2010.198
  • Filename
    5601726
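
  • Illustrative sketch (not part of the original record)

    The abstract describes answering aggregate queries from random samples of each peer's data rather than from full scans. The sketch below shows the general idea for a SUM query. It is not the PACA algorithm: the names `hoeffding_sample_size` and `estimate_sum`, the in-memory list standing in for each peer's database, the assumption that values lie in a known bounded range, and the use of Hoeffding's inequality to pick a sample size are all assumptions made for illustration.

```python
import math
import random


def hoeffding_sample_size(epsilon, delta, value_range):
    """Number of i.i.d. samples so that the sample mean is within `epsilon`
    of the true mean with probability at least 1 - delta, assuming every
    value lies in an interval of width `value_range` (Hoeffding's inequality).
    """
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_sum(peers, sample_size, rng):
    """Estimate the global SUM over all peers from per-peer samples.

    `peers` is a list of lists of numbers; each inner list is a hypothetical
    stand-in for one peer's local database. Each non-empty peer is sampled
    uniformly with replacement, and its sample mean is scaled by its row count.
    """
    total = 0.0
    for rows in peers:
        if not rows:
            continue  # an empty peer contributes nothing to the sum
        sample = [rng.choice(rows) for _ in range(sample_size)]
        total += len(rows) * (sum(sample) / len(sample))
    return total


if __name__ == "__main__":
    rng = random.Random(7)
    # Two hypothetical peers holding values in [0, 1].
    peers = [[rng.random() for _ in range(1000)] for _ in range(2)]
    n = hoeffding_sample_size(epsilon=0.1, delta=0.05, value_range=1.0)
    print("samples per peer:", n)
    print("estimated SUM:", estimate_sum(peers, n, rng))
    print("exact SUM:", sum(sum(rows) for rows in peers))
```

    Under these assumptions, each peer's mean is within `epsilon` of its true mean with probability at least 1 - delta, so the absolute error of the estimated SUM is at most `epsilon` times the total row count whenever every per-peer bound holds. The synopsis publication and adaptive tuning mentioned in the abstract are not modeled here.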