• DocumentCode
    2828924
  • Title

    Optimizing Communications in Processing Data Integration Queries

  • Author

    Liu, Jia ; Wu, Yongwei ; Yang, Guangwen

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
  • fYear
    2008
  • fDate
    20-22 Aug. 2008
  • Firstpage
    131
  • Lastpage
    137
  • Abstract
    Because query processing for data integration must access data from numerous widely distributed sources over a network, it is crucial to investigate how to handle the expensive communication overhead. This paper introduces a staged data integration model for grid environments. The model takes advantage of abundant computer nodes to process integrated queries over a number of highly distributed, high-volume data sources. Its content-based scheduling algorithm groups queries over similar data sources together, which enhances the opportunities for data sharing among concurrent queries on the same data source. Furthermore, a multiple-query optimization approach is proposed to exploit data sharing and avoid redundant data transfer without sacrificing the autonomy of the data sources. Experimental results validate that our algorithms improve data integration performance in terms of both communication traffic and response time.
  • Keywords
    data handling; grid computing; optimisation; query processing; content-based scheduling algorithm; data integration queries; data sharing; grid environment; Computer science; Concurrent computing; Databases; Grid computing; Information retrieval; Laboratories; Large-scale systems; Query processing; Resource management; Scheduling algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ChinaGrid Annual Conference, 2008. ChinaGrid '08. The Third
  • Conference_Location
    Dunhuang, Gansu
  • Print_ISBN
    978-0-7695-3306-3
  • Type

    conf

  • DOI
    10.1109/ChinaGrid.2008.7
  • Filename
    4624480