DocumentCode
2828924
Title
Optimizing Communications in Processing Data Integration Queries
Author
Liu, Jia ; Wu, Yongwei ; Yang, Guangwen
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
fYear
2008
fDate
20-22 Aug. 2008
Firstpage
131
Lastpage
137
Abstract
Because data integration queries must access data from numerous widely distributed sources over a network, the expensive communication overhead must be addressed. This paper introduces a staged data integration model for grid environments. The model takes advantage of the abundant computer nodes to process integrated queries over many highly distributed, high-volume data sources. Its content-based scheduling algorithm groups queries over similar data sources together, which increases the opportunities for data sharing among concurrent queries on the same data source. Furthermore, a multiple-query optimization approach is proposed that exploits data sharing and avoids redundant data transfer without sacrificing the autonomy of the data sources. Experimental results show that the algorithms improve data integration performance in terms of both communication traffic and response time.
Keywords
data handling; grid computing; optimisation; query processing; content-based scheduling algorithm; data integration queries; data sharing; grid environment; query processing; Computer science; Concurrent computing; Databases; Grid computing; Information retrieval; Laboratories; Large-scale systems; Query processing; Resource management; Scheduling algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
ChinaGrid Annual Conference, 2008. ChinaGrid '08. The Third
Conference_Location
Dunhuang, Gansu
Print_ISBN
978-0-7695-3306-3
Type
conf
DOI
10.1109/ChinaGrid.2008.7
Filename
4624480