DocumentCode :
2828924
Title :
Optimizing Communications in Processing Data Integration Queries
Author :
Liu, Jia ; Wu, Yongwei ; Yang, Guangwen
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
fYear :
2008
fDate :
20-22 Aug. 2008
Firstpage :
131
Lastpage :
137
Abstract :
Because query processing for data integration must access data from numerous widely distributed sources over a network, it is crucial to investigate how to handle the expensive communication overhead. This paper introduces a staged data integration model for grid environments. The model takes advantage of the abundant computer nodes to process integrated queries over a number of highly distributed, high-volume data sources. Its content-based scheduling algorithm groups queries over similar data sources together, which enhances the opportunities for data sharing among concurrent queries on the same data source. Furthermore, a multiple-query optimization approach is proposed to exploit data sharing and avoid redundant data transfer without sacrificing the autonomy of the data sources. Experimental results validate that our algorithms improve data integration performance in terms of both communication traffic and response time.
Keywords :
data handling; grid computing; optimisation; query processing; content-based scheduling algorithm; data integration queries; data sharing; grid environment; query processing; Computer science; Concurrent computing; Databases; Grid computing; Information retrieval; Laboratories; Large-scale systems; Query processing; Resource management; Scheduling algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
ChinaGrid Annual Conference, 2008. ChinaGrid '08. The Third
Conference_Location :
Dunhuang, Gansu
Print_ISBN :
978-0-7695-3306-3
Type :
conf
DOI :
10.1109/ChinaGrid.2008.7
Filename :
4624480