Title :
Parallelizing XQuery In a Cluster Environment
Author :
Li, Xiaogang ; Agrawal, Gagan
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
Abstract :
In this paper, we report on a parallel implementation of XQuery. As XQuery is being used to process large datasets and to run compute-intensive applications, the efficiency of XQuery implementations is becoming an important issue. Our work focuses specifically on scientific data processing and data mining applications. Parallelizing this class of XQuery queries involves a number of challenges, including data distribution, parallelization of generalized reductions, and translation to an imperative language such as C/C++ so that efficient parallel communication libraries can be invoked. We present our solutions to these problems. By implementing the techniques in a compiler and generating code based on a C++ SAX parser and the Message Passing Interface (MPI), we achieve efficient parallel execution on a cluster of machines.
Keywords :
application program interfaces; data mining; message passing; parallel processing; program compilers; query processing; very large databases; C++ SAX parser; MPI; XQuery parallelization; cluster environment; code generation; data distribution; data mining applications; imperative language; large datasets; message passing interface; parallel communication libraries; program compiler; scientific data processing; Application software; Computer applications; Computer architecture; Computer science; Data mining; Data processing; Libraries; Message passing; Query processing; XML
Conference_Title :
Database Engineering and Applications Symposium, 2006. IDEAS '06. 10th International
Conference_Location :
Delhi
Print_ISBN :
0-7695-2577-6
DOI :
10.1109/IDEAS.2006.35