Title :
Leveraging non-uniform resources for parallel query processing
Author :
Mayr, Tobias ; Bonnet, Philippe ; Gehrke, Johannes ; Seshadri, Praveen
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
Abstract :
Modular clusters are now composed of non-uniform nodes with different CPUs, disks, or network cards, so that customers can adapt the cluster configuration to changing technologies and to their changing needs. This challenges dataflow parallelism as the primary load balancing technique of existing parallel database systems. We show in this paper that dataflow parallelism alone is ill-suited for modular clusters because running the same operation on different subsets of the data cannot fully utilize non-uniform hardware resources. We propose and evaluate new load balancing techniques that blend pipeline parallelism with data parallelism. We consider relational operators as pipelines of fine-grained operations that can be located on different cluster nodes and executed in parallel on different data subsets to best exploit non-uniform resources. We present an experimental study that confirms the feasibility and effectiveness of the new techniques in a parallel execution engine prototype based on the open-source DBMS Predator.
Keywords :
data flow computing; parallel databases; pipeline processing; query processing; resource allocation; data parallelism; dataflow parallelism; load balancing technique; modular clusters; nonuniform hardware resources; open-source DBMS Predator; parallel database systems; parallel execution engine prototype; parallel query processing; pipeline parallelism; Assembly; Database systems; Engines; Hardware; Load management; Open source software; Parallel processing; Pipelines; Prototypes; Query processing
Conference_Title :
Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1919-9
DOI :
10.1109/CCGRID.2003.1199360