Title :
Fast accurate summary warehouses with distributed summaries
Author :
Furtado, Pedro ; Costa, Joao Pedro
Abstract :
Large data warehouses (DW) pose a major performance and scalability challenge, as users expect instant answers to their queries. Traditional solutions, which rely on very expensive architectures and structures, cannot answer every complex aggregation query within minutes or seconds. The summary warehouse (SW) achieves such a speedup using only general-purpose sampling summaries that are well suited to aggregated exploration analysis. The major limitation of SWs results from the tradeoff between accuracy and speed: smaller, faster summaries cannot answer less-aggregated queries. We propose a simple and inexpensive strategy that meets these conflicting requirements and delivers unprecedented speedup by taking advantage of the ubiquity of distributed computation. The distributed summaries (DS) approach proposed in this paper manages a set of summaries distributed across the available computing nodes of a local area network to achieve very fast query processing while guaranteeing sufficient accuracy.
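The abstract does not give the details of the DS approach, so the following is only a minimal sketch of the general idea, under stated assumptions: a single uniform random sample of the fact rows serves as the summary, it is split into disjoint pieces held by different nodes, each node aggregates its own piece, and a coordinator merges the partial results. All function names, the round-robin split, and the uniform sampling scheme are hypothetical illustrations, not the authors' method.

```python
import random


def build_summary(rows, fraction, seed):
    """Draw a uniform random sample (the 'summary') of the fact rows.

    Assumption: each row is kept independently with probability `fraction`.
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


def split_across_nodes(summary, n_nodes):
    """Split one summary into disjoint pieces, one per node (round-robin)."""
    return [summary[i::n_nodes] for i in range(n_nodes)]


def node_partial(piece, group_key, measure):
    """Run on one node: per-group partial (sum, count) over its local piece."""
    partial = {}
    for row in piece:
        total, count = partial.get(row[group_key], (0.0, 0))
        partial[row[group_key]] = (total + row[measure], count + 1)
    return partial


def merge_partials(partials, fraction):
    """Coordinator: merge node partials and scale sums by 1 / fraction."""
    merged = {}
    for partial in partials:
        for key, (total, count) in partial.items():
            m_total, m_count = merged.get(key, (0.0, 0))
            merged[key] = (m_total + total, m_count + count)
    return {
        key: {
            "est_sum": total / fraction,  # estimated SUM over the full data
            "est_avg": total / count,     # estimated AVG (sample mean)
            "sample_rows": count,         # few rows here means low accuracy
        }
        for key, (total, count) in merged.items()
    }
```

In this sketch each node scans only its own small piece, while the merged result rests on all sampled rows. The `sample_rows` count hints at the tradeoff named in the abstract: a less-aggregated query spreads the same sample over more groups, leaving fewer rows per group to support each estimate.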
Keywords :
computational complexity; data mining; data warehouses; database management systems; local area networks; query processing; ubiquitous computing; DW; OLAP query; SW; aggregated exploration analysis; complex aggregation query; computing nodes; data warehouse; distributed computation ubiquity; distributed summary; local area network; online analytical processing; query management; query processing accuracy; sampling summary; scalability; summary warehouse; user request; Computer architecture; Computer networks; Concurrent computing; Data warehouses; Distributed computing; Hardware; Investments; Query processing; Sampling methods; Scalability;
Conference_Title :
Database Engineering and Applications Symposium, 2003. Proceedings. Seventh International
Print_ISBN :
0-7695-1981-4
DOI :
10.1109/IDEAS.2003.1214958