Title :
Speeding ETL Processing in Data Warehouses Using High-Performance Joins for Changed Data Capture (CDC)
Author :
Tank, Darshan M. ; Ganatra, Amit ; Kosta, Y.P. ; Bhensdadia, C.K.
Author_Institution :
Dept. of IT, Charotar Univ. of Sci. & Technol., Anand, India
Abstract :
In today's fast-changing, competitive environment, a complaint frequently heard from data warehouse users is that access to time-critical data is too slow. Shrinking batch windows and exponentially growing data volumes are placing increasing demands on data warehouses to deliver instantly available information. Additionally, data warehouses must be able to generate accurate results consistently, but achieving both accuracy and speed with large, diverse sets of data can be challenging. Various operations can be used to optimize data manipulation and thus accelerate data warehouse processes. In this paper we introduce two such operations, join and aggregation, which play an integral role during preprocessing as well as in manipulating and consolidating data in a data warehouse. Our approach demonstrates how hours or even days can be saved when processing large amounts of data for ETL, data warehousing, business intelligence (BI) and other mission-critical applications.
Keywords :
competitive intelligence; data warehouses; ETL processing; business intelligence; changed data capture; data manipulation optimization; data warehouses; extract transform load; high-performance joins; Business; Data mining; Data warehouses; Distributed databases; Real time systems; Warehousing; Business Intelligence; Change Data Capture (CDC); Extract-Transform-Load (ETL); Near Real-Time Data Warehousing;
Conference_Title :
Advances in Recent Technologies in Communication and Computing (ARTCom), 2010 International Conference on
Conference_Location :
Kottayam
Print_ISBN :
978-1-4244-8093-7
Electronic_ISBN :
978-0-7695-4201-0
DOI :
10.1109/ARTCom.2010.63