DocumentCode :
1973978
Title :
SETL: A scalable and high performance ETL system
Author :
Sun, Kunjian ; Lan, Yuqing
Author_Institution :
Software Eng. Inst., Beihang Univ., Beijing, China
Volume :
1
fYear :
2012
fDate :
20-21 Oct. 2012
Firstpage :
6
Lastpage :
9
Abstract :
In order to extract, transform and load large-scale data from heterogeneous data sources into a data warehouse efficiently, the SETL system is designed and implemented in this paper. By using Perl subroutine attributes and data partitioning, SETL makes ETL jobs easy to implement and efficient to run. Its plug-in design gives SETL high scalability, and its design of performing one ETL job in one ETL pipeline gives SETL support for distributed environments. For illustration, an example ETL job is used to show the scalability in designing ETL jobs and the high efficiency in processing large-scale data. Experiments show that SETL can extract, transform, and load large-scale data into a data warehouse efficiently. The SETL system simplifies ETL job design and implementation and can deal with heterogeneous data sources flexibly. It is a lightweight, scalable and high-performance ETL system.
Keywords :
data warehouses; ETL job design; ETL pipeline; PERL subroutine attribute; SETL system; data partition; data warehouse; extract-transform-load process; heterogeneous data sources; high-performance ETL system; large scale data processing; plug-in design; Algorithms; Data mining; Data warehouses; Databases; Loading; Pipelines; Scalability; Data Loading; Data Transformation; Database; ETL;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Science, Engineering Design and Manufacturing Informatization (ICSSEM), 2012 3rd International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4673-0914-1
Type :
conf
DOI :
10.1109/ICSSEM.2012.6340727
Filename :
6340727