DocumentCode :
637054
Title :
Data migration ecosystem for big data [Invited paper]
Author :
Koong Wah Yan ; Perumal, Nagendran M. ; Dillon, Tharam S.
Author_Institution :
Software Dev. Lab., MIMOS Berhad, Kuala Lumpur, Malaysia
fYear :
2013
fDate :
24-26 July 2013
Firstpage :
189
Lastpage :
194
Abstract :
Data migration is the process of moving data from one or more systems to a new environment. Often, it is a sub-activity of a business application deployment. Big data is defined as data that is huge, has heterogeneous data dictionaries and involves complex manipulation. Because migrating big data is a complex and resource-hungry process, a proven methodology and ecosystem are needed to govern it. The Data Migration Ecosystem for Big Data is a productive set of interacting processes, practices and environments used to collect data from one location, storage medium, or hardware/software system, and to cleanse, transform and transfer it to another. The processes and practices are governed by rules and disciplines, with the goal of ensuring that information is complete, accurate and consistent. This paper is based on our experience in migrating data for a Malaysian government agency, which involved approximately 1 billion rows of data from 31 heterogeneous sources and systems. Some of the migrated data was created in the 1970s, and the business logic behind it has since been enhanced or changed. The challenge is further complicated because some of the available data comes from proprietary databases that are not RDBMS-compliant, and some is maintained manually in Microsoft Excel spreadsheets.
Keywords :
electronic data interchange; government data processing; Malaysia government agency; Microsoft Excel spreadsheets; big data migration ecosystem; business application deployment subactivity; business logic; complex manipulation; hardware-software system; heterogeneous data dictionaries; moving data process; storage medium; Data handling; Data mining; Data storage systems; Databases; Information management; Loading;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Ecosystems and Technologies (DEST), 2013 7th IEEE International Conference on
Conference_Location :
Menlo Park, CA
ISSN :
2150-4938
Print_ISBN :
978-1-4799-0784-7
Type :
conf
DOI :
10.1109/DEST.2013.6611352
Filename :
6611352