• DocumentCode
    3323859
  • Title

    An Enhanced Extract-Transform-Load System for Migrating Data in Telecom Billing

  • Author

    Agrawal, Himanshu ; Chafle, Girish ; Goyal, Sunil ; Mittal, Sumit ; Mukherjea, Sougata

  • Author_Institution
    Res. Lab., IBM India, New Delhi
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    1277
  • Lastpage
    1286
  • Abstract
    Data migration has become a priority in many industries, spawned by a variety of business needs. Most of the existing tools for the Extract, Transform and Load (ETL) process of data migration are piecemeal and do not present a complete solution. Moreover, while research has focused on the problem of Schema Mapping, a key step in the ETL process, most of the current algorithms do not perform well on real-world data. Researchers have suggested the use of Domain Knowledge to enhance schema mapping. In this paper, we use domain knowledge in an innovative manner to improve schema mapping in an 'actual' industrial setting. Further, we take a comprehensive view of the data migration problem and present an end-to-end system for the ETL process, utilizing existing tools for each step and building connectors wherever required. We focus on Data Migration for Telecom Billing and utilize domain knowledge captured in an ontology, a thesaurus and a set of rules to improve schema mapping. Experiments conducted on real-life data demonstrate the effectiveness of our system and validate the utility of domain knowledge in data migration projects.
  • Keywords
    business data processing; data mining; ontologies (artificial intelligence); data migration; domain knowledge; extract-transform-load system; schema mapping; telecom billing; Communication industry; Computers; Connectors; Content management; Data mining; Laboratories; Ontologies; Real time systems; Telecommunication services; Thesauri;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-1836-7
  • Electronic_ISBN
    978-1-4244-1837-4
  • Type

    conf

  • DOI
    10.1109/ICDE.2008.4497537
  • Filename
    4497537