• DocumentCode
    1153385
  • Title

    State-space optimization of ETL workflows

  • Author

    Simitsis, Alkis ; Vassiliadis, Panos ; Sellis, Timos

  • Author_Institution
    Sch. of Electr. & Comput. Eng., Athens Nat. Tech. Univ., Greece
  • Volume
    17
  • Issue
    10
  • fYear
    2005
  • Firstpage
    1404
  • Lastpage
    1419
  • Abstract
    Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a data warehouse. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide an exhaustive algorithm and two heuristic algorithms toward the minimization of the execution cost of an ETL workflow. The heuristic algorithm with greedy characteristics significantly outperforms the other two algorithms for a large set of experimental cases.
  • Keywords
    data mining; distributed databases; query processing; workflow management software; ETL workflows; data repository; data warehouse; database integration; database management; extraction-transformation-loading tools; heterogeneous database; state-space optimization; Costs; Data mining; Data warehouses; Databases; Design optimization; Heuristic algorithms; Minimization methods; Search problems; Software tools; State-space methods; Index Terms: Database management; data warehouse and repository; database integration; heterogeneous databases; workflow management
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2005.169
  • Filename
    1501823