• DocumentCode
    1791638
  • Title
    Automating data integration with HiperFuse
  • Author
    Huang, Edward ; Quiroz, Andres ; Ceriani, Luca
  • Author_Institution
    Palo Alto Research Center, Interaction & Analytics Lab., Palo Alto, CA, USA
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    863
  • Lastpage
    867
  • Abstract
    Integrating heterogeneous datasets has been a significant barrier to many analytics tasks: raw datasets vary in structure and cleanliness, so each typically requires one-off ETL code. We propose HiperFuse, which automates much of the data integration process by providing a declarative interface, robust type inference, extensible domain-specific data models, and a data integration planner that optimizes for plan completion time.
  • Keywords
    data integration; data models; HiperFuse; data integration planner; declarative interface; extensible domain-specific data models; heterogeneous dataset integration; one-off ETL code; robust type inference; Benchmark testing; Computational modeling; Data integration; Data models; IP networks; Libraries; Planning; DSL; ETL; automation; data fusion; dataflow optimization; declarative; planning; scheduling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2014 IEEE International Conference on Big Data (Big Data)
  • Conference_Location
    Washington, DC
  • Type
    conf
  • DOI
    10.1109/BigData.2014.7004316
  • Filename
    7004316