Title : 
Automating data integration with HiperFuse
         
        
            Author : 
Huang, Edward ; Quiroz, Andres ; Ceriani, Luca
         
        
            Author_Institution : 
Palo Alto Res. Center, Interaction & Analytics Lab., Palo Alto, CA, USA
         
        
            Abstract : 
Integrating heterogeneous datasets has been a significant barrier to many analytics tasks: raw datasets vary widely in structure and cleanliness, so each integration typically requires one-off ETL code. We propose HiperFuse, which automates much of the data integration process by providing a declarative interface, robust type inference, extensible domain-specific data models, and a data integration planner that optimizes for plan completion time.
         
        
            Keywords : 
data integration; data models; HiperFuse; data integration planner; declarative interface; extensible domain-specific data models; heterogeneous dataset integration; one-off ETL code; robust type inference; Benchmark testing; Computational modeling; Data integration; Data models; IP networks; Libraries; Planning; DSL; ETL; automation; data fusion; dataflow optimization; declarative; planning; scheduling
         
        
        
            Conference_Title : 
2014 IEEE International Conference on Big Data (Big Data)
         
        
            Conference_Location : 
Washington, DC
         
        
            DOI : 
10.1109/BigData.2014.7004316