• DocumentCode
    710135
  • Title
    The DBMS - your big data sommelier
  • Author
    Kargin, Yagiz ; Kersten, Martin ; Manegold, Stefan ; Pirk, Holger
  • Author_Institution
    Database Archit. Group, Centrum Wiskunde & Inf. (CWI), Amsterdam, Netherlands
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    1119
  • Lastpage
    1130
  • Abstract
    When addressing the problem of “big” data volume, preparation costs are one of the key challenges: the high costs of loading, aggregating, and indexing data lead to a long data-to-insight time. In addition to being a nuisance to the end user, this latency prevents real-time analytics on “big” data. Fortunately, data often comes in semantic chunks, such as files, whose data items share characteristics such as acquisition time or location. A data management system that exploits this trait can significantly lower the data preparation costs and the associated data-to-insight time by investing only in the preparation of the relevant chunks. In this paper, we develop such a system as an extension of an existing relational DBMS (MonetDB). To this end, we develop a query processing paradigm and a data storage model that are partial-loading aware. The result is a system that can make a 1.2 TB dataset (consisting of 4000 chunks) ready for querying in less than 3 minutes on a single server-class machine while maintaining good query processing performance.
  • Keywords
    Big Data; data models; data preparation; query processing; relational databases; storage management; Big Data analytics; Big Data sommelier; Big Data volume preparation costs; MonetDB; associated data-to-insight time; data aggregation; data indexing; data items; data loading; data management system; data storage model; partial-loading aware; query processing performance; relational DBMS; semantic chunks; single server-class machine; Loading; Optimization; Query processing; Relational databases; Semantics; Transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type
    conf
  • DOI
    10.1109/ICDE.2015.7113361
  • Filename
    7113361