• DocumentCode
    3256435
  • Title

    Fast Compaction Algorithms for NoSQL Databases

  • Author

    Ghosh, Mainak ; Gupta, Indranil ; Gupta, Shalmoli ; Kumar, Nirman

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois Urbana Champaign, Champaign, IL, USA
  • fYear
    2015
  • fDate
    June 29 2015-July 2 2015
  • Firstpage
    452
  • Lastpage
    461
  • Abstract
    Compaction plays a crucial role in NoSQL systems to ensure a high overall read throughput. In this work, we formally define compaction as an optimization problem that attempts to minimize disk I/O. We prove this problem to be NP-Hard. We then propose a set of algorithms and mathematically analyze upper bounds on worst-case cost. We evaluate the proposed algorithms on real-life workloads. Our results show that our algorithms incur low I/O costs and that a compaction approach using a balanced tree is most preferable.
  • Keywords
    computational complexity; database management systems; trees (mathematics); NP-hard; NoSQL databases; balanced tree; fast compaction algorithms; optimization problem; real-life workloads; worst-case cost; Approximation methods; Binary trees; Bismuth; Compaction; Cost function; Schedules; Vegetation; compaction; greedy approximation algorithm; nosql; np-hard;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Distributed Computing Systems (ICDCS), 2015 IEEE 35th International Conference on
  • Conference_Location
    Columbus, OH
  • ISSN
    1063-6927
  • Type

    conf

  • DOI
    10.1109/ICDCS.2015.53
  • Filename
    7164931