• DocumentCode
    3745238
  • Title

    Change data capture in NoSQL databases: A functional and performance comparison

  • Author

    Felipe Mathias Schmidt;Claudio Geyer;Alberto Schaeffer-Filho;Stefan Deßloch;Yong Hu

  • Author_Institution
    Federal University of Rio Grande do Sul (UFRGS) Institute of Informatics, Porto Alegre, Brazil
  • fYear
    2015
  • fDate
    7/1/2015 12:00:00 AM
  • Firstpage
    562
  • Lastpage
    567
  • Abstract
    Requirements for data storage and processing have reached new levels, with applications relying on the analysis of large amounts of data to support everyday services for end users. Since the costs of maintaining and managing databases are significant, change data capture (CDC) techniques can be used to determine which parts of a data source have changed, and thus assist in the management of large volumes of data in data warehouses. In this paper we investigate a number of CDC techniques suitable for NoSQL databases. These techniques track modifications in a source database so that they can later be made available to a target database. Our base system and testbed are built on Apache Cassandra, a NoSQL database that offers high performance and scalability. Cassandra is combined with a MapReduce framework, which is used to implement the logic of each CDC technique and is suitable for highly distributed and parallel computing. This paper also presents a functional comparison of the different CDC techniques and a performance evaluation in a real testbed.
  • Keywords
    "Programming","Data mining","Relational databases","Computers","Metadata"
  • Publisher
    IEEE
  • Conference_Titel
    Computers and Communication (ISCC), 2015 IEEE Symposium on
  • Type

    conf

  • DOI
    10.1109/ISCC.2015.7405574
  • Filename
    7405574