DocumentCode
3745238
Title
Change data capture in NoSQL databases: A functional and performance comparison
Author
Felipe Mathias Schmidt;Claudio Geyer;Alberto Schaeffer-Filho;Stefan Deßloch;Yong Hu
Author_Institution
Federal University of Rio Grande do Sul (UFRGS) Institute of Informatics, Porto Alegre, Brazil
fYear
2015
fDate
7/1/2015
Firstpage
562
Lastpage
567
Abstract
Requirements for data storage and processing have reached new levels, with applications relying on the analysis of large amounts of data to support everyday services for end users. Because maintaining and managing databases is costly, change data capture (CDC) techniques can be used to determine which parts of a data source have changed, and thus assist in managing large volumes of data in data warehouses. CDC techniques track modifications in a source database so that they can later be made available to a target database. In this paper we investigate a number of CDC techniques suitable for NoSQL databases. Our base system and testbed use Apache Cassandra, a NoSQL database that offers high performance and scalability. Cassandra is combined with a MapReduce framework, which is suited to highly distributed and parallel computing and is used to implement the logic of each CDC technique. The paper presents both a functional comparison of the different CDC techniques and a performance evaluation on a real testbed.
Keywords
"Programming","Data mining","Relational databases","Computers","Metadata"
Publisher
IEEE
Conference_Titel
Computers and Communication (ISCC), 2015 IEEE Symposium on
Type
conf
DOI
10.1109/ISCC.2015.7405574
Filename
7405574