DocumentCode
3697484
Title
Log-based change data capture from schema-free document stores using MapReduce
Author
Kun Ma;Bo Yang
Author_Institution
Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
fYear
2015
fDate
6/1/2015 12:00:00 AM
Firstpage
1
Lastpage
6
Abstract
Change data capture (CDC) is an approach to data integration that identifies and tracks changed data so that action can be taken on those changes. However, the state of the art of CDC for document-oriented NoSQL databases is not mature, and a NoSQL CDC solution is urgently needed. Although some NoSQL database vendors have started to research CDC for NoSQL, their approaches are each tied to a specific product. In this paper, we propose a log-based CDC approach for abstract schema-free document stores using MapReduce. The process is divided into map and reduce procedures, taking advantage of the MapReduce framework, to generate cell state models (CSMs). To make it possible to look back to any revision, with no limit on history, the proposed CSM supports the copy-modify-merge model for managing revisions of change data. Finally, experimental results show that this approach is independent of any specific product and appropriate for document stores, with high performance and throughput capacity.
Keywords
"Databases","Data models","Data mining","Computer architecture","Context","Microprocessors","Mathematical model"
Publisher
IEEE
Conference_Titel
Cloud Technologies and Applications (CloudTech), 2015 International Conference on
Type
conf
DOI
10.1109/CloudTech.2015.7336969
Filename
7336969