• DocumentCode
    3697484
  • Title
    Log-based change data capture from schema-free document stores using MapReduce
  • Author
    Kun Ma;Bo Yang
  • Author_Institution
    Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Change data capture (CDC) is a data integration approach that identifies and tracks data that has changed so that action can be taken on the change data. However, the state of the art of CDC for document-oriented NoSQL databases is not mature, so a NoSQL CDC solution is urgently needed. Although some NoSQL database vendors have begun to research CDC for NoSQL, their approaches apply only to their own products. In this paper, we propose a log-based CDC approach for abstract schema-free document stores using MapReduce. The process is divided into map and reduce procedures, which build on the MapReduce framework to generate cell state models (CSMs). To allow looking back to any revision without limit, the proposed CSM supports the copy-modify-merge model for managing revisions of change data. Finally, experimental results show that this approach is independent of any specific product and appropriate for document stores, with high performance and throughput capacity.
  • Keywords
    "Databases","Data models","Data mining","Computer architecture","Context","Microprocessors","Mathematical model"
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Technologies and Applications (CloudTech), 2015 International Conference on
  • Type
    conf
  • DOI
    10.1109/CloudTech.2015.7336969
  • Filename
    7336969