  • DocumentCode
    140860
  • Title
    Automatic entity-grouping for OLTP workloads
  • Author
    Liu, Bin ; Tatemura, J. ; Po, Oliver ; Hsiung, Wang-Pin ; Hacigumus, H.
  • Author_Institution
    NEC Labs. America, Cupertino, CA, USA
  • fYear
    2014
  • fDate
    March 31 2014-April 4 2014
  • Firstpage
    712
  • Lastpage
    723
  • Abstract
    Supporting an online transaction processing (OLTP) workload in a scalable and elastic fashion is a challenging task. Recently, a new breed of scalable systems has shown significant throughput gains by limiting consistency to small units of data called “entity-groups” (e.g., a user's account information stored together with all her emails in an online email service). Transactions that access data from only one entity-group are guaranteed full ACID properties, but those that access multiple entity-groups are not. How entity-groups are defined has a direct impact on workload consistency and performance, and defining them for data with a complex schema is very challenging. The design is prone to extremes: groups that are too fine-grained cause an excessive number of expensive distributed transactions, while groups that are too coarse lead to excessive serialization and performance degradation. It is also difficult to balance conflicting requirements from different transactions. In commercially available entity-group systems, creating entity-groups is usually a manual process, which severely limits the usability of those systems. This paper is the first systematic effort to automate the entity-group design process. Our goal is to build a user-friendly design tool that automatically creates entity-groups based on a given workload and helps users trade consistency for performance in a principled manner. Advanced users can provide feedback on the entity-group design and iteratively improve the final output. We demonstrate the effectiveness of our approach with widely used benchmarks. We also present the user experience of a prototype we built.
  • Keywords
    data mining; iterative methods; transaction processing; OLTP workloads; automatic entity-grouping; expensive distributed transactions; iterative method; multiple entity-groups; online transaction processing workload; user-friendly design tool; Benchmark testing; Distributed databases; Electronic mail; Measurement; Prototypes; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2014 IEEE 30th International Conference on
  • Conference_Location
    Chicago, IL
  • Type
    conf
  • DOI
    10.1109/ICDE.2014.6816694
  • Filename
    6816694