• DocumentCode
    1762474
  • Title

    Adaptive Database Schema Design for Multi-Tenant Data Management

  • Author

    Jiacai Ni ; Guoliang Li ; Lijun Wang ; Jianhua Feng ; Jun Zhang ; Lei Li

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • Volume
    26
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    2079
  • Lastpage
    2093
  • Abstract
    Multi-tenant data management is a major application of Software as a Service (SaaS). For example, many companies want to outsource their data to a third party that hosts a multi-tenant database system to provide data management services. The multi-tenant database system needs to have high performance, low space requirement, and excellent scalability. One big challenge is devising a high-quality database schema. Independent Tables Shared Instances (ITSI) and Shared Tables Shared Instances (STSI) are two state-of-the-art approaches to designing the schema. However, they suffer from some limitations. ITSI has poor scalability since it needs to maintain large numbers of tables. STSI achieves good scalability at the expense of poor performance and high space overhead. Thus, an effective schema design method that addresses these problems is needed. In this paper, we propose an adaptive database schema design method for multi-tenant applications. We trade off ITSI and STSI and find a balance between them to achieve good scalability and high performance with low space requirement. To this end, we identify the important attributes and use them to generate an appropriate number of base tables. For the remaining attributes, we construct supplementary tables. We discuss how to use the kernel matrix to determine the number of the base tables, apply graph-partitioning algorithms to construct the base tables, and evaluate the importance of attributes using the well-known PageRank algorithm. We propose a cost-based model to adaptively generate the base tables and supplementary tables. Our method has the following advantages. First, our method achieves high scalability. Second, our method achieves high performance and can trade off performance against space requirement. Third, our method can be easily applied to existing databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any schemas and query workloads including both OLAP and OLTP applications. Experimental results on both real and synthetic datasets show that our method achieves high performance and good scalability with low space requirement and outperforms state-of-the-art methods.
  • Keywords
    cloud computing; database management systems; matrix algebra; outsourcing; query processing; ITSI; MySQL; OLAP application; OLTP application; PageRank algorithm; STSI; SaaS; adaptive base table generation; adaptive database schema design; cost-based model; data outsourcing; graph-partitioning algorithms; high-performance requirement; high-quality database schema; independent tables shared instances; kernel matrix; low-space requirement; multitenant data management; multitenant database system; query workloads; real datasets; scalability issue; shared tables shared instances; software as a service; space overhead; supplementary tables; synthetic datasets; Design methodology; Indexes; Scalability; Servers; Software as a service; Database Applications; Database Management; Information Technology and Systems; SaaS; adaptive schema design; multi-tenant;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2013.94
  • Filename
    6529069