Adaptive Database Schema Design for Multi-Tenant Data Management

Author

Jiacai Ni ; Guoliang Li ; Lijun Wang ; Jianhua Feng ; Jun Zhang ; Lei Li

Author_Institution

Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China

Volume

Issue

fYear

2014

fDate

Sept. 2014

Firstpage

2079

Lastpage

2093

Abstract

Multi-tenant data management is a major application of Software as a Service (SaaS). For example, many companies want to outsource their data to a third party that hosts a multi-tenant database system to provide data management services. The multi-tenant database system needs to have high performance, low space requirement, and excellent scalability. One big challenge is devising a high-quality database schema. Independent Tables Shared Instances (ITSI) and Shared Tables Shared instances (STSI) are two state-of-the-art approaches to designing the schema. However, they suffer from some limitations. ITSI has poor scalability since it needs to maintain large numbers of tables. STSI achieves good scalability at the expense of poor performance and high space overhead. Thus, an effective schema design method that addresses these problems is needed. In this paper, we propose an adaptive database schema design method for multi-tenant applications. We trade-off ITSI and STSI and find a balance between them to achieve good scalability and high performance with low space requirement. To this end, we identify the important attributes and use them to generate an appropriate number of base tables. For the remaining attributes, we construct supplementary tables. We discuss how to use the kernel matrix to determine the number of the base tables, apply graph-partitioning algorithms to construct the base tables, and evaluate the importance of attributes using the well-known PageRank algorithm. We propose a cost-based model to adaptively generate the base tables and supplementary tables. Our method has the following advantages. First, our method achieves high scalability. Second, our method achieves high performance and can trade-off the performance and space requirement. Third, our method can be easily applied to existing databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any schemas and query workloads including both OLAP and OLTP applications. Experiment- l results on both real and synthetic datasets show that our method achieves high performance and good scalability with low space requirement and outperforms state-of-the-art methods.

Keywords

cloud computing; database management systems; matrix algebra; outsourcing; query processing; ITSI; MySQL; OLAP application; OLTP application; PageRank algorithm; STSI; SaaS; adaptive base table generation; adaptive database schema design; cost-based model; data outsourcing; graph-partitioning algorithms; high-performance requirement; high-quality database schema; independent tables shared instances; kernel matrix; low-space requirement; multitenant data management; multitenant database system; query workloads; real datasets; scalability issue; shared tables shared instances; software as a service; space overhead; supplementary tables; synthetic datasets; Design methodology; Indexes; Scalability; Servers; Software as a service; Database Applications; Database Management; Information Technology and Systems; SaaS; adaptive schema design; multi-tenant;

fLanguage

English

Journal_Title

Knowledge and Data Engineering, IEEE Transactions on

Publisher

ieee

ISSN

1041-4347

Type

jour

DOI

10.1109/TKDE.2013.94

Filename

6529069

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=1762474