DocumentCode :
120142
Title :
Balancing scalability, performance and fault tolerance for structured data (BSPF)
Author :
Khalid, Amir ; Afzal, Hassan ; Aftab, Shoohira
Author_Institution :
Dept. of Comput. Software Eng., Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
fYear :
2014
fDate :
16-19 Feb. 2014
Firstpage :
725
Lastpage :
732
Abstract :
Analytical business applications generate reports that give trend-predicting insight into an organization's future by estimating financial graphs and risk factors. These applications work on huge amounts of data, comprising decades of market and company records and an organization's decision logs. Today the scale of big data is approaching zettabytes, and structured data makes up only 20% of it. Twenty percent of a gigabyte may be negligible in comparison to big data, but 20% of big data itself cannot be neglected. Traditional data management tools are poorly suited to running cross-table analytical queries on structured data in a distributed processing environment; their response times are high because of ill-aligned data sets and the complex hierarchy of the distributed computing environment. Data alignment requires a complete shift in the data deployment paradigm from a row-oriented storage layout to a column-oriented storage layout, and the complex hierarchy of the distributed computing environment can be handled by keeping metadata for the entire data set. The paper proposes an approach to ease the deployment of structured data into the distributed processing environment by arranging data into column-wise combinational entities. Response time to analytical queries can be lowered with the support of two concepts: shared architecture and multi-path query execution. Highly scalable systems are based on a shared-nothing architecture, but degraded performance and fault tolerance are side effects that come with high scalability. The proposed method is an effort to balance scalability, performance and fault tolerance. Due to the limited scope of this paper, we concentrate on issues and solutions for structured data only. Shared architecture and active backup help improve the system's performance by sharing the work-load-per-node. BSPF's clustering methodology sheds the data pressure points to minimize the data loss per node crash.
Keywords :
Big Data; cloud computing; data structures; fault tolerant computing; pattern clustering; BSPF clustering methodology; active backup; analytical business applications; column oriented storage layout; column-wise combinational entities; data alignment; data deployment paradigm; data loss minimization; data management tools; distributed computing environment; distributed processing environment; fault tolerance balancing; financial graph estimation; ill-aligned data sets; metadata; multipath query execution; performance balancing; risk factor estimation; row oriented storage layout; scalability balancing; shared architecture; shared nothing architecture; structured data deployment; system performance improvement; work-load-per-node sharing; Computer architecture; Computer crashes; Distributed databases; Indexes; Information management; Layout; Peer-to-peer computing; Distributed and Cloud Computing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Communication Technology (ICACT), 2014 16th International Conference on
Conference_Location :
Pyeongchang
Print_ISBN :
978-89-968650-2-5
Type :
conf
DOI :
10.1109/ICACT.2014.6779058
Filename :
6779058