DocumentCode
3077611
Title
Confuga: Scalable Data Intensive Computing for POSIX Workflows
Author
Donnelly, Patrick ; Hazekamp, Nicholas ; Thain, Douglas
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
fYear
2015
fDate
4-7 May 2015
Firstpage
392
Lastpage
401
Abstract
Today's big-data analysis systems achieve performance and scalability by requiring end users to embrace a novel programming model. This approach is highly effective when the objective is to compute relatively simple functions on colossal amounts of data, but it is not a good match for a scientific computing environment, which depends on complex applications written for the conventional POSIX environment. To address this gap, we introduce Confuga, a scalable data-intensive computing system that is largely compatible with the POSIX environment. Confuga brings together the workflow model of scientific computing with the storage architecture of other big data systems. Confuga accepts large workflows of standard POSIX applications arranged into graphs, and then executes them in a cluster, exploiting both parallelism and data-locality. By making use of the workload structure, Confuga is able to avoid the long-standing problems of metadata scalability and load instability found in many large-scale computing and storage systems. We show that Confuga's approach to load control offers improvements of up to 228% in cluster network utilization and 23% reductions in workflow execution time.
Keywords
Big Data; natural sciences computing; operating systems (computers); parallel processing; storage management; Big Data systems; Confuga; POSIX workflows; active storage cluster file system; data-intensive computing system; data-locality; parallelism; scientific computing; storage architecture; Bioinformatics; Chirp; Computer architecture; Genomics; Protocols; Semantics; Servers;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location
Shenzhen
Type
conf
DOI
10.1109/CCGrid.2015.95
Filename
7152505