Title :
Designing the Cloud-Based DOE Systems Biology Knowledgebase
Author :
Lansing, Carina ; Liu, Yan ; Yin, Jian ; Corrigan, Abigail ; Guillen, Zoe ; Van Dam, Kerstin Kleese ; Gorton, Ian
Author_Institution :
Fundamental & Comput. Sci. Div., Pacific Northwest Nat. Lab., Richland, WA, USA
Abstract :
Systems Biology research, even more than many other scientific domains, is becoming increasingly data-intensive. Not only have advances in experimental and computational technologies led to an exponential increase in scientific data volumes and their complexity, but the resulting databases are increasingly providing the basis for new scientific discoveries. To engage effectively with these community resources, integrated analysis, synthesis and simulation software is needed, supported by scientific workflows. In order to provide a more collaborative, community-driven research environment for this heterogeneous setting, the Department of Energy (DOE) has decided to develop a federated, cloud-based cyberinfrastructure: the Systems Biology Knowledgebase (Kbase). In this context, the Pacific Northwest National Laboratory (PNNL) has been defining and testing the basic federated cloud-based system architecture and has developed a prototype implementation. Community-wide accessibility of biological data and the capability to integrate and analyze this data within its changing research context were seen as key technical functionalities the Kbase needs to enable. In this paper we describe the results of our investigations into the design of this cloud-based federated infrastructure for: 1) semantics-driven data discovery, access and integration; 2) data annotation, publication and sharing; 3) workflow-enabled data analysis; and 4) project-based collaborative working. We describe our approach, exemplary use cases and our prototype implementation, which demonstrates the feasibility of this approach.
Keywords :
biology computing; cloud computing; data analysis; groupware; knowledge based systems; Cloud based DOE systems biology knowledgebase; Cloud based cyber infrastructure; Cloud based federated infrastructure; Department of Energy; Kbase; PNNL; Pacific Northwest National Laboratory; Systems Biology Knowledgebase; community driven research environment; community resources integrated analyses; computational technologies; data annotation; project based collaborative working; semantics driven data discovery; simulation software; workflow enabled data analysis; Bioinformatics; Collaboration; Communities; Computer architecture; Genomics; Semantics;
Conference_Title :
Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-425-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2011.261