DocumentCode :
2989588
Title :
Parallelization of the functional flow algorithm for prediction of protein function using protein-protein interaction networks
Author :
Akkoyun, Emrah ; Can, Tolga
Author_Institution :
Dept. of Med. Inf., Middle East Tech. Univ., Ankara, Turkey
fYear :
2011
fDate :
4-8 July 2011
Firstpage :
56
Lastpage :
62
Abstract :
Protein-protein interaction networks provide important information about the functions of proteins. Various studies analyze interaction networks and predict the functions of novel proteins based on their network connectivity. However, all of these methods are sequential and do not utilize high-performance computing. Functional flow is one such method; it uses network connectivity, distance effect, and the topology of the network, with both local and global views, to predict protein function. With these advantages, the functional flow algorithm produces more accurate results than other techniques. However, because no parallelized version of the algorithm exists, the method cannot be practically applied to large-scale networks of complex species. In this paper, we provide a parallel implementation of functional flow. We use Hadoop, an open-source map/reduce environment. For our experiments, we installed Hadoop on 18 hosts with eight cores each. The first map/reduce job distributes the protein interaction network in a format that allows parallel distributed computing on all the worker nodes. The other map/reduce jobs generate flows for each known protein function, and the functions of novel proteins are predicted by accumulating all of these generated flows. Our experiments show that the method can be distributed across worker nodes efficiently and that the application can provide better performance as the number of resources increases.
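As a rough illustration of the flow-generation step described in the abstract, the following sketch runs a map/reduce-style iteration in plain Python rather than on Hadoop. It assumes the commonly described functional flow rule (annotated proteins act as infinite reservoirs, and flow moves "downhill" along an edge, capped by the edge weight); the record itself does not give the update rule or the job layout, so the function names, the in-memory graph format, and the shared `reservoir` dictionary are all hypothetical simplifications.

```python
from collections import defaultdict

INF = float("inf")

def map_flows(u, edges, reservoir):
    """Map step (assumed rule): emit signed flow records for flow leaving u.

    Assumes strictly positive edge weights.
    """
    total_w = sum(w for _, w in edges)
    for v, w in edges:
        if reservoir[u] > reservoir[v]:  # flow only moves downhill
            flow = min(w, reservoir[u] * w / total_w)  # capped by edge weight
            yield v, flow    # inflow to the neighbor
            yield u, -flow   # matching outflow from u

def reduce_flows(pairs):
    """Reduce step: sum net flow and incoming flow per node."""
    net, inflow = defaultdict(float), defaultdict(float)
    for node, f in pairs:
        net[node] += f
        if f > 0:
            inflow[node] += f
    return net, inflow

def functional_flow(graph, sources, steps):
    """Score each node for one function by the total flow it receives."""
    reservoir = {n: (INF if n in sources else 0.0) for n in graph}
    score = {n: 0.0 for n in graph}
    for _ in range(steps):
        pairs = [p for u, edges in graph.items()
                 for p in map_flows(u, edges, reservoir)]
        net, inflow = reduce_flows(pairs)
        # Update all reservoirs at once so every node sees the previous step.
        reservoir = {n: reservoir[n] + net[n] for n in graph}
        for n in graph:
            score[n] += inflow[n]
    return score

# Hypothetical three-protein chain a - b - c, with only "a" annotated.
graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0)],
}
score = functional_flow(graph, sources={"a"}, steps=2)
```

In this sketch one such run would be carried out per known function, and an unannotated protein would be assigned the function whose accumulated score is highest; how the actual Hadoop jobs partition this work is not specified here.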
Keywords :
bioinformatics; parallel processing; proteins; public domain software; Hadoop; MapReduce; distance effect; functional flow algorithm; high performance computing; network connectivity; network topology; open source map environment; parallel distributed computing; parallel implementation; protein function prediction; protein-protein interaction network; sequential method; Bioinformatics; Computers; Distributed computing; Prediction algorithms; Proteins; Reservoirs; Bioinformatics and Biocomputing; Hadoop; MapReduce; Network Flow; Parallel and Distributed Computing; Protein-Protein Interactions;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Simulation (HPCS), 2011 International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-61284-380-3
Type :
conf
DOI :
10.1109/HPCSim.2011.5999807
Filename :
5999807