Title :
On Identifying and Analyzing Significant Nodes in Protein-Protein Interaction Networks
Author :
Khazanchi, Rohan ; Dempsey, Kathryn ; Thapa, Ishwor ; Ali, Hamza
Author_Institution :
Coll. of Inf. Sci. & Technol., Univ. of Nebraska at Omaha, Omaha, NE, USA
Abstract :
Network theory has been used to model biological data as well as social networks, transportation logistics, business transcripts, and many other types of data sets. Identifying the important features of these networks for a multitude of applications is becoming increasingly significant as the need for big data analysis techniques grows. When analyzing a network of protein-protein interactions (PPIs), identifying nodes of significant importance can direct the user toward biologically relevant network features. In this work, we propose that a node of structural importance in a network model can correspond to a biologically vital or significant property. This relationship between topological and biological importance can be seen in and between structurally defined nodes, such as hub nodes and driver nodes, both within a network and within clusters. This work proposes data mining approaches for identifying and examining relationships between hub and driver nodes within human, yeast, rat, and mouse PPI networks. Relationships with other types of significant nodes, with direct neighbors, and with the rest of the network were analyzed to determine whether the model can be characterized biologically by its structural makeup. We performed numerous tests on structure with a data-driven mentality, looking for properties that were potentially significant on a network level and then comparing those properties to biological significance. Our results showed that identifying and cross-referencing different types of topologically significant nodes can exemplify properties such as transcription factor enrichment, lethality, clustering, and Gene Ontology (GO) enrichment. Mining the biological networks, we discovered a key relationship between network properties and how sparse or dense a network is, a property we described as "sparseness". Overall, structurally important nodes were found to have significant biological relevance.
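As a rough illustration of the structural quantities the abstract refers to, the sketch below builds an undirected interaction graph from an edge list, ranks nodes by degree to pick out hubs, and computes edge density as one possible proxy for "sparseness". The abstract does not give the authors' definitions: treating hubs as the top fraction of nodes by degree, the `top_fraction` parameter, the use of density, and all function names are assumptions made for illustration only. Driver nodes are not covered here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions in this sketch
            adj[u].add(v)
            adj[v].add(u)
    return adj


def density(adj):
    """Fraction of possible edges that are present (assumed proxy for sparseness)."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def hub_nodes(adj, top_fraction=0.1):
    """Return the highest-degree nodes (assumed hub definition: top fraction by degree)."""
    # Sort by descending degree, breaking ties by name for a stable result.
    ranked = sorted(adj, key=lambda node: (-len(adj[node]), node))
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[:k]


# Toy edge list with hypothetical protein labels, not data from the paper.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
graph = build_graph(edges)
print(hub_nodes(graph, top_fraction=0.2))
print(density(graph))
```

Comparing the density of different networks alongside their hub sets is one simple way to look for the kind of sparseness-dependent behavior the abstract mentions; the study itself may use a different measure.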
Keywords :
biology computing; data mining; proteins; PPI; biological data modeling; business transcripts; data analysis techniques; data driven mentality; driver nodes; hub nodes; network theory; protein-protein interaction networks; significant node analysis; social networks; structural importance; transcription factor; transportation logistics; Analytical models; Biological system modeling; Computational modeling; Databases; Educational institutions; clustering; graph theory; network enrichment
Conference_Title :
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
DOI :
10.1109/ICDMW.2013.126