Title :
Novel visualization methods for protein data
Author :
Mumtaz, Shahzad ; Nabney, Ian T. ; Flower, Darren
Author_Institution :
Non-Linearity & Complexity Res. Group, Aston Univ., Birmingham, UK
Abstract :
Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of two non-linear data projection methods, Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS), that are adapted to be effective on very high-dimensional data. The adaptations use log-space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the proposed variants of GTM and GTM-FS work successfully with data of more than 2000 dimensions. We compare the results with other linear and nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and Gaussian Process Latent Variable Model (GPLVM).
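The abstract does not say which EM steps are moved to log space, so the following is only a minimal sketch of one common way this kind of adaptation is done, not the authors' implementation. It assumes an isotropic Gaussian mixture with equal mixing weights (as in standard GTM) and computes the E-step responsibilities with the log-sum-exp trick. The function name log_responsibilities, the variable names, and the random test data are all hypothetical.

```python
import numpy as np


def log_responsibilities(X, Y, beta):
    """Log posterior over K mixture centres for each of N data points.

    X    : (N, D) data matrix
    Y    : (K, D) mixture centres in data space
    beta : inverse variance shared by all components (assumed isotropic)
    """
    n_dims = X.shape[1]
    # Squared Euclidean distance between every point and every centre, (N, K).
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    # Log-likelihood of each point under each component; never exponentiated directly.
    log_lik = 0.5 * n_dims * np.log(beta / (2.0 * np.pi)) - 0.5 * beta * sq_dist
    # Log-sum-exp: subtract the row maximum before exponentiating to avoid underflow.
    max_ll = log_lik.max(axis=1, keepdims=True)
    log_norm = max_ll + np.log(np.exp(log_lik - max_ll).sum(axis=1, keepdims=True))
    return log_lik - log_norm


# Synthetic stand-in data with 2000 dimensions (not the MHC data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2000))
Y = rng.normal(size=(4, 2000))

log_R = log_responsibilities(X, Y, beta=1.0)
R = np.exp(log_R)  # each row sums to 1

# The naive computation exponentiates large negative numbers and underflows to zero.
sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
naive = np.exp(-0.5 * 1.0 * sq_dist)
```

In this sketch the naive unnormalised likelihoods all underflow to zero at 2000 dimensions, so dividing by their sum would be undefined, whereas the log-space version still yields responsibilities that sum to one for each data point.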
Keywords :
bioinformatics; biological techniques; data visualisation; expectation-maximisation algorithm; molecular biophysics; proteins; GPLVM comparison; GTM with simultaneous feature saliency; GTM-FS; Gaussian Process Latent Variable Model; Generative Topographic Mapping; MHC class-I proteins; NSC comparison; Neuroscale; PCA comparison; Principal Component Analysis; electrostatic potential data; expectation maximization algorithm; high dimensional data visualization; log space values; major histocompatibility complex; nonlinear data projection methods; protein data visualization methods; Amino acids; Data visualization; Databases; Electric potential; Electrostatics; Mathematical model; Proteins; Visualization; expectation maximization; feature saliency; gaussian process latent variable model; generative topographic mapping; log space; major histocompatibility complex; neuroscale; principal component analysis;
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2012 IEEE Symposium on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-1190-8
DOI :
10.1109/CIBCB.2012.6217231