Title :
A Robust and Scalable Solution for Interpolative Multidimensional Scaling with Weighting
Author :
Yang Ruan ; Fox, G.
Author_Institution :
Sch. of Inf. & Comput., Indiana Univ., Bloomington, IN, USA
Abstract :
Advances in modern bio-sequencing techniques have led to a proliferation of raw genomic data that creates an unprecedented opportunity for data mining. To analyze such large-volume, high-dimensional scientific data, many high-performance dimension reduction and clustering algorithms have been developed. Among the known algorithms, we use Multidimensional Scaling (MDS) to reduce the dimension of the original data and Pairwise Clustering to classify the data. We have shown that interpolative MDS, which is an online technique for real-time streaming in Big Data, can be applied to get better performance on massive data. However, the SMACOF MDS approach is only directly applicable to cases where all pairwise distances are used and where the weight is one for each term. In this paper, we propose a robust and scalable MDS and interpolation algorithm, using the Deterministic Annealing technique, to solve problems with either missing distances or a non-trivial weight function. We compare our method to three state-of-the-art techniques. In experiments on three common types of bioinformatics datasets, the results show that the precision of our algorithms is better than that of the other algorithms, and that the weighted solutions have a lower computational time cost as well.
Keywords :
biology computing; data mining; deterministic algorithms; genomics; interpolation; pattern clustering; biosequencing techniques; clustering algorithms; deterministic annealing technique; genomic data; high performance dimension reduction algorithms; interpolative MDS; interpolative multidimensional scaling; pairwise clustering; Conferences; Deterministic Annealing; Multidimensional Scaling
Conference_Title :
eScience (eScience), 2013 IEEE 9th International Conference on
Conference_Location :
Beijing
DOI :
10.1109/eScience.2013.30