DocumentCode
659487
Title
Parallel deterministic annealing clustering and its application to LC-MS data analysis
Author
Fox, G. ; Mani, D.R. ; Pyne, Sumanta
Author_Institution
Sch. of Inf. & Comput., Indiana Univ., Bloomington, IN, USA
fYear
2013
fDate
6-9 Oct. 2013
Firstpage
665
Lastpage
673
Abstract
We present a scalable parallel deterministic annealing formalism for clustering with cutoffs and position-dependent variances. We apply it to the “peak matching” problem of the precise identification of the common LC-MS peaks across a cohort of multiple biological samples in proteomic biomarker discovery. We reliably and automatically find tens of thousands of clusters starting with a single one that is split recursively as distance resolution is sharpened. We parallelize the algorithm and compare unconstrained and trimmed clusters using data from a human tuberculosis cohort.
Keywords
data analysis; deterministic algorithms; diseases; health care; pattern clustering; pattern matching; proteomics; LC MS data analysis; distance resolution; human tuberculosis cohort; parallel deterministic annealing clustering; peak matching problem; position dependent variances; proteomic biomarker discovery; scalable parallel deterministic annealing formalism; trimmed clusters; unconstrained clusters; Conferences; Data handling; Data storage systems; Information management; LC-MS; clustering; deterministic annealing; parallel algorithms; performance; proteomics
fLanguage
English
Publisher
IEEE
Conference_Titel
Big Data, 2013 IEEE International Conference on
Conference_Location
Silicon Valley, CA
Type
conf
DOI
10.1109/BigData.2013.6691636
Filename
6691636