  • DocumentCode
    659487
  • Title
    Parallel deterministic annealing clustering and its application to LC-MS data analysis
  • Author
    Fox, G. ; Mani, D.R. ; Pyne, Sumanta
  • Author_Institution
    Sch. of Inf. & Comput., Indiana Univ., Bloomington, IN, USA
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    665
  • Lastpage
    673
  • Abstract
    We present a scalable parallel deterministic annealing formalism for clustering with cutoffs and position-dependent variances. We apply it to the “peak matching” problem of the precise identification of the common LC-MS peaks across a cohort of multiple biological samples in proteomic biomarker discovery. We reliably and automatically find tens of thousands of clusters starting with a single one that is split recursively as distance resolution is sharpened. We parallelize the algorithm and compare unconstrained and trimmed clusters using data from a human tuberculosis cohort.
  • Keywords
    data analysis; deterministic algorithms; diseases; health care; pattern clustering; pattern matching; proteomics; LC-MS data analysis; distance resolution; human tuberculosis cohort; parallel deterministic annealing clustering; peak matching problem; position-dependent variances; proteomic biomarker discovery; scalable parallel deterministic annealing formalism; trimmed clusters; unconstrained clusters; Conferences; Data handling; Data storage systems; Information management; LC-MS; clustering; deterministic annealing; parallel algorithms; performance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2013 IEEE International Conference on Big Data
  • Conference_Location
    Silicon Valley, CA
  • Type
    conf
  • DOI
    10.1109/BigData.2013.6691636
  • Filename
    6691636