• DocumentCode
    30173
  • Title

    NHOP: A Nested Associative Pattern for Analysis of Consensus Sequence Ensembles

  • Author

    Chiu, David K.Y. ; Lui, Thomas W.H.

  • Author_Institution
    University of Guelph, Guelph
  • Volume
    25
  • Issue
    10
  • fYear
    2013
  • fDate
    Oct. 2013
  • Firstpage
    2314
  • Lastpage
    2324
  • Abstract
    In this research, we introduce a novel, complex associative pattern that is useful because it identifies the core associative structure in the data. We refer to it as the nested high-order pattern (NHOP). The pattern is more specific than associative patterns represented as multiple variables. It also generalizes sequential patterns, as the outcomes need not be contiguous. This paper outlines two search algorithms for its detection, the $(r)$-Tree and Best-$(k)$ algorithms. The pattern is then applied to the analysis of a biomolecule using the aligned sequence family of the molecule. In the SH3 protein, a model for mediators of protein-protein interaction, we identify functional groups (core and binding sites) in the three-dimensional structure as well as amino acid patterns dominating certain species.
  • Keywords
    Algorithm design and analysis; Classifier design and evaluation; Compounds; Educational institutions; Mutual information; Proteins; Statistical analysis; Tin; bioinformatics; data mining; granular computing; pattern analysis
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2012.151
  • Filename
    6261312