• DocumentCode
    3078584
  • Title

    Association rule based frequent pattern mining in biological sequences

  • Author

    Salim, Azzeddine ; Chandra, S. S. Vinod

  • Author_Institution
    Dept. of Comput.-Sci. & Eng., Coll. of Eng., Thiruvananthapuram, India
  • fYear
    2013
  • fDate
    26-28 Dec. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Finding all frequent patterns in a set of strings is computationally intensive. An exhaustive search, in which every possible candidate is considered, is impractical for larger pattern widths because of its exponential computational complexity. Other approaches apply heuristics that reduce the search space but may compromise the accuracy of the results to some extent. We used a modified Apriori algorithm to mine patterns in a very long sequence, in particular the most frequent substring pattern of a fixed length in a biological sequence. The algorithm performs well because it rapidly reduces the search space and uses bit-wise operations instead of expensive string comparisons. It outperforms existing pattern-finding methods such as MEME in terms of execution time.
  • Keywords
    biology computing; data mining; genomics; string matching; association rule based frequent pattern mining; biological sequences; bit-wise operations; execution time; modified apriori algorithm; most-frequent substring pattern; search space reduction; Algorithm design and analysis; Bioinformatics; Databases; Generators; Genomics; Pattern matching; Apriori; Genomic Sequences; Most Frequent Pattern;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Computing Research (ICCIC), 2013 IEEE International Conference on
  • Conference_Location
    Enathi
  • Print_ISBN
    978-1-4799-1594-1
  • Type

    conf

  • DOI
    10.1109/ICCIC.2013.6724203
  • Filename
    6724203
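
The abstract mentions replacing string comparisons with bit-wise operations when searching for the most frequent fixed-length substring. The sketch below illustrates that general idea only; it is not the authors' modified Apriori algorithm and performs no Apriori-style pruning. It assumes a DNA alphabet with a 2-bit code per nucleotide (an assumption, since the record does not give the encoding), and the function name `most_frequent_kmer` is hypothetical.

```python
from collections import Counter

# Assumed 2-bit encoding per nucleotide; the paper's actual encoding is not stated.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def most_frequent_kmer(seq, k):
    """Return (pattern, count) for the most frequent length-k substring.

    Each window is packed into an integer with shifts and masks, so windows
    are compared as integers rather than as strings. Returns None when the
    sequence contains no valid window of length k.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    mask = (1 << (2 * k)) - 1   # keeps only the lowest 2*k bits
    counts = Counter()
    key = 0                     # bit-packed contents of the current window
    valid = 0                   # consecutive valid bases seen so far

    for ch in seq.upper():
        code = CODE.get(ch)
        if code is None:        # ambiguous base such as N: restart the window
            key, valid = 0, 0
            continue
        key = ((key << 2) | code) & mask   # slide the window by one base
        valid += 1
        if valid >= k:
            counts[key] += 1

    if not counts:
        return None

    best_key, best_count = max(counts.items(), key=lambda kv: kv[1])

    # Decode the packed integer back into a nucleotide string.
    pattern = "".join(
        BASES[(best_key >> (2 * (k - 1 - i))) & 3] for i in range(k)
    )
    return pattern, best_count
```

For example, `most_frequent_kmer("ATATATGC", 2)` counts the windows AT, TA, AT, TA, AT, TG, GC and returns `("AT", 3)`. When several patterns tie for the highest count, the first one encountered is returned.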