• DocumentCode
    1910773
  • Title

    Frequent Approximate Substring Pattern Mining Using Locally Optimal Occurrence Counting

  • Author

    Nakamura, Atsuyoshi ; Tosaka, Hisashi ; Kudo, Mineichi

  • Author_Institution
    Hokkaido Univ., Sapporo, Japan
  • fYear
    2012
  • fDate
    20-22 Sept. 2012
  • Firstpage
    54
  • Lastpage
    59
  • Abstract
    We propose a novel frequent approximate pattern mining in which occurrences themselves are valuable regions to extract. Given a string s, our mining task is to enumerate its sub strings that locally optimally match many sub strings of s. We show an algorithm for this problem whose time and space complexities are O(n3) for a string with length n. According to our experiments using synthetic data, our algorithm can correctly extract embedded frequent patterns as both patterns and their occurrences even if the embedded patterns are corrupted by random edit operations.
  • Keywords
    computational complexity; data mining; embedded frequent pattern extraction; frequent approximate substring pattern mining; locally optimal occurrence counting; mining task; random edit operation; space complexity; synthetic data; time complexity; Approximation algorithms; Complexity theory; Data mining; Electronic mail; Hamming distance; Optimized production technology; Pattern matching; data mining; frequent pattern; local alignment; string algorithm;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Advanced Applied Informatics (IIAIAAI), 2012 IIAI International Conference on
  • Conference_Location
    Fukuoka
  • Print_ISBN
    978-1-4673-2719-0
  • Type

    conf

  • DOI
    10.1109/IIAI-AAI.2012.20
  • Filename
    6337157