• DocumentCode
    2753523
  • Title

    A Partition-Based Approach for Sequential Patterns Mining

  • Author

    Nguyen, Son N. ; Orlowska, Maria E.

  • Author_Institution
    Univ. of Queensland, Brisbane, QLD
  • fYear
    2007
  • fDate
    5-9 March 2007
  • Firstpage
    200
  • Lastpage
    205
  • Abstract
    Sequential patterns mining has been explored for various data types, and its computational complexity is well understood. Well-known methods such as GSP [1] and PrefixSpan [2] deal effectively with the computational problems involved. However, most methods show limited performance because the number of patterns grows exponentially. Moreover, when the input data set is very large, the problem cannot be solved because of main memory limitations. This paper presents a partition-based approach that overcomes this drawback and further improves the performance of sequential patterns computation. Furthermore, the partition-based approach can be extended to the parallel paradigm of mining sequential patterns. We have made a series of observations that led us to devise data pre-processing methods such that the final step of the partition-based algorithm, where a combination of all local candidate patterns must be processed, is executed on substantially smaller input data. This paper reports results from several experiments that confirm our general and formally presented observations.
  • Keywords
    computational complexity; data mining; pattern recognition; GSP; PrefixSpan; data pre-processing methods; partition-based approach; sequential pattern mining; Australia; Databases; Fault detection; Fault diagnosis; Itemsets; Partitioning algorithms; Pattern analysis; Pattern matching; Algorithm; Sequential Patterns
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Research, Innovation and Vision for the Future, 2007 IEEE International Conference on
  • Conference_Location
    Hanoi
  • Print_ISBN
    1-4244-0694-3
  • Type

    conf

  • DOI
    10.1109/RIVF.2007.369157
  • Filename
    4223074