• DocumentCode
    2422335
  • Title

    ApproxMGMSP: A Scalable Method of Mining Approximate Multidimensional Sequential Patterns on Distributed System

  • Author

    Zhang, Changhai ; Hu, Kongfa ; Chen, Zhuxi ; Chen, Ling ; Dong, Yisheng

  • Author_Institution
    Yangzhou Univ., Yangzhou
  • Volume
    2
  • fYear
    2007
  • fDate
    24-27 Aug. 2007
  • Firstpage
    730
  • Lastpage
    734
  • Abstract
    We present ApproxMGMSP (Approximate Mining of Global Multidimensional Sequential Patterns), a scalable and effective algorithm for mining multidimensional sequential patterns from large databases in a distributed environment. Our method differs from previous work on mining multidimensional patterns on distributed systems. The main difference is that an approximate mining method is applied to a large multidimensional sequence database for the first time. To convert the mining of multidimensional sequential patterns into the mining of ordinary sequential patterns, the multidimensional information is embedded into the corresponding sequences. The sequences are then clustered, summarized, and analyzed on the distributed sites, and the local patterns are obtained by an effective approximate sequential pattern mining method. Finally, after all the local patterns are collected on one site, the global multidimensional sequential patterns are mined quickly using a high-vote sequential pattern model. Both theoretical analysis and experiments indicate that this method simplifies the problem of mining multidimensional sequential patterns and avoids mining redundant information. With the communication cost reduced, the scalable method obtains the global sequential patterns effectively.
  • Keywords
    data mining; distributed processing; ApproxMGMSP; approximate mining; approximate multidimensional sequential patterns; approximate sequential pattern mining; distributed system; global multidimensional sequential patterns; large databases; multidimensional information; multidimensional sequence database; scalable method; Application software; Computer science; Costs; Data engineering; Data mining; Distributed databases; Itemsets; Multidimensional systems; Pattern analysis; Voting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
  • Conference_Location
    Haikou
  • Print_ISBN
    978-0-7695-2874-8
  • Type

    conf

  • DOI
    10.1109/FSKD.2007.192
  • Filename
    4406172
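
The abstract names two steps that are easy to illustrate: embedding the multidimensional information into each sequence, and selecting global patterns by votes from the distributed sites. The following is a minimal sketch of those two steps only, not the authors' implementation. The function names, the "name=value" encoding of dimensions, the choice to place the dimension values as an extra first element, and the `min_votes` threshold are all assumptions made for illustration; the clustering and approximate local mining steps are omitted.

```python
from collections import Counter


def embed_dimensions(dims, sequence):
    """Turn a multidimensional record into a plain sequence.

    Assumption: the dimension values are encoded as "name=value" items
    and placed in one extra element at the front of the sequence.
    """
    dim_element = frozenset(f"{name}={value}" for name, value in dims.items())
    return (dim_element,) + tuple(frozenset(element) for element in sequence)


def high_vote_patterns(local_patterns_by_site, min_votes):
    """Keep the patterns reported by at least `min_votes` sites.

    Assumption: each site casts at most one vote per pattern, and
    `min_votes` is a hypothetical threshold, not a value from the paper.
    """
    votes = Counter()
    for patterns in local_patterns_by_site:
        votes.update(set(patterns))  # de-duplicate within a site
    return {pattern for pattern, count in votes.items() if count >= min_votes}


# Hypothetical record: two dimensions plus a sequence of two itemsets.
record = embed_dimensions({"group": "g1", "region": "r2"}, [["a"], ["b", "c"]])

# Hypothetical local patterns gathered from three sites.
site_patterns = [
    [("a", "b")],
    [("a", "b"), ("c",)],
    [("c",), ("a", "b")],
]
global_patterns = high_vote_patterns(site_patterns, min_votes=3)
```

Once the dimensions are embedded, any ordinary sequential pattern miner can run on the resulting sequences at each site, and only the local patterns need to be sent to the collecting site.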