• DocumentCode
    2865584
  • Title
    Fast frequent string mining using suffix arrays
  • Author
    Fischer, Johannes ; Heun, Volker ; Kramer, Stefan
  • Author_Institution
    Inst. für Informatik, Univ. München, Amalienstr., Germany
  • fYear
    2005
  • fDate
    27-30 Nov. 2005
  • Abstract
    We present a method to mine strings that are frequent in one database and infrequent in another. The method uses suffix arrays and LCP (longest common prefix) arrays, which can be computed extremely fast and space-efficiently and which also exhibit good locality behavior. Experiments with several biologically relevant data sets show that our approach outperforms existing methods in terms of time and space.
  • Keywords
    computational complexity; data mining; string matching; fast frequent string mining; lcp arrays; suffix arrays; Biology computing; Computational biology; Data mining; Frequency; Lattices; Sequences; Spatial databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, Fifth IEEE International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2278-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2005.62
  • Filename
    1565738
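
The abstract states the mining task only in outline, so the following is a minimal illustrative sketch of it, not the authors' algorithm. It assumes that "frequency" means the number of database strings containing a pattern, and that the thresholds `min_pos` and `max_neg`, the `Index` class, the `mine` function, and the `\x00` separator are all hypothetical names and choices made for this example. The suffix array is built by naive sorting and candidates are enumerated by brute force, which is far slower than the approach the paper reports; `lcp_array` (Kasai's algorithm) is included only to show the second data structure the abstract mentions.

```python
SEP = "\x00"  # assumed separator; must not occur in the data


def suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm: lcp[k] is the length of the common prefix of the
    suffixes starting at sa[k-1] and sa[k]; lcp[0] is 0."""
    n = len(text)
    rank = [0] * n
    for k, start in enumerate(sa):
        rank[start] = k
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


class Index:
    """Suffix array over a database of strings joined by SEP."""

    def __init__(self, strings):
        self.text = SEP.join(strings) + SEP
        # owner[p] = index of the database string that covers text position p
        self.owner = []
        for sid, s in enumerate(strings):
            self.owner.extend([sid] * (len(s) + 1))
        self.sa = suffix_array(self.text)

    def _bound(self, pattern, strict):
        # First suffix-array index whose length-m prefix is >= pattern
        # (strict=False) or > pattern (strict=True).
        lo, hi = 0, len(self.sa)
        m = len(pattern)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = self.text[self.sa[mid]:self.sa[mid] + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def support(self, pattern):
        """Number of distinct database strings that contain pattern."""
        lo = self._bound(pattern, strict=False)
        hi = self._bound(pattern, strict=True)
        return len({self.owner[self.sa[k]] for k in range(lo, hi)})


def mine(pos, neg, min_pos, max_neg):
    """Substrings with support >= min_pos in pos and <= max_neg in neg."""
    pos_idx, neg_idx = Index(pos), Index(neg)
    candidates = {s[i:j] for s in pos
                  for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
    return sorted(p for p in candidates
                  if pos_idx.support(p) >= min_pos
                  and neg_idx.support(p) <= max_neg)


if __name__ == "__main__":
    print(mine(["abcab", "cabd", "xab"], ["bcd", "abx"], min_pos=3, max_neg=1))
```

All suffixes that begin with a given pattern occupy one contiguous range of the suffix array, so a pair of binary searches finds every occurrence; counting the distinct owners in that range gives the pattern's support in the database.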