• DocumentCode
    2404230
  • Title
    A framework towards efficient and effective sequence clustering
  • Author
    Wang, Wei ; Yang, Jiong
  • Author_Institution
    IBM Thomas J. Watson Res. Center, NY, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    282
  • Abstract
    Analyzing sequence data (particularly in categorical domains) has become increasingly important, partially due to the significant advances in biology and other fields. Examples of sequence data include DNA sequences, unfolded protein sequences, text documents, Web usage data, system traces, etc. Previous work on mining sequence data has mainly focused on frequent pattern discovery. In this project, we focus on the problem of clustering sequence data.
  • Keywords
    data analysis; pattern clustering; sequences; DNA sequences; Web usage data; categorical domains; sequence data analysis; sequence data clustering; system traces; text documents; unfolded protein sequences; Amino acids; Biological information theory; Clustering algorithms; DNA; Data analysis; Data mining; Extraterrestrial measurements; Probability distribution; Protein sequence; Tree data structures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2002. Proceedings. 18th International Conference on
  • Conference_Location
    San Jose, CA
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1531-2
  • Type
    conf
  • DOI
    10.1109/ICDE.2002.994736
  • Filename
    994736