DocumentCode
2404230
Title
A framework towards efficient and effective sequence clustering
Author
Wang, Wei ; Yang, Jiong
Author_Institution
IBM Thomas J. Watson Res. Center, NY, USA
fYear
2002
fDate
2002
Firstpage
282
Abstract
Analyzing sequence data (particularly in categorical domains) has become increasingly important, partly due to significant advances in biology and other fields. Examples of sequence data include DNA sequences, unfolded protein sequences, text documents, Web usage data, and system traces. Previous work on mining sequence data has mainly focused on frequent pattern discovery. In this project, we focus on the problem of clustering sequence data.
Keywords
data analysis; pattern clustering; sequences; DNA sequences; Web usage data; categorical domains; sequence data analysis; sequence data clustering; system traces; text documents; unfolded protein sequences; Amino acids; Biological information theory; Clustering algorithms; DNA; Data analysis; Data mining; Extraterrestrial measurements; Probability distribution; Protein sequence; Tree data structures
fLanguage
English
Publisher
IEEE
Conference_Titel
Proceedings of the 18th International Conference on Data Engineering, 2002
Conference_Location
San Jose, CA
ISSN
1063-6382
Print_ISBN
0-7695-1531-2
Type
conf
DOI
10.1109/ICDE.2002.994736
Filename
994736