DocumentCode
2265099
Title
Distributed discovery of asynchronous partial periodic patterns in sequence data using modified periodicity transform
Author
Hsiao, Han-Wen ; Tsai, Meng-Shu ; Tsai, Jeffrey J. P.
Author_Institution
Inst. of Bioinformatics, Taichung Healthcare & Manage. Univ., Taiwan
fYear
2004
fDate
13-15 Dec. 2004
Firstpage
370
Lastpage
377
Abstract
Discovering frequent subsequences as patterns in large sequence databases is an important task in a variety of applications, such as biological sequence analysis. In general, the patterns to be discovered may occur partially and asynchronously in sequences, and may even contain gaps. In addition, the locations and frequencies of the patterns may be of interest for subsequent analysis. A further concern is how to enumerate candidate patterns for evaluation without an exponential increase in computation time. The modified periodicity transform is proposed to meet these requirements. Mining all partial periodic patterns of length 5 from a synthetic sequence of length 300 K takes 4 seconds. With minor modification, the approach can handle asynchronous partial periodic patterns of arbitrary length. The approach is by nature suited to distributed environments, and a prototype system has been developed in Java for distributed computing. The system could serve as a feature extractor in an early stage of sequence analysis.
Keywords
Java; biology computing; data mining; pattern recognition; sequences; very large databases; asynchronous partial periodic patterns distributed discovery; biological sequence analysis; distributed computing; distributed environments; feature extractor; modified periodicity transform; prototype system; sequence databases; Application software; Bioinformatics; Biology; Computer science; Data mining; Databases; Information technology; Medical services; Pattern analysis; Technology management
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Software Engineering, 2004. Proceedings. IEEE Sixth International Symposium on
Print_ISBN
0-7695-2217-3
Type
conf
DOI
10.1109/MMSE.2004.42
Filename
1376684