Title :
Fast similarity matching on data stream with noise
Author :
Peng, Zou ; Liang, Su ; Yan, Jia ; WeiHong, Han ; ShuQiang, Yang
Author_Institution :
Sch. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha
Abstract :
Data streams have attracted researchers from several communities (networking, databases, and data mining). A variety of techniques exist for similarity matching in time series datasets. However, subsequence matching over a data stream, that is, finding the subsequences that are similar to a query sequence in a progressive, real-time fashion, is a challenging and novel problem because stream data arrives at high speed and in large quantity, is potentially unbounded, and evolves over time. In this paper, we first design a bounding technique that prunes as much unnecessary computation as possible. We then propose a novel algorithm that identifies all matching subsequences in a data stream under the DTW (Dynamic Time Warping) distance in a single pass. Experiments on synthetic and real data show that the proposed method is at least 3 times faster than the existing SPRING algorithm, while requiring only several extra bytes.
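To make the single-pass setting concrete, the following is a minimal sketch of streaming subsequence matching under DTW, in the style of the SPRING baseline the abstract compares against. It is not the paper's algorithm: the bounding and pruning technique is not described in the abstract and is omitted here. The function names, the squared-difference local cost, and the threshold parameter `eps` are assumptions made for illustration, and unlike SPRING this sketch reports every end position whose distance falls under the threshold rather than selecting non-overlapping optimal matches.

```python
import math


def subsequence_dtw_stream(stream, query):
    """Yield (t, dist, start) for every sample of the stream, in one pass.

    dist is the smallest DTW cost between `query` and any subsequence of
    the stream that ends at time t; start is where that subsequence begins.
    Only one column of the DTW matrix (O(len(query)) memory) is kept.
    Assumption: the local cost is the squared difference.
    """
    m = len(query)
    # d[i] / s[i]: cost and start of the best alignment of query[:i]
    # with a subsequence ending at the previous sample.
    d = [0.0] + [math.inf] * m
    s = [0] * (m + 1)
    for t, x in enumerate(stream):
        # Row 0 is the empty prefix: a match may begin at any time t for free.
        s[0] = t
        nd = [0.0] + [math.inf] * m
        ns = [t] * (m + 1)
        for i in range(1, m + 1):
            cost = (x - query[i - 1]) ** 2
            # Three DTW predecessors: same sample, previous sample, diagonal.
            best, start = min(
                (nd[i - 1], ns[i - 1]),
                (d[i], s[i]),
                (d[i - 1], s[i - 1]),
                key=lambda p: p[0],
            )
            nd[i] = cost + best
            ns[i] = start
        d, s = nd, ns
        yield t, d[m], s[m]


def matches(stream, query, eps):
    """Collect (start, end, dist) for every end position with dist <= eps."""
    return [
        (start, t, dist)
        for t, dist, start in subsequence_dtw_stream(stream, query)
        if dist <= eps
    ]


if __name__ == "__main__":
    # Hypothetical toy data: the query occurs exactly at positions 2..4.
    print(matches([0, 0, 1, 2, 3, 0, 0], [1, 2, 3], eps=0.5))
```

Each incoming sample costs O(m) work for a query of length m, and this per-sample cost is what a pruning bound such as the one proposed in the paper would aim to reduce.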
Keywords :
data mining; database management systems; noise; pattern matching; query processing; data mining; data stream; database system; dynamic time warping; pattern discovery; query sequence; subsequence similarity matching; time series dataset; Communications technology; Computer science; Data mining; Databases; Hardware; Intrusion detection; Military communication; Monitoring; Sampling methods; Springs;
Conference_Title :
Data Engineering Workshop, 2008. ICDEW 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-2161-9
Electronic_ISBN :
978-1-4244-2162-6
DOI :
10.1109/ICDEW.2008.4498316