Title :
Discovering fuzzy time-interval sequential patterns in sequence databases
Author :
Chen, Yen-Liang ; Huang, Tony Cheng-Kui
Author_Institution :
Dept. of Inf. Manage., Nat. Central Univ., Chung-li, Taiwan
Abstract :
Given a sequence database and a minimum support threshold, the task of sequential pattern mining is to discover the complete set of sequential patterns in the database. From the discovered sequential patterns, we can learn which items are frequently bought together and in what order they appear. However, these patterns do not reveal the time gaps between successive items. Accordingly, Chen et al. have proposed a generalization of sequential patterns, called time-interval sequential patterns, which reveals not only the order of items but also the time intervals between successive items. A time-interval sequential pattern has a form such as (A, I2, B, I1, C), meaning that we buy A first, then after an interval of I2 we buy B, and finally after an interval of I1 we buy C, where I2 and I1 are predetermined time ranges. Although this new type of pattern addresses the above concern, it causes the sharp boundary problem: when a time interval is near the boundary between two predetermined time ranges, we either ignore or overemphasize it. Therefore, this paper uses the concept of fuzzy sets to extend the original research so that fuzzy time-interval sequential patterns are discovered from databases. Two efficient algorithms, the fuzzy time-interval (FTI)-Apriori algorithm and the FTI-PrefixSpan algorithm, are developed for mining fuzzy time-interval sequential patterns. In our simulation results, the second algorithm outperforms the first, not only in computing time but also in scalability with respect to various parameters.
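The sharp boundary problem described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three range names (short, middle, long), the day-based boundaries, the trapezoidal membership shapes, and the function names crisp_range, trapezoid, and fuzzy_memberships.

```python
# Illustrative sketch only: the ranges, boundaries, and membership shapes
# below are assumptions, not the definitions used in the paper.

INF = float("inf")


def crisp_range(gap, boundaries=(3, 7)):
    """Assign a time gap (in days) to exactly one predetermined range."""
    short_max, middle_max = boundaries
    if gap <= short_max:
        return "short"
    if gap <= middle_max:
        return "middle"
    return "long"


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge between a and b
    return (d - x) / (d - c)      # falling edge between c and d


def fuzzy_memberships(gap):
    """Degree to which a time gap belongs to each (hypothetical) fuzzy range."""
    return {
        "short": trapezoid(gap, -1, 0, 2, 4),
        "middle": trapezoid(gap, 2, 4, 6, 8),
        "long": trapezoid(gap, 6, 8, INF, INF),
    }


if __name__ == "__main__":
    for gap in (3, 3.5):
        print(gap, crisp_range(gap), fuzzy_memberships(gap))
```

With crisp ranges, gaps of 3 and 3.5 days fall into different ranges even though they are nearly the same. With the assumed fuzzy ranges, both gaps belong partly to "short" and partly to "middle", and the degrees shift gradually as the gap grows.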
Keywords :
data mining; fuzzy set theory; temporal databases; fuzzy set; pattern discovery; sequence database; time-interval sequential pattern; Computational modeling; Databases; Educational institutions; Educational programs; Fuzzy sets; Information analysis; Knowledge management; Microphones; Scalability; sequence data; sequential patterns; time interval; Algorithms; Artificial Intelligence; Cluster Analysis; Databases, Factual; Fuzzy Logic; Information Storage and Retrieval; Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Sequence Analysis; Signal Processing, Computer-Assisted; Time Factors
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
DOI :
10.1109/TSMCB.2005.847741