DocumentCode :
2209035
Title :
On Finding Frequent Patterns in Event Sequences
Author :
Campagna, Andrea ; Pagh, Rasmus
Author_Institution :
IT Univ. of Copenhagen, Copenhagen, Denmark
fYear :
2010
fDate :
13-17 Dec. 2010
Firstpage :
755
Lastpage :
760
Abstract :
Given a directed a cyclic graph with labeled vertices, we consider the problem of finding the most common label sequences ("traces") among all paths in the graph (of some maximum length m). Since the number of paths can be huge, we propose novel algorithms whose time complexity depends only on the size of the graph, and on the frequency varepsilon of the most frequent traces. In addition, we apply techniques from streaming algorithms to achieve space usage that depends only on varepsilon, and not on the number of distinct traces. The abstract problem considered models a variety of tasks concerning finding frequent patterns in event sequences. Our motivation comes from working with a data set of 2 million RFID readings from baggage trolleys at Copenhagen Airport. The question of finding frequent passenger movement patterns is mapped to the above problem. We report on experimental findings for this data set.
Keywords :
data mining; directed graphs; pattern clustering; radiofrequency identification; sequences; Copenhagen Airport; RFID readings; abstract problem; baggage trolleys; directed acyclic graph; event sequences; frequent patterns; frequent traces; label sequences; passenger movement patterns; pattern mapping; streaming algorithms; time complexity; algorithms; data mining; graphs; patterns discovery; sampling;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location :
Sydney, NSW
ISSN :
1550-4786
Print_ISBN :
978-1-4244-9131-5
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2010.132
Filename :
5694034
Link To Document :
بازگشت