Title :
Online Association Rule Mining over Fast Data
Author :
Olmezogullari, E. ; Ari, I.
Author_Institution :
Comput. Eng. Dept., Ozyegin Univ., Istanbul, Turkey
Date :
June 27 2013-July 2 2013
Abstract :
The information technology (IT) world is grappling with big data problems as it tries to extract useful and actionable information in real time. In this paper, we present implementation details and performance results of ReCEPtor, our system for "online" Association Rule Mining (ARM) over big and fast data streams. Specifically, we added Apriori and two different FP-Growth algorithms to the Esper Complex Event Processing (CEP) engine and compared their performance using data from the LastFM social music site. Our most important findings show that online ARM can generate (1) more unique rules, (2) with higher throughput, and (3) much sooner (lower latency) than offline rule mining. In addition, we found many interesting and realistic musical preference rules such as "George Harrison → Beatles". We demonstrate a sustained rate of ~15K rows/sec per core. We hope that our findings can shed light on the design and implementation of other fast data analytics systems in the future.
Keywords :
data mining; music; social networking (online); ARM; Apriori algorithm; CEP engine; Esper complex event processing engine; FP-growth algorithms; IT world; LastFM social music site data; ReCEPtor; big data streams; fast data analytics systems; fast data streams; information technology world; musical preference rules; offline rule mining; online association rule mining; Algorithm design and analysis; Association rules; Engines; Information management; Itemsets; FP-Growth; Fast data; association rule mining; big data; complex event processing
Conference_Title :
Big Data (BigData Congress), 2013 IEEE International Congress on
Conference_Location :
Santa Clara, CA
Print_ISBN :
978-0-7695-5006-0
DOI :
10.1109/BigData.Congress.2013.77