Title :
Efficient Sentinel Mining Using Bitmaps on Modern Processors
Author :
Middelfart, M. ; Pedersen, Torben Bach ; Krogsgaard, J.
Author_Institution :
TARGIT US Inc., Tampa, FL, USA
Abstract :
This paper proposes a highly efficient bitmap-based approach for the discovery of so-called sentinels. Sentinels represent schema-level relationships between changes over time in certain measures in a multidimensional data cube. Sentinels are actionable: based on previous observations, they notify users, for example, that revenue might drop within two months if an increase in customer problems combined with a decrease in website traffic is observed. We significantly extend prior art by expressing the sentinel mining problem in terms of bitmap operations, using a bitmapped encoding of so-called indication streams. We present a very efficient algorithm, SentBit, that is 2-3 orders of magnitude faster than the state of the art and that utilizes the CPU-specific instructions and multicore architectures available on modern processors. The SentBit algorithm scales efficiently to very large data sets, which is verified by extensive experiments on both real and synthetic data.
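The bitmap idea in the abstract can be illustrated with a minimal sketch. It is not the SentBit algorithm itself: the function names (`encode`, `popcount`, `support`), the threshold, and the sample data are assumptions made for illustration. Each measure's changes are encoded as two bitmaps (one bit per period, for positive and negative indications), and the number of periods in which a source indication is followed by a target indication is counted with a shift, a bitwise AND, and a population count.

```python
def encode(changes, threshold):
    """Encode percentage changes as (positive, negative) indication bitmaps.

    Bit t of the first bitmap is set if the change in period t is >= threshold;
    bit t of the second is set if it is <= -threshold. (Hypothetical encoding.)
    """
    pos = neg = 0
    for t, change in enumerate(changes):
        if change >= threshold:
            pos |= 1 << t
        elif change <= -threshold:
            neg |= 1 << t
    return pos, neg


def popcount(bits):
    """Number of set bits; a stand-in for a hardware population-count instruction."""
    return bin(bits).count("1")


def support(source_bits, target_bits, offset):
    """Count periods t with a source indication at t and a target indication at t + offset."""
    return popcount(source_bits & (target_bits >> offset))


# Illustrative monthly changes in percent (made-up data).
problems = [20, 0, 30, -5, 25]
traffic = [-12, 3, -20, 0, 4]
revenue = [0, 0, -15, 1, -11, 0, 0]

problems_up, _ = encode(problems, 10)
_, traffic_down = encode(traffic, 10)
_, revenue_down = encode(revenue, 10)

# Months in which problems rose AND traffic fell.
combined = problems_up & traffic_down

# How often that combination was followed by a revenue drop two months later.
hits = support(combined, revenue_down, offset=2)
```

In this sketch a combined condition such as "customer problems up and website traffic down" is a single AND of two bitmaps, so checking it against a target measure costs a handful of word-level operations regardless of how the indications were derived.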
Keywords :
Web sites; data mining; encoding; multiprocessing systems; CPU specific instructions; SentBit algorithm scales; Website traffic; bitmap operations; bitmap-based approach; bitmapped encoding; customer problems; indication streams; multicore architectures; multidimensional data cube; schema level relationships; sentinel discovery; sentinel mining; Art; Bidirectional control; Data mining; Databases; Encoding; Organizations; Time measurement; Pattern mining; cube-based data mining; predictive data mining; sentinels;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2012.198