  • DocumentCode
    54629
  • Title
    Efficient Sentinel Mining Using Bitmaps on Modern Processors
  • Author
    Middelfart, M.; Pedersen, Torben Bach; Krogsgaard, J.
  • Author_Institution
    TARGIT US Inc., Tampa, FL, USA
  • Volume
    25
  • Issue
    10
  • fYear
    2013
  • fDate
    Oct. 2013
  • Firstpage
    2231
  • Lastpage
    2244
  • Abstract
    This paper proposes a highly efficient bitmap-based approach for discovery of so-called sentinels. Sentinels represent schema-level relationships between changes over time in certain measures in a multidimensional data cube. Sentinels are actionable and notify users based on previous observations, for example, that revenue might drop within two months if an increase in customer problems combined with a decrease in website traffic is observed. We significantly extend prior art by representing the sentinel mining problem by bitmap operations, using bitmapped encoding of so-called indication streams. We present a very efficient algorithm, SentBit, that is 2-3 orders of magnitude faster than the state of the art, and utilizes CPU-specific instructions and the multicore architectures available on modern processors. The SentBit algorithm scales efficiently to very large data sets, which is verified by extensive experiments on both real and synthetic data.
  • Keywords
    Web sites; data mining; encoding; multiprocessing systems; CPU specific instructions; SentBit algorithm scales; Website traffic; bitmap operations; bitmap-based approach; bitmapped encoding; customer problems; indication streams; multicore architectures; multidimensional data cube; schema level relationships; sentinel discovery; sentinel mining; Art; Bidirectional control; Data mining; Databases; Encoding; Organizations; Pattern mining; Time measurement; cube-based data mining; predictive data mining; sentinels
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2012.198
  • Filename
    6329369