  • DocumentCode
    2209835
  • Title
    Data Editing Techniques to Allow the Application of Distance-Based Outlier Detection to Streams
  • Author
    Niennattrakul, Vit ; Keogh, Eamonn ; Ratanamahatana, Chotirat Ann
  • Author_Institution
    Dept. of Comput. Eng., Chulalongkorn Univ., Bangkok, Thailand
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    947
  • Lastpage
    952
  • Abstract
    The problem of finding outliers in data has broad applications in areas as diverse as data cleaning, fraud detection, network monitoring, and invasive species monitoring. While dozens of techniques have been proposed to solve this problem for static data collections, very simple distance-based outlier detection methods are known to be competitive with or superior to more complex methods. However, distance-based methods have time and space complexities that make them impractical for streaming data and/or resource-limited sensors. In this work, we show that simple data-editing techniques can make distance-based outlier detection practical for very fast streams and resource-limited sensors. Our technique generalizes to produce two algorithms which, relative to the original algorithm, guarantee no false positives and no false negatives, respectively. Our methods are independent of both data type and distance measure, and are thus broadly applicable.
  • Keywords
    data acquisition; text editing; data editing techniques; data streaming; distance-based outlier detection; resource limited sensors; static data collections; Anomaly detection; Data editing; Data stream
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2010.56
  • Filename
    5694066