• DocumentCode
    3717412
  • Title
    A novel symbolization technique for time-series outlier detection
  • Author
    Gavin Smith; James Goulding
  • Author_Institution
    Horizon Digital Economy Research, The University of Nottingham, UK
  • fYear
    2015
  • Firstpage
    2428
  • Lastpage
    2436
  • Abstract
    The detection of outliers in time series data is a core component of many data-mining applications and is broadly applied in industrial settings. For large data sets, algorithms that are efficient in both time and space are required. One way to reduce speed and storage costs is symbolization as a pre-processing step, which additionally opens up the use of an array of discrete algorithms. With this common pre-processing step in mind, this work highlights that (1) existing symbolization approaches are designed to address problems other than outlier detection and are hence sub-optimal for it, and (2) the use of off-the-shelf symbolization techniques can therefore lead to significant unnecessary data corruption and potential performance loss when outlier detection is a key aspect of the data-mining task at hand. To address this, a novel symbolization method is motivated that specifically targets the end-use application of outlier detection. The method is empirically shown to outperform existing approaches.
  • Keywords
    "Time series analysis","Quantization (signal)","Linear programming","Ice","Algorithm design and analysis","Indexing","Entropy"
  • Publisher
    IEEE
  • Conference_Title
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BigData.2015.7364037
  • Filename
    7364037