• DocumentCode
    3119503
  • Title

    An Empirical Comparison of Dissimilarity Measures for Time Series Classification

  • Author

    Giusti, Roberto ; Batista, Gustavo E. A. P. A.

  • Author_Institution
    Inst. de Cienc. Mat. e de Comput., Univ. de Sao Paulo, Sao Carlos, Brazil
  • fYear
    2013
  • fDate
    19-24 Oct. 2013
  • Firstpage
    82
  • Lastpage
    88
  • Abstract
    Distance and dissimilarity functions are central to Time Series Data Mining. Hundreds of methods proposed in the literature rely on a dissimilarity measure as the main way to compare objects; one notable example is the 1-Nearest Neighbor classification algorithm. These methods frequently outperform more complex methods in tasks such as classification, clustering, prediction, and anomaly detection. All of them leave the choice of distance or dissimilarity function open, and Euclidean distance (ED) and Dynamic Time Warping (DTW) are the two most widely used dissimilarity measures in the literature. This paper empirically compares 48 measures on 42 time series data sets. Our objective is to draw the research community's attention to dissimilarity measures other than ED and DTW, some of which can significantly outperform these measures in classification. Our results show that Complex Invariant Distance DTW (CIDDTW) significantly outperforms DTW, and that CIDDTW, DTW, CID, Minkowski L-p (p-norm difference with a "p" parameter tuned per data set), Lorentzian L-infinity, Manhattan L-1, Average L-1/L-infinity (arithmetic average), Dice distance, and Jaccard distance outperform ED; however, only CIDDTW, DTW, and CID outperform ED with statistical significance.
  • Keywords
    data mining; pattern classification; statistical analysis; time series; 1-nearest neighbor classification; CIDDTW; Euclidean distance; Minkowski L-p; complex invariant distance DTW; dissimilarity functions; dissimilarity measures; distance functions; dynamic time warping; empirical comparison; p-norm difference; statistical significance; time series classification; time series data mining; Accuracy; Complexity theory; Equations; Testing; Time measurement; Time series analysis; classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems (BRACIS), 2013 Brazilian Conference on
  • Conference_Location
    Fortaleza
  • Type

    conf

  • DOI
    10.1109/BRACIS.2013.22
  • Filename
    6726430
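
The abstract describes 1-Nearest Neighbor classification with an interchangeable dissimilarity measure. The sketch below illustrates that idea for ED, DTW, CID, and CIDDTW. It is not the authors' implementation: all function names are hypothetical, DTW is written without a warping window and with a squared point cost, and the handling of constant series in `cid_factor` is an assumption made here to avoid division by zero. The CID correction factor follows the usual definition, the ratio of the larger to the smaller complexity estimate, where complexity is the length of the series' first-difference vector.

```python
import math


def euclidean(a, b):
    """Euclidean distance (ED) between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW (assumption: no warping window, squared point cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] is the accumulated cost of aligning the first i-1 points of a
    # with the first j points of b; only the empty alignment costs zero.
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def complexity(s):
    """Complexity estimate: Euclidean length of the first-difference vector."""
    return math.sqrt(sum((s[i] - s[i + 1]) ** 2 for i in range(len(s) - 1)))


def cid_factor(a, b):
    """Correction factor max(CE) / min(CE) used by CID and CIDDTW."""
    lo, hi = sorted((complexity(a), complexity(b)))
    if lo == 0.0:
        # Assumption: two constant series get a neutral factor; a constant
        # series against a non-constant one is treated as infinitely far.
        return 1.0 if hi == 0.0 else float("inf")
    return hi / lo


def cid(a, b):
    """CID: ED scaled by the complexity correction factor."""
    return euclidean(a, b) * cid_factor(a, b)


def cid_dtw(a, b):
    """CIDDTW: DTW scaled by the same correction factor."""
    return dtw(a, b) * cid_factor(a, b)


def classify_1nn(query, train, dist):
    """Return the label of the training series closest to query under dist.

    train is a list of (series, label) pairs.
    """
    nearest = min(train, key=lambda item: dist(query, item[0]))
    return nearest[1]


# Toy data, invented for illustration only.
train = [([0, 0, 1, 0, 0], "spike"), ([0, 1, 2, 3, 4], "ramp")]
query = [0, 1, 0, 0, 0]
labels = {d.__name__: classify_1nn(query, train, d)
          for d in (euclidean, dtw, cid, cid_dtw)}
```

Because the measure is passed in as a plain function, swapping ED for any other dissimilarity only changes the `dist` argument, which mirrors how the abstract treats the distance function as the open component of the classifier.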