• DocumentCode
    679530
  • Title

    Explaining Outliers by Subspace Separability

  • Author

    Micenkova, Barbora ; Xuan-Hong Dang ; Assent, Ira ; Ng, Raymond T.

  • Author_Institution
    Aarhus Univ., Aarhus, Denmark
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    518
  • Lastpage
    527
  • Abstract
    Outliers are extraordinary objects in a data collection. Depending on the domain, they may represent errors, fraudulent activities or rare events that are of interest. Existing approaches focus on detecting outliers or ranking objects by their degree of outlierness, but do not explain how these objects deviate from the rest of the data. Such explanations would help users interpret or validate the detected outliers. The problem addressed in this paper is as follows: given an outlier detected by an existing algorithm, we propose a method that determines possible explanations for the outlier. These explanations are expressed in the form of subspaces in which the given outlier shows separability from the inliers. In this manner, our proposed method complements existing outlier detection algorithms by providing additional information about the outliers. Our method is designed to work with any existing outlier detection algorithm, and it includes a heuristic that gives a substantial speedup over the baseline strategy.
  • Keywords
    data acquisition; data analysis; data collection; inliers; outlier detection algorithms; outlierness degrees; subspace separability; Accuracy; Data visualization; Databases; Detection algorithms; Extraterrestrial measurements; Feature extraction; Gaussian distribution; data exploration; outlier explanation; subspace selection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1550-4786
  • Type

    conf

  • DOI
    10.1109/ICDM.2013.132
  • Filename
    6729536