DocumentCode :
3104973
Title :
Converting Output Scores from Outlier Detection Algorithms into Probability Estimates
Author :
Gao, Jing ; Tan, Pang-Ning
Author_Institution :
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI
fYear :
2006
fDate :
18-22 Dec. 2006
Firstpage :
212
Lastpage :
221
Abstract :
Current outlier detection schemes typically output a numeric score representing the degree to which a given observation is an outlier. We argue that converting the scores into well-calibrated probability estimates is more favorable for several reasons. First, the probability estimates allow us to select the appropriate threshold for declaring outliers using a Bayesian risk model. Second, the probability estimates obtained from individual models can be aggregated to build an ensemble outlier detection framework. In this paper, we present two methods for transforming outlier scores into probabilities. The first approach assumes that the posterior probabilities follow a logistic sigmoid function and learns the parameters of the function from the distribution of outlier scores. The second approach models the score distributions as a mixture of exponential and Gaussian probability functions and calculates the posterior probabilities via Bayes' rule. We evaluated the efficacy of both methods in the context of threshold selection and ensemble outlier detection. We also show that the calibration accuracy improves with the aid of some labeled examples.
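The two transformations named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameters `a`, `b`, `rate`, `mean`, `std`, and `prior_outlier` are hypothetical, the parameters are taken as given rather than learned from the scores, and the assignment of the exponential component to normal points and the Gaussian component to outliers is an assumption the abstract does not state.

```python
import math


def sigmoid_posterior(score, a, b):
    """P(outlier | score) under a logistic sigmoid.

    a and b are hypothetical, already-fitted parameters.
    """
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def mixture_posterior(score, prior_outlier, rate, mean, std):
    """P(outlier | score) via Bayes' rule on a two-component mixture.

    Assumption: scores of normal points ~ Exponential(rate),
    scores of outliers ~ Gaussian(mean, std).
    """
    # Class-conditional densities of the score.
    p_normal = rate * math.exp(-rate * score) if score >= 0 else 0.0
    p_outlier = math.exp(-0.5 * ((score - mean) / std) ** 2) / (
        std * math.sqrt(2.0 * math.pi)
    )
    # Bayes' rule: weight each density by its class prior and normalize.
    numerator = prior_outlier * p_outlier
    denominator = numerator + (1.0 - prior_outlier) * p_normal
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    for s in (0.5, 2.5, 5.0):
        print(s, sigmoid_posterior(s, 1.0, -2.5),
              mixture_posterior(s, 0.1, 1.0, 5.0, 1.0))
```

With probabilities in hand, a threshold can be chosen by comparing the posterior against a cost ratio, and posteriors from several detectors can be averaged, which is the kind of use the abstract motivates.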
Keywords :
Bayes methods; Gaussian distribution; data handling; exponential distribution; Bayesian risk model; Gaussian probability functions; exponential probability functions; logistic sigmoid function; outlier detection algorithms; outlier scores distribution; posterior probabilities; probability estimates; Bayesian methods; Calibration; Computer science; Costs; Detection algorithms; Logistics; Probability; State estimation; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
ISSN :
1550-4786
Print_ISBN :
0-7695-2701-7
Type :
conf
DOI :
10.1109/ICDM.2006.43
Filename :
4053049