DocumentCode :
3519535
Title :
Biological Data Outlier Detection Based on Kullback-Leibler Divergence
Author :
Oh, Jung Hun ; Gao, Jean ; Rosenblatt, Kevin
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Texas, Arlington, TX
fYear :
2008
fDate :
3-5 Nov. 2008
Firstpage :
249
Lastpage :
254
Abstract :
Outlier detection is imperative in biomedical data analysis to achieve reliable knowledge discovery. In this paper, a new outlier detection method based on Kullback-Leibler (KL) divergence is presented. KL divergence was originally designed as a measure of distance between two distributions. Building on that, we extend it to biological sample outlier detection by forming sample sets composed of nearest neighbors. To handle non-linearity in the KL divergence calculation and to tackle the singularity problem caused by small sample size, we map the original data into a higher-dimensional feature space and apply kernel functions, without resorting to an explicit mapping function. The sample with the largest KL divergence is detected as an outlier. The proposed method is tested on one synthetic data set, two public gene expression data sets, and our own mass spectrometry data generated for a prostate cancer study.
Keywords :
biology computing; data mining; medical computing; regression analysis; KL divergence calculation; KL divergence concept; Kullback-Leibler divergence; biological data outlier detection; biological sample outlier detection; biomedical data analysis; distribution distance measure; higher feature space mapping; kernel functions; knowledge discovery; mass spectrometry data; nearest neighbors; prostate cancer study; public gene expression data sets; singularity problem; Bioinformatics; Biology; Clustering algorithms; Data analysis; Intrusion detection; Kernel; Medical diagnostic imaging; Nearest neighbor searches; Object detection; Support vector machines; mass spectrometry; outlier detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-0-7695-3452-7
Type :
conf
DOI :
10.1109/BIBM.2008.76
Filename :
4684899