Title :
Semi-supervised Based Training Set Construction for Outlier Detection
Author :
Xu Zhou ; Pengpeng Zhao ; Yuanliu Liu ; Zhiming Cui
Author_Institution :
Sch. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
Abstract :
Outliers are sparse and few. Obtaining a training set with enough outliers is costly, so existing approaches to outlier detection seldom work in a supervised manner. However, given a training set with sufficient outliers, supervised outlier detection performs better than other methods. A traditional training set requires every sample to be labeled, but we can label only the outliers and directly mark the other, unlabeled samples as inliers to construct a training set. In most cases, the number of samples we can label is limited, while a large number of unlabeled samples can be obtained easily. Semi-supervised learning methods have a natural advantage in using the information in a few labeled samples and many unlabeled samples to predict unlabeled instances. Based on this idea, we propose an algorithm, CRLC, that constructs a training set by incorporating semi-supervised outlier detection. Our experiments show that the algorithm achieves better performance than other methods at the same cost.
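The training-set construction idea in the abstract can be illustrated with a minimal sketch. This is not the CRLC algorithm, whose details the abstract does not give. The function name `build_training_set`, the `extra` parameter, and the distance-based scoring rule are all hypothetical stand-ins for a semi-supervised detection step. The sketch assumes at least one outlier has been labeled by hand and that the points are numeric tuples of equal length.

```python
import math


def build_training_set(points, labeled_outlier_idx, extra=0):
    """Return (point, label) pairs; label 1 = outlier, 0 = inlier.

    points: list of equal-length coordinate tuples.
    labeled_outlier_idx: indices an annotator marked as outliers.
    extra: number of unlabeled points to pseudo-label as outliers
           (hypothetical stand-in for a semi-supervised step).
    """
    labeled = set(labeled_outlier_idx)
    if not labeled:
        raise ValueError("at least one labeled outlier is required")

    unlabeled = [i for i in range(len(points)) if i not in labeled]
    dim = len(points[0])
    # Centroid of the unlabeled points, used as a rough inlier centre.
    centre = tuple(
        sum(points[i][d] for i in unlabeled) / len(unlabeled)
        for d in range(dim)
    )

    def score(i):
        # Higher score: nearer to a known outlier than to the inlier centre.
        d_out = min(math.dist(points[i], points[j]) for j in labeled)
        d_in = math.dist(points[i], centre)
        return d_in - d_out

    # Pseudo-label the `extra` highest-scoring unlabeled points as outliers.
    ranked = sorted(unlabeled, key=score, reverse=True)
    pseudo = set(ranked[:extra])

    # Everything not labeled or pseudo-labeled is marked as an inlier.
    return [
        (p, 1 if (i in labeled or i in pseudo) else 0)
        for i, p in enumerate(points)
    ]


points = [(0, 0), (0.1, 0.2), (-0.2, 0.1), (0.2, -0.1), (5, 5), (5.2, 4.9)]
training_set = build_training_set(points, labeled_outlier_idx=[4], extra=1)
labels = [label for _, label in training_set]
```

With `extra=0` the sketch reduces to the plain scheme described in the abstract: only the hand-labeled outliers get label 1 and every other sample is marked as an inlier. The resulting pairs could then be passed to any supervised classifier.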
Keywords :
data mining; learning (artificial intelligence); set theory; outlier detection; semisupervised based training set construction; semisupervised learning methods; supervised manner; supervised outlier detection; unlabeled instances; Classification algorithms; Clustering algorithms; Detection algorithms; Educational institutions; Labeling; Testing; Training; Outlier Detection; Pattern Recognition; Semi-Supervised Learning
Conference_Title :
Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on
Conference_Location :
Fuzhou
Print_ISBN :
978-1-4799-2829-3
DOI :
10.1109/CLOUDCOM-ASIA.2013.96