DocumentCode :
441822
Title :
A boosting method to detect noisy data
Author :
Liu, Xiao-Dong ; Shi, Chun-yi ; Gu, Xue-Dao
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Shenzhen, China
Volume :
4
fYear :
2005
fDate :
18-21 Aug. 2005
Firstpage :
2015
Abstract :
Noisy data is inherent in data mining. If prior knowledge of such data were available, it would be simple to remove or account for the noise and improve model robustness. Unfortunately, in most learning situations, underlying noise is suspected but difficult to detect. Ensemble classification techniques such as bagging, boosting, and arcing have received much attention in the recent literature. Such techniques have been shown to reduce classification error on unseen cases, and this paper demonstrates that they may also be employed as noise detectors. A brief overview of ensemble methods is presented, and a boosting method based on instance weights and attribute-weight information gain is proposed for detecting noisy data. Experiments on the endowment insurance database of one city show this to be a successful approach.
Keywords :
data mining; noise; pattern classification; boosting method; endowment insurance database; ensemble classification techniques; information gain; noisy data detection; Bagging; Boosting; Computer aided manufacturing; Computer science; Detectors; Electronic mail; Machine learning; Noise robustness; Voting; Ensemble; Noisy data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
Type :
conf
DOI :
10.1109/ICMLC.2005.1527276
Filename :
1527276