DocumentCode
3739771
Title
A Feature Selection Algorithm of Dynamic Data-Stream Based on Hoeffding Inequality
Author
Chunyong Yin;Lu Feng;Luyu Ma;Jin Wang;Zhichao Yin;Jeong-Uk Kim
Author_Institution
Nanjing No.1 Middle Sch., Nanjing, China
fYear
2015
Firstpage
92
Lastpage
95
Abstract
With the rapid development of the Internet, data mining is being applied to Internet data more and more widely. However, the features of complex data sources make the data mining process very inefficient. Feature selection research is therefore essential to making data mining simpler and more efficient. In this paper, a new metric based on mutual information is proposed to measure the degree of correlation among the features within a feature set. In addition, the Hoeffding inequality is introduced to construct the HSF algorithm. HSF is compared with BIF (a feature selection algorithm based on mutual information), with the C4.5 classification algorithm used as the testing algorithm in the experiments. The experiments show that HSF performs better than BIF [1] in classification accuracy and error rate.
Keywords
"Mutual information","Correlation","Data mining","Machine learning algorithms","Data models","Heuristic algorithms","Filtering theory"
Publisher
ieee
Conference_Titel
Advanced Information Technology and Sensor Application (AITS), 2015 4th International Conference on
Print_ISBN
978-1-4673-7572-6
Type
conf
DOI
10.1109/AITS.2015.32
Filename
7396454