Title :
A novel application of Hoeffding´s inequality to decision trees construction for data streams
Author :
Duda, Piotr ; Jaworski, M. ; Pietruczuk, Lena ; Rutkowski, Leszek
Author_Institution :
Inst. of Comput. Intell., Czestochowa Univ. of Technol., Czestochowa, Poland
Abstract :
Decision trees are the commonly applied tools in the task of data stream classification. The most critical point in decision tree construction algorithm is the choice of the splitting attribute. In majority of algorithms existing in literature the splitting criterion is based on statistical bounds derived for split measure functions. In this paper we propose a totally new kind of splitting criterion. We derive statistical bounds for arguments of split measure function instead of deriving it for split measure function itself. This approach allows us to properly use the Hoeffding´s inequality to obtain the required bounds. Based on this theoretical results we propose the Decision Trees based on the Fractions Approximation algorithm (DTFA). The algorithm exhibits satisfactory results of classification accuracy in numerical experiments. It is also compared with other existing in literature methods, demonstrating noticeably better performance.
Keywords :
approximation theory; decision trees; pattern classification; probability; statistical analysis; DTFA; Hoeffding inequality; data stream classification; decision trees based on the fraction approximation algorithm; decision trees construction; split measure function; splitting attribute; splitting criterion; statistical bounds; Accuracy; Approximation algorithms; Decision trees; Entropy; Random variables; Standards;
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
DOI :
10.1109/IJCNN.2014.6889806