DocumentCode :
3231185
Title :
Intelligent MapReduce Based Framework for Labeling Instances in Evolving Data Stream
Author :
Haque, Ashraful ; Parker, Brendon ; Khan, Latifur ; Thuraisingham, Bhavani
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Dallas, Richardson, TX, USA
Volume :
2
fYear :
2013
fDate :
2-5 Dec. 2013
Firstpage :
299
Lastpage :
304
Abstract :
In our current work, we proposed a multi-tiered, ensemble-based robust method to address all of the challenges of labeling instances in an evolving data stream. The bottleneck of that work is that it must build ADABOOST ensembles for each of the numeric features. This can cause scalability issues, since the number of features in a data stream can be very large at times. In this paper, we propose an intelligent approach that builds this large number of ADABOOST ensembles using MapReduce-based parallelism. We show that this approach helps our base method achieve significant scalability without compromising classification accuracy. We analyze different aspects of our design to show the advantages and disadvantages of the approach. We also compare and analyze the performance of the proposed approach in terms of execution time, speedup, and scaleup.
Keywords :
data mining; learning (artificial intelligence); pattern classification; ADABOOST ensembles; MapReduce based parallelism; evolving data stream; instances labeling; intelligent MapReduce based framework; multitiered ensemble based robust method; Accuracy; Data mining; Indexes; Labeling; Measurement; Scalability; Training; Data Mining; Distributed Processing; Scalability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
Conference_Location :
Bristol
Type :
conf
DOI :
10.1109/CloudCom.2013.152
Filename :
6735440