Title : 
A Parallel Implementation of Idea Graph to Extract Rare Chances from Big Data
         
        
            Author : 
Qinyong Wang ; Hao Wang ; Chen Zhang ; Wei Wang ; Zhe Chen ; Fanjiang Xu
         
        
            Author_Institution : 
Sci. & Technol. on Integrated Inf. Syst. Lab., Inst. of Software, Beijing, China
         
        
        
        
        
        
            Abstract : 
Data today tend to be far larger than before, and distributed computing systems are a prevalent option for handling them. As a powerful tool, the MapReduce framework provides a cheap and efficient way to write parallel programs that run on distributed computing systems. Chance discovery (CD) is an extension of data mining, in which a chance refers to a rare but important event or situation. Idea Graph is an efficient algorithm proposed to detect chances. However, the traditional implementation of Idea Graph is sequential, and its performance hits bottlenecks when dealing with big data. In this paper, we propose a parallel implementation of Idea Graph using MapReduce to better meet the challenge of big data. First, we introduce the MapReduce framework and then briefly introduce Idea Graph. After that, we present the details of how we design the parallel Idea Graph implementation. At the end of the paper, several experiments are conducted to evaluate the proposed implementation. The experimental results demonstrate the validity of the proposed implementation and its better performance compared with the sequential Idea Graph implementation when handling big data.
         
        
            Keywords : 
Big Data; data handling; data mining; parallel processing; CD; IdeaGraph parallel implementation; MapReduce; chance discovery; rare chance extraction; sequential IdeaGraph; Algorithm design and analysis; Clustering algorithms; Distributed databases; Nickel; Distributed Computing
         
        
        
        
            Conference_Title : 
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
         
        
            Conference_Location : 
Shenzhen
         
        
            Print_ISBN : 
978-1-4799-4275-6
         
        
        
            DOI : 
10.1109/ICDMW.2014.91