DocumentCode :
3681506
Title :
Realtime file processing based on Map-Reduce framework
Author :
George Cabău;Andrea Timea Sălăgean;Gheorghe Sebestyen-Pal
Author_Institution :
Bitdefender, Technical University of Cluj-Napoca, Romania
fYear :
2015
Firstpage :
537
Lastpage :
543
Abstract :
Every day we find hundreds of thousands of new potentially malicious samples, many of which are in fact clean files. Deciding which files are infected and which are clean requires intensive processing, and handling such volumes of files and extracted metadata demands a distributed system. Based on MapReduce, a concept proposed by Google and used by many other companies such as Yahoo! and Facebook, we developed a file processing system to meet our sample-processing needs. The system can run a series of different tasks on machines with different hardware configurations and automatically adjusts the tasks to each machine. We use a cascade of map and reduce tasks to extract and process the data, with an in-memory (RAM) key-value database as the data link between them. To prioritize some tasks over others, we created an algorithm that favors higher-priority tasks over lower-priority ones when the system runs at full capacity, while trying to balance the cost of moving the same data over the network multiple times. Reliability and horizontal scalability were also taken into consideration when designing the system: one or more hardware failures do not affect the system, and adding more machines has a linear impact.
Keywords :
"Program processors","Databases","Data mining","Hardware","Random access memory","Reliability","Programming"
Publisher :
IEEE
Conference_Title :
Intelligent Computer Communication and Processing (ICCP), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICCP.2015.7312716
Filename :
7312716