Title :
Design of self-adjusting algorithm for data-intensive MapReduce applications
Author :
Amin Nazir Nagiwale;Manish R. Umale
Author_Institution :
Computer Engineering, Lokamanya Tilak College of Engineering, Navi Mumbai, India
Abstract :
The MapReduce framework is well suited to data-intensive, large-scale processing, but applications in this class, such as machine learning, graph, and sentiment analysis algorithms, must cope with skewed and diverse data in order to adapt to changes in real time. For example, it is difficult to adapt to real-time changes in the training data or corpus for big data applications such as sentiment analysis, email spam detection, and log file analysis. To address this, we propose an algorithm based on concepts from functional programming and self-adjusting computation that supports efficiently accepting changes to the system, ranging from making the training set or language corpus domain-specific and amortized analysis of the algorithm to changes in storage, network, and architecture design for distributed systems. For experimental purposes, we have implemented Selfie, a self-adjusting algorithm with a splay tree, for Twitter sentiment analysis, which makes the system responsive to skewness in access patterns and diversity in trends. The proposed algorithm can be helpful for other iterative and interactive applications that face machine learning challenges, such as feature generation and selection, over-fitting, and explaining and improving models, in dealing effectively with large dynamic data sets.
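To illustrate why a splay tree suits skewed access patterns, here is a minimal Python sketch of a generic splay tree used as a term-to-score lexicon. It is not the authors' Selfie implementation: the class and function names, the example words, and the scores are all hypothetical, and the MapReduce integration is left out.

```python
class Node:
    """One lexicon entry: a term (key) and its sentiment score (value)."""
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


def _rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def _rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def _splay(root, key):
    """Move `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:            # zig-zig
            root.left.left = _splay(root.left.left, key)
            root = _rotate_right(root)
        elif key > root.left.key:          # zig-zag
            root.left.right = _splay(root.left.right, key)
            if root.left.right is not None:
                root.left = _rotate_left(root.left)
        return root if root.left is None else _rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:               # zag-zag
        root.right.right = _splay(root.right.right, key)
        root = _rotate_left(root)
    elif key < root.right.key:             # zag-zig
        root.right.left = _splay(root.right.left, key)
        if root.right.left is not None:
            root.right = _rotate_right(root.right)
    return root if root.right is None else _rotate_left(root)


class SplayTree:
    """Hypothetical lexicon: recently accessed terms migrate to the root."""
    def __init__(self):
        self.root = None

    def get(self, key, default=None):
        self.root = _splay(self.root, key)
        if self.root is not None and self.root.key == key:
            return self.root.value
        return default

    def put(self, key, value):
        if self.root is None:
            self.root = Node(key, value)
            return
        self.root = _splay(self.root, key)
        if self.root.key == key:           # update an existing term
            self.root.value = value
            return
        node = Node(key, value)            # split the old root around the new key
        if key < self.root.key:
            node.right, node.left = self.root, self.root.left
            self.root.left = None
        else:
            node.left, node.right = self.root, self.root.right
            self.root.right = None
        self.root = node


# Illustrative data only; these words and scores are made up.
lexicon = SplayTree()
for word, score in [("bad", -1), ("great", 1), ("awful", -2), ("nice", 1)]:
    lexicon.put(word, score)
lexicon.get("great")                       # a trending term is looked up
print(lexicon.root.key)                    # the accessed term is now at the root
```

Because every lookup splays the accessed key to the root, terms that dominate a skewed stream stay near the top of the tree, and the cost of each operation is O(log n) amortized.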
Keywords :
"Algorithm design and analysis","Vegetation","Sentiment analysis","Market research","Twitter","Machine learning algorithms","Classification algorithms"
Conference_Title :
Energy Systems and Applications, 2015 International Conference on
DOI :
10.1109/ICESA.2015.7503401