Title :
Open research challenges with Big Data - A data-scientist's perspective
Author :
Sreenivas R. Sukumar
Author_Institution :
Computational Sciences and Engineering Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN, 37831, USA
Abstract :
In this paper, we discuss data-driven discovery challenges of the Big Data era. We observe that recent advances in the ability to collect, access, organize, integrate, and query massive amounts of data from a wide variety of sources have placed statistical data mining and machine learning under more scrutiny than ever before as means of gleaning insights from data. In that context, we pose and debate the question: are data mining algorithms scaling with the ability to store and compute? If yes, how? If not, why not? We survey recent developments in the state of the art to discuss emerging and outstanding challenges in the design and implementation of machine learning algorithms at scale. We draw on experience from real-world Big Data knowledge discovery projects across the domains of national security, healthcare, and manufacturing to suggest that efforts be focused along the following axes: (i) the 'data science' challenge - designing scalable and flexible computational architectures for machine learning (beyond just data retrieval); (ii) the 'science of data' challenge - the ability to understand characteristics of data before applying machine learning algorithms and tools; and (iii) the 'scalable predictive functions' challenge - the ability to construct, learn, and infer with increasing sample size, dimensionality, and categories of labels. We conclude with a discussion of opportunities and directions for future research.
Keywords :
"Big data","Algorithm design and analysis","Mathematical model","Machine learning algorithms","Medical services","Manufacturing","Prediction algorithms"
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
DOI :
10.1109/BigData.2015.7363882