DocumentCode :
32870
Title :
Modeling and Optimization for Big Data Analytics: (Statistical) learning tools for our era of data deluge
Author :
Slavakis, Konstantinos ; Giannakis, Georgios ; Mateos, Gonzalo
Author_Institution :
Digital Technol. Center (DTC), Univ. of Minnesota, Minneapolis, MN, USA
Volume :
31
Issue :
5
fYear :
2014
fDate :
Sept. 2014
Firstpage :
18
Lastpage :
31
Abstract :
With pervasive sensors continuously collecting and storing massive amounts of information, there is no doubt this is an era of data deluge. Learning from these large volumes of data is expected to bring significant science and engineering advances along with improvements in quality of life. However, with such a big blessing come big challenges. Running analytics on voluminous data sets by central processors and storage units seems infeasible, and with the advent of streaming data sources, learning must often be performed in real time, typically without a chance to revisit past entries. Workhorse signal processing (SP) and statistical learning tools have to be re-examined in today's high-dimensional data regimes. This article contributes to the ongoing cross-disciplinary efforts in data science by putting forth encompassing models capturing a wide range of SP-relevant data analytic tasks, such as principal component analysis (PCA), dictionary learning (DL), compressive sampling (CS), and subspace clustering. It offers scalable architectures and optimization algorithms for decentralized and online learning problems, while revealing fundamental insights into the various analytic and implementation tradeoffs involved. Extensions of the encompassing models to timely data-sketching, tensor- and kernel-based learning tasks are also provided. Finally, the close connections of the presented framework with several big data tasks, such as network visualization, decentralized and dynamic estimation, prediction, and imputation of network link load traffic, as well as imputation in tensor-based medical imaging, are highlighted.
Keywords :
Big Data; compressed sensing; data analysis; learning (artificial intelligence); optimisation; principal component analysis; signal sampling; Big Data analytics; CS; DL; PCA; SP-relevant data analytic tasks; big data tasks; central processors; compressive sampling; data deluge era; data science; data-sketching; decentralized estimation; decentralized learning problems; dictionary learning; dynamic estimation; kernel-based learning tasks; large data volumes; modeling; network link load traffic; network visualization; online learning problems; optimization algorithms; pervasive sensors; principal component analysis; quality of life; scalable architectures; signal processing; statistical learning tools; storage units; subspace clustering; tensor-based learning tasks; tensor-based medical imaging; voluminous data sets; Big data; Data models; Data storage; Information technology; Signal processing algorithms; Sparse matrices; Statistical analysis; Storage automation;
fLanguage :
English
Journal_Title :
Signal Processing Magazine, IEEE
Publisher :
IEEE
ISSN :
1053-5888
Type :
jour
DOI :
10.1109/MSP.2014.2327238
Filename :
6879577