Title :
Big Data Provenance Analysis and Visualization
Author :
Chen, Peng; Plale, Beth A.
Abstract :
Provenance captured from E-Science experimentation is often large and complex; one example is provenance from agent-based simulations that have tens of thousands of heterogeneous components interacting over extended time periods. My dissertation studies the use of E-Science provenance at scale. My initial research examined the visualization of large provenance graphs and proposed an abstract representation of provenance that supports useful data mining. Recent work involves analyzing large provenance data generated from agent-based simulations on a single machine. As a continuation, I propose stream processing techniques to support the continuous, real-time analysis of data provenance captured from agent-based simulations on HPC, which has unprecedented volume and complexity.
Keywords :
Big Data; data mining; data visualisation; digital simulation; multi-agent systems; natural sciences computing; HPC; abstract provenance representation; agent-based simulations; big data provenance analysis; big data visualization; e-science experimentation; e-science provenance; large provenance data; provenance graphs; Analytical models; Calibration; Conferences; Data models; Data visualization; data provenance; mining; stream processing; visualization
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location :
Shenzhen
DOI :
10.1109/CCGrid.2015.85