DocumentCode
659561
Title
VisReduce: Fast and responsive incremental information visualization of large datasets
Author
Im, Jean-Francois ; Villegas, Felix Giguere ; McGuffin, Michael J.
Author_Institution
Ecole de Technol. Super., Montréal, QC, Canada
fYear
2013
fDate
6-9 Oct. 2013
Firstpage
25
Lastpage
32
Abstract
Performance and responsiveness of visual analytics systems for exploratory data analysis of large datasets have been a long-standing problem. We propose a method for incrementally computing visualizations in a distributed fashion by combining a modified MapReduce-style algorithm with a compressed columnar data store, resulting in significant improvements in performance and responsiveness for constructing commonly encountered information visualizations, e.g. bar charts, scatterplots, heat maps, cartograms and parallel coordinate plots. We compare our method with one that queries three other readily available database and data warehouse systems - PostgreSQL, Cloudera Impala and the MapReduce-based Apache Hive - in order to build visualizations. We show that our end-to-end approach allows for greater speed and guaranteed end-user responsiveness, even in the face of large, long-running queries.
Keywords
SQL; data analysis; data visualisation; data warehouses; query processing; Cloudera Impala; MapReduce-based Apache Hive; MapReduce-style algorithm; PostgreSQL; VisReduce; compressed columnar data store; data warehouse systems; database querying; end-to-end approach; exploratory data analysis; large datasets; responsive incremental information visualization; visual analytics system; Acceleration; Aggregates; Arrays; Data visualization; Databases; Java; Visualization; MapReduce; columnar storage; incremental visualization; information visualization; online aggregation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Big Data, 2013 IEEE International Conference on
Conference_Location
Silicon Valley, CA
Type
conf
DOI
10.1109/BigData.2013.6691710
Filename
6691710