Title :
Scalable Visual Analytics of Massive Textual Datasets
Author :
Krishnan, M. ; Bohn, S. ; Cowley, W. ; Crow, V. ; Nieplocha, J.
Author_Institution :
Pacific Northwest Nat. Lab., Richland, WA
Abstract :
This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools help information analysts interact with and understand large volumes of textual information through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte datasets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.
Keywords :
data visualisation; graphical user interfaces; parallel algorithms; text analysis; Pubmed; cluster architecture; parallel text processing algorithm; text processing engine; textual information content; visual analytics tool; visual interface; Data analysis; Data visualization; Engines; Information analysis; Information retrieval; Laboratories; Pattern analysis; Scalability; Text processing; Visual analytics;
Conference_Title :
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
Conference_Location :
Long Beach, CA
Print_ISBN :
1-4244-0910-1
Electronic_ISBN :
1-4244-0910-1
DOI :
10.1109/IPDPS.2007.370232