Title :
Topic models towards high performance data mining and analysis
Author :
Farrahi, Katayoun ; Ferscha, Alois
Author_Institution :
Dept. of Pervasive Comput., Johannes Kepler Univ., Linz, Austria
Abstract :
As vast amounts of data recording our transactions, conversations, connections, movements, behavior, personality, emotions, and opinions are continuously stored, data has been termed “the new oil”. The process of “refinement” and knowledge extraction from data is the core of data mining. Advances in automated algorithms and models for extracting knowledge about human behavior will ultimately measure the value of data. This work discusses the use of probabilistic latent topic models, particularly Latent Dirichlet Allocation (LDA) [2], for data mining and explores their application to various sorts of large-scale data, focusing on the advantages and disadvantages of their use. While topic models have been shown to be a promising new tool for data mining, one open issue is the development of methods for implementing them on high performance computing platforms.
Keywords :
data analysis; data mining; parallel processing; probability; statistical analysis; LDA; data value measurement; high performance computing platforms; high performance data analysis; high performance data mining; knowledge extraction; latent Dirichlet allocation; probabilistic latent topic models; refinement process; Approximation algorithms; Computational modeling; Data mining; Data models; Educational institutions; Inference algorithms; Pervasive computing
Conference_Titel :
High Performance Computing and Simulation (HPCS), 2013 International Conference on
Conference_Location :
Helsinki
Print_ISBN :
978-1-4799-0836-3
DOI :
10.1109/HPCSim.2013.6641497