• DocumentCode
    54136
  • Title

    UTOPIAN: User-Driven Topic Modeling Based on Interactive Nonnegative Matrix Factorization

  • Author

    Jaegul Choo ; Changhyun Lee ; Reddy, C.K. ; Park, Haesun

  • Author_Institution
    Georgia Inst. of Technol., Atlanta, GA, USA
  • Volume
    19
  • Issue
    12
  • fYear
    2013
  • fDate
    Dec. 2013
  • Firstpage
    1992
  • Lastpage
    2001
  • Abstract
    Topic modeling has been widely used for analyzing text document collections. Recently, there have been significant advancements in various topic modeling techniques, particularly in the form of probabilistic graphical modeling. State-of-the-art techniques such as Latent Dirichlet Allocation (LDA) have been successfully applied in visual text analytics. However, most of the widely used methods based on probabilistic modeling have drawbacks in terms of consistency across multiple runs and empirical convergence. Furthermore, due to the complexity of its formulation and algorithm, LDA cannot easily incorporate various types of user feedback. To tackle these problems, we propose a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization). Centered around its semi-supervised formulation, UTOPIAN enables users to interact with the topic modeling method and steer the result in a user-driven manner. We demonstrate the capability of UTOPIAN via several usage scenarios with real-world document corpora such as the InfoVis/VAST paper data set and product review data sets.
  • Keywords
    data analysis; data visualisation; interactive systems; matrix decomposition; text analysis; UTOPIAN; flexible visual analytics system; latent Dirichlet allocation; probabilistic graphical modeling; real-world document corpuses; reliable visual analytics system; semisupervised formulation; text document collection analysis; topic modeling method; topic modeling techniques; user feedback; user-driven manner; user-driven topic modeling based on interactive nonnegative matrix factorization; visual text analytics; Analytical models; Computational modeling; Context modeling; Interactive states; Latent Dirichlet allocation; Visual analytics; interactive clustering; nonnegative matrix factorization; text analytics; topic modeling; visual analytics; Artificial Intelligence; Computer Graphics; Computer Simulation; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Models, Statistical; Natural Language Processing; Pattern Recognition, Automated; Software
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type

    jour

  • DOI
    10.1109/TVCG.2013.212
  • Filename
    6634167