• DocumentCode
    21025
  • Title
    Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool for Investigative Journalists
  • Author
    Brehmer, Matthew; Ingram, Stephen; Stray, Jonathan; Munzner, Tamara
  • Author_Institution
    Univ. of British Columbia, Vancouver, BC, Canada
  • Volume
    20
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 31 2014
  • Firstpage
    2271
  • Lastpage
    2280
  • Abstract
    For an investigative journalist, a large collection of documents obtained from a Freedom of Information Act request or a leak is both a blessing and a curse: such material may contain multiple newsworthy stories, but it can be difficult and time-consuming to find relevant documents. Standard text search is useful, but even if the search target is known, it may not be possible to formulate an effective query. In addition, summarization is an important non-search task. We present Overview, an application for the systematic analysis of large document collections based on document clustering, visualization, and tagging. This work contributes to the small set of design studies that evaluate a visualization system “in the wild”, and we report on six case studies where Overview was voluntarily used by self-initiated journalists to produce published stories. We find that the frequently used language of “exploring” a document collection is both too vague and too narrow to capture how journalists actually used our application. Our iterative process, including multiple rounds of deployment and observations of real-world usage, led to a much more specific characterization of tasks. We analyze and justify the visual encoding and interaction techniques used in Overview's design with respect to our final task abstractions, and propose generalizable lessons for visualization design methodology.
  • Keywords
    data mining; data visualisation; graphical user interfaces; pattern clustering; text analysis; Freedom of Information Act request; Overview; data summarization; document clustering; document collection analysis; document tagging; document visualization; frequently-used language; in-the-wild visualization system; interaction techniques; investigative journalists; iterative process; multiple newsworthy stories; published story production; query processing; self-initiated journalists; standard text search; task abstractions; visual document mining tool adoption; visual document mining tool analysis; visual document mining tool design; visual encoding; visualization design methodology; Data visualization; Document handling; Encoding; Text analysis; Text mining; Design study; investigative journalism; task and requirements analysis; text analysis; text and document data
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour
  • DOI
    10.1109/TVCG.2014.2346431
  • Filename
    6875900