Title :
Sequential Document Visualization
Author :
Mao, Yi ; Dillon, Joshua V. ; Lebanon, Guy
Author_Institution :
Purdue Univ., West Lafayette
Abstract :
Documents and other categorical-valued time series are often characterized by the frequencies of short-range sequential patterns such as n-grams. This representation converts sequential data of varying lengths to high-dimensional histogram vectors, which are easily modeled by standard statistical models. Unfortunately, the histogram representation ignores most of the medium- and long-range sequential dependencies, making it unsuitable for visualizing sequential data. We present a novel framework for sequential visualization of discrete categorical time series based on the idea of local statistical modeling. The framework embeds categorical time series as smooth curves in the multinomial simplex, summarizing the progression of sequential trends. We discuss several visualization techniques based on this framework and demonstrate their usefulness for document visualization.
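Illustrative sketch (not taken from the paper): the contrast the abstract draws between a single global histogram and a locally modeled curve in the multinomial simplex can be shown in a few lines of Python. The abstract does not specify the local model, so the Gaussian kernel, the `bandwidth` and `smoothing` parameters, and the function names `global_histogram` and `local_histograms` are all assumptions made for this example.

```python
import math
from collections import Counter


def global_histogram(tokens, vocab):
    """Unigram frequencies over the whole sequence; token order is discarded."""
    counts = Counter(tokens)
    n = len(tokens)
    return [counts[w] / n for w in vocab]


def local_histograms(tokens, vocab, num_points=5, bandwidth=0.1, smoothing=0.01):
    """Kernel-weighted word frequencies at evenly spaced positions in [0, 1].

    Assumption: a Gaussian kernel and additive smoothing stand in for the
    "local statistical modeling" named in the abstract.
    """
    index = {w: i for i, w in enumerate(vocab)}
    n = len(tokens)
    # Map token j to a normalized position in (0, 1).
    positions = [(j + 0.5) / n for j in range(n)]
    curve = []
    for k in range(num_points):
        mu = k / (num_points - 1) if num_points > 1 else 0.5
        weights = [math.exp(-0.5 * ((x - mu) / bandwidth) ** 2) for x in positions]
        total = sum(weights)
        # Start from a small constant so every word keeps nonzero probability.
        point = [smoothing] * len(vocab)
        for tok, w in zip(tokens, weights):
            point[index[tok]] += w / total
        norm = sum(point)
        # Each curve point is a probability vector, i.e. a point in the simplex.
        curve.append([p / norm for p in point])
    return curve


if __name__ == "__main__":
    vocab = ["a", "b"]
    forward = ["a"] * 10 + ["b"] * 10
    backward = list(reversed(forward))
    # Both sequences share the same global histogram ...
    print(global_histogram(forward, vocab), global_histogram(backward, vocab))
    # ... but their local curves move through the simplex in opposite directions.
    for p, q in zip(local_histograms(forward, vocab), local_histograms(backward, vocab)):
        print([round(v, 3) for v in p], [round(v, 3) for v in q])
```

In this toy example the two sequences are indistinguishable by their global histograms, while the sequence of local estimates records where in the sequence each symbol dominates. A smaller `bandwidth` gives a more detailed but noisier curve; a very large one collapses the curve toward the global histogram.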
Keywords :
data visualisation; document handling; statistical analysis; time series; discrete categorical time series; histogram vectors; multinomial simplex summarization; sequential document visualization; statistical models; Amino acids; Data visualization; Frequency; Histograms; Multidimensional systems; Principal component analysis; Proteins; Statistics; Time series analysis; Document visualization; local fitting; multi-resolution analysis; Algorithms; Artificial Intelligence; Computer Graphics; Database Management Systems; Databases, Factual; Documentation; Information Storage and Retrieval; Pattern Recognition, Automated; User-Computer Interface
Journal_Title :
Visualization and Computer Graphics, IEEE Transactions on
DOI :
10.1109/TVCG.2007.70592