DocumentCode
2725046
Title
A Dynamic Graph Model for Analyzing Streaming News Documents
Author
Hohman, Elizabeth Leeds ; Marchette, David J.
Author_Institution
Naval Surface Warfare Center, Dahlgren, VA
fYear
2007
fDate
March 1 2007-April 5 2007
Firstpage
462
Lastpage
469
Abstract
In this paper, we consider the problem of analyzing streaming documents, in particular streaming news stories. The system is designed to extract statistics from each document, incorporate these into a graph-based model, and discard the document to reduce storage requirements. The model is defined in terms of a changing lexicon and sub-lexicons at each node in the graph, with the nodes of the graph representing topics. An approximation to the TFIDF term weighting is introduced. We illustrate the methodology on a dataset of news articles and discuss the dynamic nature of the model.
Keywords
document handling; graph theory; TFIDF term weighting; analyzing streaming news documents; dynamic graph model; lexicon; Computational intelligence; Data mining; Electronic mail; Feeds; Frequency; Keyboards; Statistics; Text categorization; Text processing; Traffic control;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location
Honolulu, HI
Print_ISBN
1-4244-0705-2
Type
conf
DOI
10.1109/CIDM.2007.368911
Filename
4221335