DocumentCode :
3473958
Title :
Mining Documents in a Small Enterprise Using WordStat
Author :
Udoh, E. ; Rhoades, J.
Author_Institution :
Dept. of Comput. Sci., Indiana Univ., Fort Wayne, IN
fYear :
2006
fDate :
10-12 April 2006
Firstpage :
490
Lastpage :
494
Abstract :
Text mining is growing as an essential method of knowledge discovery from general and business documents. Although documents such as press releases, emails, memos, contracts, government reports and news feeds are considered unstructured, they can be tapped for information using text analysis techniques such as feature extraction, thematic indexing, clustering and summarization. For this project, 30 representative documents from a small enterprise were collected to determine the dominant features in its activities. Document profiles were generated by extracting the frequencies of certain terms, then clustering and filtering them on the basis of both repetitive occurrence and co-occurrence. Analysis of these profiles yielded a coherent picture of the functional relationships among large and heterogeneous lists of terms. This approach affords investigators an extractive interface to complex text data. This paper shows how these documents were mined using the text-based WordStat software, and describes the potential, features and options of the program.
Keywords :
business data processing; data mining; small-to-medium enterprises; text analysis; WordStat; document mining; knowledge discovery; small enterprise; Contracts; Feature extraction; Feeds; Filtering; Frequency; Government; Indexing; Text mining; Clustering; Co-occurrence; Documents
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology: New Generations, 2006. ITNG 2006. Third International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
0-7695-2497-4
Type :
conf
DOI :
10.1109/ITNG.2006.91
Filename :
1611640