DocumentCode :
2357684
Title :
A software infrastructure for research in textual data mining
Author :
Holzman, Lars E. ; Fisher, Todd A. ; Galitsky, Leon M. ; Kontostathis, April ; Pottenger, William M.
Author_Institution :
Dept. of Comput. Sci. & Eng., Lehigh Univ., USA
fYear :
2003
fDate :
3-5 Nov. 2003
Firstpage :
112
Lastpage :
121
Abstract :
Few tools exist that address the challenges facing researchers in the textual data mining (TDM) field. Some are too specific to their application, or are prototypes not suitable for general use. More general tools often are not capable of processing large volumes of data. We have created a textual data mining infrastructure (TMI) that incorporates both existing and new capabilities in a reusable framework conducive to developing new tools and components. TMI adheres to strict guidelines that allow it to run in a wide range of processing environments; as a result, it accommodates the volume of computing and diversity of research occurring in TDM. A unique capability of TMI is support for optimization. This facilitates text mining research by automating the search for optimal parameters in text mining algorithms. In this article we describe a number of applications that use the TMI. We present several novel results that have not been published elsewhere. We also discuss how the TMI utilizes existing machine-learning libraries, thereby enabling researchers to continue and extend their endeavors with minimal effort. Towards that end, TMI is available on the web at hddi.cse.lehigh.edu.
Keywords :
data mining; learning (artificial intelligence); software architecture; text analysis; World Wide Web; machine learning library; optimal parameter; optimization; research diversity; search automation; software infrastructure; textual data mining infrastructure; Application software; Computer science; Data engineering; Data mining; Design engineering; Libraries; Machine learning algorithms; Prototypes; Text mining; Time division multiplexing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2003. Proceedings. 15th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2038-3
Type :
conf
DOI :
10.1109/TAI.2003.1250178
Filename :
1250178