DocumentCode :
1806028
Title :
An Empirical Comparison of Four Text Mining Methods
Author :
Lee, Sangno ; Baker, Jeff ; Song, Jaeki ; Wetherbe, James C.
Author_Institution :
Texas Tech Univ., Lubbock, TX, USA
fYear :
2010
fDate :
5-8 Jan. 2010
Firstpage :
1
Lastpage :
10
Abstract :
The amount of textual data that is available for researchers and businesses to analyze is increasing at a dramatic rate. This reality has led IS researchers to investigate various text mining techniques. This essay examines four text mining methods that are frequently used in order to identify their advantages and limitations. The four methods that we examine are (1) latent semantic analysis, (2) probabilistic latent semantic analysis, (3) latent Dirichlet allocation, and (4) the correlated topic model. We compare these four methods and highlight the optimal conditions under which to apply the various methods. Our paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply these methods.
Keywords :
data mining; text analysis; IS researchers; correlated topic model; empirical comparison; latent Dirichlet allocation; latent semantic analysis; probabilistic latent semantic analysis; text mining methods; textual data; Advertising; Customer relationship management; Data analysis; Databases; Information systems; Management information systems; Natural languages; Object detection; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Sciences (HICSS), 2010 43rd Hawaii International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1530-1605
Print_ISBN :
978-1-4244-5509-6
Electronic_ISBN :
1530-1605
Type :
conf
DOI :
10.1109/HICSS.2010.48
Filename :
5428645