Title :
An Empirical Comparison of Four Text Mining Methods
Author :
Lee, Sangno ; Baker, Jeff ; Song, Jaeki ; Wetherbe, James C.
Author_Institution :
Texas Tech Univ., Lubbock, TX, USA
Abstract :
The amount of textual data available for researchers and businesses to analyze is increasing at a dramatic rate. This reality has led IS researchers to investigate various text mining techniques. This essay examines four frequently used text mining methods in order to identify their advantages and limitations. The four methods are (1) latent semantic analysis, (2) probabilistic latent semantic analysis, (3) latent Dirichlet allocation, and (4) the correlated topic model. We compare these methods and highlight the conditions under which each is best applied. Our paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply them.
Keywords :
data mining; text analysis; IS researchers; correlated topic model; empirical comparison; latent Dirichlet allocation; latent semantic analysis; probabilistic latent semantic analysis; text mining methods; textual data; Advertising; Customer relationship management; Data analysis; Databases; Information systems; Management information systems; Natural languages; Object detection; Text mining
Conference_Title :
System Sciences (HICSS), 2010 43rd Hawaii International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4244-5509-6
ISSN :
1530-1605
DOI :
10.1109/HICSS.2010.48