DocumentCode
1806028
Title
An Empirical Comparison of Four Text Mining Methods
Author
Lee, Sangno ; Baker, Jeff ; Song, Jaeki ; Wetherbe, James C.
Author_Institution
Texas Tech Univ., Lubbock, TX, USA
fYear
2010
fDate
5-8 Jan. 2010
Firstpage
1
Lastpage
10
Abstract
The amount of textual data that is available for researchers and businesses to analyze is increasing at a dramatic rate. This reality has led IS researchers to investigate various text mining techniques. This essay examines four frequently used text mining methods in order to identify their advantages and limitations. The four methods that we examine are (1) latent semantic analysis, (2) probabilistic latent semantic analysis, (3) latent Dirichlet allocation, and (4) the correlated topic model. We compare these four methods and highlight the optimal conditions under which to apply each of them. Our paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply these methods.
Keywords
data mining; text analysis; IS researchers; correlated topic model; empirical comparison; latent Dirichlet allocation; latent semantic analysis; probabilistic latent semantic analysis; text mining methods; textual data; Advertising; Customer relationship management; Data analysis; Databases; Information systems; Management information systems; Natural languages; Object detection; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
System Sciences (HICSS), 2010 43rd Hawaii International Conference on
Conference_Location
Honolulu, HI
ISSN
1530-1605
Print_ISBN
978-1-4244-5509-6
Electronic_ISBN
1530-1605
Type
conf
DOI
10.1109/HICSS.2010.48
Filename
5428645