A text clustering algorithm based on simplified cluster hypothesis

Author

Sun Yuan ; Guo Wenbin

Author_Institution

Sch. of Inf. Eng., Minzu Univ. of China, Beijing, China

fYear

2013

fDate

23-24 Dec. 2013

Firstpage

412

Lastpage

415

Abstract

Quickly and efficiently determining the subject category of documents in a large volume of text is becoming an important challenge in text clustering. This paper proposes the One-Next text clustering algorithm, which is based on the simplified cluster hypothesis. It also presents a feature vector optimization method that uses a grading feature vector extraction method. Experimental results show that the approach achieves high precision and F value, and that its algorithmic complexity is lower than that of other text clustering methods.

Keywords

computational complexity; feature extraction; optimisation; pattern clustering; text analysis; vectors; algorithm complexity; feature vector optimization method; grading feature vector extraction method; one-next text clustering algorithm; simplified cluster hypothesis; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Clustering methods; Time complexity; VSM; feature vector optimization; text clustering; text similarity

fLanguage

English

Publisher

IEEE

Conference_Title

2013 2nd International Symposium on Instrumentation and Measurement, Sensor Network and Automation (IMSNA)

Conference_Location

Toronto, ON

Type

conf

DOI

10.1109/IMSNA.2013.6743303

Filename

6743303

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3324415