Title :
Text Classification by Combining Grouping, LSA and kNN
Author :
Ishii, Naohiro ; Murai, Tsuyoshi ; Yamada, Takahiro ; Bao, Yongguang
Author_Institution :
Aichi Inst. of Technol.
Abstract :
A method for grouping similar words is proposed for document classification. Applied to Reuters international news, word grouping is shown to achieve classification accuracy equivalent to that of latent semantic analysis (LSA). Further, a new combined method is proposed for document classification, consisting of grouping and latent semantic analysis followed by k-nearest neighbor (kNN) classification. The proposed combined method shows higher classification accuracy than the conventional methods of kNN alone and LSA followed by kNN.
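The three stages named in the abstract (grouping, LSA, kNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how word groups are formed, so the `WORD_GROUPS` dictionary, the toy documents, the rank of 2, the cosine similarity measure, and all function names below are assumptions made purely for illustration. LSA is approximated here by a truncated SVD of a raw term-frequency matrix computed with NumPy.

```python
from collections import Counter

import numpy as np

# Hypothetical word groups (assumption: the abstract does not describe
# how similar words are identified, so this mapping is hand-made).
WORD_GROUPS = {
    "stock": "finance", "shares": "finance", "market": "finance", "bank": "finance",
    "wheat": "grain", "corn": "grain", "crop": "grain", "harvest": "grain",
}


def group_tokens(text):
    """Replace each word by its group label; ungrouped words are kept as-is."""
    return [WORD_GROUPS.get(word, word) for word in text.lower().split()]


def build_matrix(docs, vocab=None):
    """Build a term-frequency matrix (documents x terms) over grouped tokens."""
    counts = [Counter(group_tokens(doc)) for doc in docs]
    if vocab is None:
        vocab = sorted({term for c in counts for term in c})
    matrix = np.array([[c.get(term, 0) for term in vocab] for c in counts], dtype=float)
    return matrix, vocab


def fit_lsa(matrix, rank):
    """Truncated SVD: return a (terms x rank) projection onto the top singular directions."""
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)
    return vt[:rank].T


def knn_predict(train_z, train_labels, query_z, k):
    """Majority vote among the k training documents with the highest cosine similarity."""
    def unit(rows):
        norms = np.linalg.norm(rows, axis=1, keepdims=True)
        return rows / np.where(norms == 0, 1.0, norms)

    sims = unit(query_z) @ unit(train_z).T
    predictions = []
    for row in sims:
        nearest = np.argsort(-row)[:k]
        votes = Counter(train_labels[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy data (invented for this sketch, not taken from the Reuters collection).
train_docs = [
    "stock market rises",
    "bank shares fall",
    "wheat harvest grows",
    "corn crop fails",
]
train_labels = ["finance", "finance", "grain", "grain"]

X_train, vocab = build_matrix(train_docs)        # step 1: grouping + term counts
projection = fit_lsa(X_train, rank=2)            # step 2: LSA via truncated SVD
Z_train = X_train @ projection

X_query, _ = build_matrix(["shares market news", "crop wheat report"], vocab)
predictions = knn_predict(Z_train, train_labels, X_query @ projection, k=1)  # step 3: kNN
print(predictions)
```

In this sketch, grouping shrinks the vocabulary before the SVD is computed, so the query word "shares" matches training documents that only contain "stock" or "bank". Query words outside the training vocabulary (here "news" and "report") are simply dropped.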
Keywords :
pattern classification; text analysis; Reuters international news; document classification; k-nearest neighbor classification; latent semantic analysis; similar word grouping; text classification; Computational complexity; Computational efficiency; Conferences; Dictionaries; Frequency; Information science; Noise reduction; Pattern analysis; Pattern recognition; Text categorization
Conference_Title :
Computer and Information Science, 2006 and 2006 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, Software Architecture and Reuse. ICIS-COMSAR 2006. 5th IEEE/ACIS International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7695-2613-6
DOI :
10.1109/ICIS-COMSAR.2006.81