DocumentCode
2232010
Title
Text Classification by Combining Grouping, LSA and kNN
Author
Ishii, Naohiro ; Murai, Tsuyoshi ; Yamada, Takahiro ; Bao, Yongguang
Author_Institution
Aichi Inst. of Technol.
fYear
2006
fDate
10-12 July 2006
Firstpage
148
Lastpage
154
Abstract
A method for grouping similar words is proposed for document classification. Applied to Reuters international news, the word grouping is shown to achieve classification accuracy equivalent to that of latent semantic analysis (LSA). Further, a new combined method is proposed for document classification, consisting of grouping and latent semantic analysis followed by k-nearest neighbor (kNN) classification. The proposed combined method shows higher classification accuracy than the conventional methods of kNN alone and of LSA followed by kNN.
Keywords
pattern classification; text analysis; Reuters international news; document classification; k-nearest neighbor classification; latent semantic analysis; similar word grouping; text classification; Computational complexity; Computational efficiency; Conferences; Dictionaries; Frequency; Information science; Noise reduction; Pattern analysis; Pattern recognition; Text categorization
fLanguage
English
Publisher
ieee
Conference_Titel
Computer and Information Science, 2006 and 2006 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, Software Architecture and Reuse. ICIS-COMSAR 2006. 5th IEEE/ACIS International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
0-7695-2613-6
Type
conf
DOI
10.1109/ICIS-COMSAR.2006.81
Filename
1651984