DocumentCode :
1822664
Title :
Words Clustering Based on Keywords Indexing from Large-scale Categorization Corpora
Author :
Hua, Liu
Author_Institution :
Coll. of Chinese Language & Culture, Jinan Univ., Guangzhou, China
Volume :
1
fYear :
2009
fDate :
18-20 Aug. 2009
Firstpage :
407
Lastpage :
410
Abstract :
Keywords are indexed automatically for large-scale categorization corpora. Keywords indexed for more than 20 documents are selected as seed words, which avoids subjective seed-word selection in clustering. Clustering is also restricted to the corpus of a particular category, and a feature extraction method based on keyword indexing is used to obtain domain-specific (domanial) words automatically, which reduces noise in the similarity calculation.
Keywords :
document handling; feature extraction; indexing; keywords indexing; large-scale categorization; words clustering; Feature extraction; Indexing; Information security; Large-scale systems; Materials science and technology; Noise reduction; Societies; Statistics; Vocabulary; Web pages; Categorization corpora; Clustering; Domanial words; Keywords indexing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Assurance and Security, 2009. IAS '09. Fifth International Conference on
Conference_Location :
Xian
Print_ISBN :
978-0-7695-3744-3
Type :
conf
DOI :
10.1109/IAS.2009.271
Filename :
5284129