Title :
Chinese Text Categorization study based on feature weight learning
Author :
Zhan, Yan ; Chen, Hao ; Zhang, Su-fang ; Zheng, Mei
Author_Institution :
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
Abstract :
Text categorization (TC) is an important component of many information organization and information management tasks. Two key issues in TC are feature coding and classifier design. In the K-nearest neighbor (K-NN) classification algorithm, the Euclidean distance is usually chosen as the similarity measure, which treats every feature equally. However, the features of a vector contribute differently to describing a sample, so the contribution of each feature can be determined through feature weight learning. This paper describes text categorization with a K-nearest neighbor algorithm based on feature weight learning. Numerical experiments support the validity of the learning algorithm.
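The idea of a feature-weighted distance inside K-NN can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the weights are learned, so the weights here are set by hand, and the function names (weighted_distance, knn_predict), the toy vectors, and the labels are all hypothetical.

```python
import math
from collections import Counter


def weighted_distance(x, y, weights):
    """Euclidean distance in which each squared difference is scaled by a feature weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def knn_predict(train_X, train_y, query, weights, k=3):
    """Return the majority label among the k training vectors closest to the query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise
# on a larger scale. These numbers are illustrative, not from the paper.
train_X = [[0.0, 0.0], [0.1, 1.0], [1.0, 5.0], [0.9, 6.0]]
train_y = ["sports", "sports", "finance", "finance"]
query = [0.8, 0.5]

# Uniform weights reduce to the ordinary Euclidean distance.
uniform_label = knn_predict(train_X, train_y, query, [1.0, 1.0], k=1)

# Hand-set weights (assumed, not learned) that ignore the noisy feature.
weighted_label = knn_predict(train_X, train_y, query, [1.0, 0.0], k=1)
```

With uniform weights the noisy second feature dominates the distance, while the hand-set weights let the first feature decide the nearest neighbor. A learning procedure such as the one the paper describes would replace the hand-set weights with values estimated from training data.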
Keywords :
data mining; encoding; feature extraction; learning (artificial intelligence); natural language processing; pattern classification; support vector machines; text analysis; Chinese text categorization; Euclidean distance; K-nearest neighbor classification algorithm; SVM; classifier design; feature coding; feature weight learning algorithm; information management; information organization; similarity measure; support vector machine; Classification tree analysis; Decision trees; Machine learning; Machine learning algorithms; Nearest neighbor searches; Support vector machine classification; Testing; Text categorization; Feature weight; K-NN
Conference_Title :
Machine Learning and Cybernetics, 2009 International Conference on
Conference_Location :
Baoding
Print_ISBN :
978-1-4244-3702-3
Electronic_ISBN :
978-1-4244-3703-0
DOI :
10.1109/ICMLC.2009.5212257