Title :
Document Classification Via TextCC Based on Stereographic Projection
Author :
Zhang, Zhen-ya ; Zhang, Shu-guang ; Wang, Xu-fa
Author_Institution :
Microsoft Key Lab. of Multimedia Comput. & Commun., Univ. of Sci. & Technol. of China, Hefei
Abstract :
TextCC can classify real documents instantly by cosine similarity. In this paper, a stereographic projection is defined from n-dimensional real space onto the surface of the unit sphere in (n+1)-dimensional space. The paper also presents the relation between the Euclidean distance in n-dimensional space and the cosine similarity in (n+1)-dimensional real space. To classify documents whose representation vectors are normalized by stereographic projection, modifications to the construction of the weight matrix of the hidden layer of TextCC are presented, together with the basis for those modifications. With those modifications, TextCC can classify real documents instantly by Euclidean distance. Experimental results show that TextCC can classify real documents well by Euclidean distance based on stereographic projection.
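The link between Euclidean distance and cosine similarity mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook inverse stereographic projection from the north pole, x -> (2x / (|x|^2 + 1), (|x|^2 - 1) / (|x|^2 + 1)), which may differ from the exact definition used in the paper. The function names (stereographic_lift, cosine_similarity, cosine_from_euclidean) are hypothetical and chosen only for this illustration. Under that assumption, the cosine similarity of two projected points equals 1 - 2|x - y|^2 / ((1 + |x|^2)(1 + |y|^2)).

```python
import math


def stereographic_lift(x):
    """Map a point of R^n onto the unit sphere in R^(n+1).

    Assumption: standard inverse stereographic projection from the
    north pole; the paper's own definition may differ.
    """
    s = sum(v * v for v in x)  # squared Euclidean norm |x|^2
    return [2.0 * v / (s + 1.0) for v in x] + [(s - 1.0) / (s + 1.0)]


def cosine_similarity(u, v):
    """Plain cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_from_euclidean(x, y):
    """Cosine similarity of the lifted points, computed only from
    quantities in the original n-dimensional space."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))  # |x - y|^2
    sx = sum(a * a for a in x)                    # |x|^2
    sy = sum(b * b for b in y)                    # |y|^2
    return 1.0 - 2.0 * d2 / ((1.0 + sx) * (1.0 + sy))


# Illustrative vectors only; these are not data from the paper.
x = [1.0, 2.0, 0.5]
y = [-0.3, 0.7, 2.0]
lifted_x = stereographic_lift(x)
lifted_y = stereographic_lift(y)
direct = cosine_similarity(lifted_x, lifted_y)
via_distance = cosine_from_euclidean(x, y)
```

In this sketch, direct and via_distance agree up to floating-point error, and each lifted vector has unit length, so a cosine-based classifier applied to the lifted vectors is in effect driven by distances in the original space, scaled by the norms of the two points.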
Keywords :
classification; learning (artificial intelligence); matrix algebra; text analysis; vectors; Euclidean distance; TextCC training; automatic text classification; cosine similarity; document classification; stereographic projection; weight matrix; Computer science; Cybernetics; Electronic mail; Feeds; Frequency; Laboratories; Machine learning; Multimedia computing; Neural networks; Text categorization; Vocabulary; TextCC
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
DOI :
10.1109/ICMLC.2006.258706