Title :
Performing Text Categorization on Manifold
Author :
Wen, Guihua ; Chen, Gan ; Jiang, Lijun
Author_Institution :
South China Univ. of Technol., Guangzhou
Abstract :
Text categorization has become a key technology for organizing and processing large amounts of text information. It normally involves an extremely high-dimensional space, which causes most existing approaches to produce highly biased estimates and thereby reduces classification accuracy. These approaches do not consider that text documents may intrinsically lie on a low-dimensional manifold. This paper presents an approach that performs text categorization on the text manifold with respect to its intrinsic global manifold structure, for example by using geodesic distance to measure the distance between two texts. The approach has been applied to improve KNN for text categorization, and the improvement is empirically validated by experiments.
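The abstract does not say how the geodesic distance is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an Isomap-style approximation: documents are already represented as numeric vectors, a neighborhood graph is built with Euclidean edge weights, and the geodesic distance is taken as the shortest-path length in that graph. All function names and the parameters `n_neighbors` and `k` are hypothetical.

```python
import heapq
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_neighbor_graph(points, n_neighbors):
    """Symmetric nearest-neighbor graph with Euclidean edge weights."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((euclidean(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:n_neighbors]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source (assumed geodesic approximation).

    Nodes that cannot be reached from source are absent from the result.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_knn_predict(train_vectors, train_labels, query, n_neighbors=2, k=3):
    """Majority vote among the k training texts closest to query in graph distance."""
    points = list(train_vectors) + [query]
    graph = build_neighbor_graph(points, n_neighbors)
    query_index = len(points) - 1
    dist = geodesic_distances(graph, query_index)
    ranked = sorted((d, i) for i, d in dist.items() if i != query_index)
    votes = Counter(train_labels[i] for _, i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: ten points laid out along a U-shaped curve.
u_curve = [(0, 3), (1, 3), (2, 3), (3, 3), (3, 2),
           (3, 1), (3, 0), (2, 0), (1, 0), (0, 0)]
u_graph = build_neighbor_graph(u_curve, n_neighbors=2)
end_to_end = geodesic_distances(u_graph, 0)[9]
```

On the toy U-shaped curve, the two end points are 3 units apart in Euclidean distance, while the graph distance `end_to_end` follows the curve and comes out as 9. This is the kind of difference a manifold-aware KNN would exploit. In a real setting the vectors would come from some document representation such as term weights, which this sketch leaves out.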
Keywords :
text analysis; classification accuracy; geodesic distance; intrinsic global manifold structure; k-nearest neighbor; low-dimensional manifold; text categorization; text documents; text information; Cybernetics; Euclidean distance; Gallium nitride; Level measurement; Manifolds; Organizing; Performance evaluation; Space technology; Support vector machines; Text categorization;
Conference_Title :
Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
1-4244-0099-6
Electronic_ISBN :
1-4244-0100-3
DOI :
10.1109/ICSMC.2006.384735