Title :
Document classification with spherical word vectors
Author :
Yiqiao Pan;Chao Xing;Dong Wang
Author_Institution :
Center for Speech and Language Technology (CSLT), Research Institute of Information Technology, Tsinghua University, Beijing, P.R. China
Abstract :
Recent research shows that low-dimensional continuous representations of words (word vectors) can be successfully employed to classify documents, and that document vectors derived from semantic clustering work better than those derived from simple average pooling. Separately, our recent study demonstrated that embedding words on a hypersphere performs better on tasks including semantic relatedness and bilingual translation than the original approach, which embeds words in an unconstrained vector space. In this paper, spherical word vectors are applied to the document classification task. The experiments show that spherical word vectors can deliver good performance when combined with semantic clustering based on von Mises-Fisher (vMF) distributions.
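The two ways of building a document vector contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the shared concentration `kappa`, the uniform mixture weights, and the random toy data are all assumptions made for illustration. With a shared `kappa` and uniform weights, the vMF normalizing constant cancels, so each word's cluster posterior reduces to a softmax over `kappa * cosine similarity`.

```python
import numpy as np

def to_unit_sphere(vectors):
    """Project each row onto the unit hypersphere (L2 normalization)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def average_pooling(word_vectors):
    """Baseline document vector: the mean of the document's word vectors."""
    return word_vectors.mean(axis=0)

def vmf_responsibilities(word_vectors, means, kappa):
    """Posterior of each vMF component for each word.

    Assumes one shared concentration and uniform mixture weights, so the
    vMF normalizer cancels and only exp(kappa * mu . x) remains.
    """
    logits = kappa * (word_vectors @ means.T)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def cluster_document_vector(word_vectors, means, kappa=10.0):
    """Clustering-based document vector: mean component posterior over words."""
    return vmf_responsibilities(word_vectors, means, kappa).mean(axis=0)

# Toy data (hypothetical): 6 words and 3 cluster centres in 4 dimensions.
rng = np.random.default_rng(0)
words = to_unit_sphere(rng.normal(size=(6, 4)))
means = to_unit_sphere(rng.normal(size=(3, 4)))

doc_avg = average_pooling(words)                # one entry per embedding dimension
doc_vmf = cluster_document_vector(words, means) # one entry per cluster
```

In this sketch the clustering-based vector has one entry per cluster rather than per embedding dimension, and its entries sum to one; in practice the cluster means and concentrations would be estimated from data, for example by EM, rather than drawn at random.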
Keywords :
"Semantics","Training","Mixture models","Clustering methods","Syntactics","Mathematical model","Data models"
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
DOI :
10.1109/APSIPA.2015.7415518