Title :
Research on cross-language text similarity calculation
Author :
Sun Yuan;Zhao Qian
Author_Institution :
School of Information Engineering, Minzu University of China, Minority Languages Branch, National Language Resource and Monitoring Research Center, Beijing, China
fDate :
5/1/2015 12:00:00 AM
Abstract :
Cross-language text similarity calculation is a fundamental problem in natural language processing and is widely used in cross-language research, such as cross-language information retrieval. In this paper, we use the LDA (Latent Dirichlet Allocation) model to calculate the similarity of Tibetan and Chinese texts at the topic level. Through topic modelling and prediction, the texts are mapped to the feature space of topics. This method reduces the dimensionality of the text vector space and improves the speed and efficiency of computation.
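As a rough illustration of comparing texts at the topic level, the sketch below scores two documents by the cosine of their topic-weight vectors. The abstract does not say which similarity measure the authors use or how Tibetan and Chinese topics are aligned, so the cosine measure, the shared set of four topics, and all vector values here are assumptions chosen only for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector carries no topic information
    return dot / (norm_p * norm_q)


# Hypothetical topic distributions over K = 4 shared topics, standing in
# for the output of an LDA model (values are made up, not from the paper).
tibetan_doc = [0.70, 0.10, 0.15, 0.05]
chinese_doc_a = [0.65, 0.15, 0.10, 0.10]  # similar topic mix
chinese_doc_b = [0.05, 0.10, 0.05, 0.80]  # different dominant topic

sim_a = cosine_similarity(tibetan_doc, chinese_doc_a)
sim_b = cosine_similarity(tibetan_doc, chinese_doc_b)
print(f"similarity to doc A: {sim_a:.3f}")
print(f"similarity to doc B: {sim_b:.3f}")
```

Each document is represented by K topic weights rather than one weight per vocabulary term, which is the dimensionality reduction the abstract refers to.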
Keywords :
"Dictionaries","Computational modeling","Computational linguistics","Accuracy","Natural language processing","Internet"
Conference_Titel :
Electronics Information and Emergency Communication (ICEIEC), 2015 5th International Conference on
Print_ISBN :
978-1-4799-7283-8
DOI :
10.1109/ICEIEC.2015.7284573