DocumentCode :
3253109
Title :
Information retrieval using Hellinger distance and sqrt-cos similarity
Author :
Shunzhi Zhu ; Lizhao Liu ; Yan Wang
Author_Institution :
Dept. of Comput. Sci. & Technol., Xiamen Univ. of Technol., Xiamen, China
fYear :
2012
fDate :
14-17 July 2012
Firstpage :
925
Lastpage :
929
Abstract :
In this paper, we propose a similarity measurement method based on the Hellinger distance and the square-root cosine. We use the Hellinger distance as the distance metric for document clustering and a new square-root cosine (sqrt-cos) similarity for query-based information retrieval. This similarity/distance pair also bridges traditional tf-idf weighting and binary weighting in the vector space model. Finally, we compare the performance of this method with that of one based on Euclidean distance and cosine similarity. The results show that precision and recall are improved by using the sqrt-cos similarity.
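The abstract names two measures without giving their formulas, so the following is only a minimal sketch under stated assumptions. The Hellinger distance is written in its standard form on L1-normalised vectors. The square-root cosine is assumed here to be the cosine of the element-wise square-rooted vectors; the paper's exact normalisation is not stated in this record and may differ. The function names `hellinger_distance` and `sqrt_cos_similarity` are hypothetical.

```python
import math


def _check(x, y):
    # Both vectors must have the same length, non-negative weights
    # (e.g. tf-idf values), and a positive sum.
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("weights must be non-negative")
    if sum(x) <= 0 or sum(y) <= 0:
        raise ValueError("vectors must have a positive sum")


def hellinger_distance(x, y):
    """Standard Hellinger distance after L1-normalising each vector."""
    _check(x, y)
    sx, sy = sum(x), sum(y)
    total = sum(
        (math.sqrt(a / sx) - math.sqrt(b / sy)) ** 2 for a, b in zip(x, y)
    )
    return math.sqrt(total / 2.0)


def sqrt_cos_similarity(x, y):
    """Cosine of the element-wise square roots of x and y.

    ASSUMPTION: this normalisation is illustrative; the paper's own
    definition is not given in this record.
    """
    _check(x, y)
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    return numerator / math.sqrt(sum(x) * sum(y))


if __name__ == "__main__":
    doc = [1.0, 0.0, 3.0]
    query = [2.0, 1.0, 1.0]
    print(hellinger_distance(doc, query))
    print(sqrt_cos_similarity(doc, query))
```

Under these assumed definitions the two quantities are linked: the squared distance equals one minus the similarity, so ranking documents by either measure gives the same order.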
Keywords :
document handling; query processing; vectors; Euclidean distance; binary weighting; distance metric; document clustering; information Hellinger distance; information retrieval; precision improvement; query information retrieval; recall improvement; square-root cosine similarity measurement method; tf-idf weighting; vector space model; Computational modeling; Educational institutions; Euclidean distance; Information retrieval; Probabilistic logic; Vectors; Hellinger cosine measurement; Hellinger distance; document clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-0241-8
Type :
conf
DOI :
10.1109/ICCSE.2012.6295217
Filename :
6295217