DocumentCode :
2617928
Title :
A multi-level hierarchical index structure for supporting efficient similarity search on tag sets
Author :
Koh, Jia-Ling ; Shongwe, Nonhlanhla ; Cho, Chung-Wen
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei, Taiwan
fYear :
2012
fDate :
16-18 May 2012
Firstpage :
1
Lastpage :
12
Abstract :
Social communication websites are an emerging type of Web service that helps users share their resources. To support efficient similarity search on tag sets in a social tagging system, we propose a multi-level hierarchical index structure that groups similar tag sets. We provide algorithms not only for similarity search on tag sets but also for deleting and updating tag sets using the constructed index structure. Furthermore, we define a modified Hamming distance function on tag sets, which considers the semantic relatedness of their members when evaluating the similarity of two tag sets. This function is better suited to similarity search on tag sets. A systematic performance study is conducted to verify the effectiveness and efficiency of the proposed strategies. The experimental results show that the proposed MHIB approach improves on the pruning effect of the previous work, which constructs a two-level index structure. In particular, the MHIB approach scales well with respect to the three parameters, whether the Hamming distance or the modified Hamming distance is used as the similarity measure. Although the insertion operation of the MHIB approach costs more than the naïve method, with the assistance of the constructed inverted list of clusters it is faster than the previous work. In addition, the deletion operation of the MHIB approach costs much less than that of the other two approaches, and so does the update operation.
Keywords :
Web services; information retrieval; resource allocation; social networking (online); MHIB approach; Web service; efficient similarity search; modified hamming distance function; multilevel hierarchical index structure; resource sharing; similar tag set grouping; social communication Web sites; social tagging system; tag sets deletion; two-level index structure; Hamming distance; Indexing; Search problems; Tagging; Transaction databases; Upper bound; Social tagging; index structure; similarity search;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research Challenges in Information Science (RCIS), 2012 Sixth International Conference on
Conference_Location :
Valencia
ISSN :
2151-1349
Print_ISBN :
978-1-4577-1936-3
Electronic_ISBN :
2151-1349
Type :
conf
DOI :
10.1109/RCIS.2012.6240436
Filename :
6240436