DocumentCode
2617928
Title
A multi-level hierarchical index structure for supporting efficient similarity search on tag sets
Author
Koh, Jia-Ling ; Shongwe, Nonhlanhla ; Cho, Chung-Wen
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei, Taiwan
fYear
2012
fDate
16-18 May 2012
Firstpage
1
Lastpage
12
Abstract
Social communication websites have become an emerging type of Web service that helps users share their resources. To provide efficient similarity search over tag sets in a social tagging system, we propose a multi-level hierarchical index structure that groups similar tag sets. We provide algorithms not only for similarity search of tag sets but also for deleting and updating tag sets using the constructed index structure. Furthermore, we define a modified Hamming distance function on tag sets, which considers the semantic relatedness of members when evaluating the similarity of two tag sets. This function is better suited to evaluating the similarity of two tag sets. A systematic performance study is performed to verify the effectiveness and efficiency of the proposed strategies. The experimental results show that the proposed MHIB approach further improves the pruning effect of the previous work, which constructs a two-level index structure. In particular, the MHIB approach scales well with respect to the three parameters when using either the Hamming distance or the modified Hamming distance as the similarity measure. Although the insertion operation of the MHIB approach costs more than the naïve method, with the assistance of the constructed inverted list of clusters it performs faster than the previous work. In addition, the cost of the deletion operation using the MHIB approach is much lower than that of the other two approaches, and the same holds for the update operation.
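As a rough illustration of the distance measures the abstract mentions, the sketch below computes the Hamming distance between two tag sets (the number of tags that appear in exactly one of them) and runs a naive range search with it. The function `relaxed_hamming_distance` and its `related` predicate are hypothetical: the abstract does not give the definition of the modified Hamming distance, so this variant only assumes that semantically related tags may be treated as matching. None of this reproduces the MHIB index itself.

```python
from typing import Callable, FrozenSet, Iterable, List

TagSet = FrozenSet[str]


def hamming_distance(a: TagSet, b: TagSet) -> int:
    """Number of tags present in exactly one of the two sets.

    This equals the Hamming distance between the binary membership
    vectors of the two sets over the tag vocabulary.
    """
    return len(a ^ b)


def relaxed_hamming_distance(
    a: TagSet, b: TagSet, related: Callable[[str, str], bool]
) -> int:
    """Hypothetical variant (an assumption, not the paper's definition).

    A tag found in only one set is not counted as a difference when it
    is related to at least one tag of the other set.
    """
    unmatched_a = [t for t in a - b if not any(related(t, u) for u in b)]
    unmatched_b = [t for t in b - a if not any(related(t, u) for u in a)]
    return len(unmatched_a) + len(unmatched_b)


def range_search(query: TagSet, tag_sets: Iterable[TagSet], radius: int) -> List[TagSet]:
    """Naive linear scan: every tag set within `radius` of the query."""
    return [s for s in tag_sets if hamming_distance(query, s) <= radius]


if __name__ == "__main__":
    q = frozenset({"python", "code", "web"})
    s = frozenset({"python", "programming", "web"})
    # Illustrative relatedness relation, made up for this example.
    synonyms = lambda x, y: {x, y} == {"code", "programming"}
    print(hamming_distance(q, s))
    print(relaxed_hamming_distance(q, s, synonyms))
```

A linear scan like `range_search` compares the query with every stored tag set; an index that groups similar tag sets, as the abstract describes, aims to prune most of those comparisons.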
Keywords
Web services; information retrieval; resource allocation; social networking (online); MHIB approach; Web service; efficient similarity search; modified hamming distance function; multilevel hierarchical index structure; resource sharing; similar tag set grouping; social communication Web sites; social tagging system; tag sets deletion; two-level index structure; Hamming distance; Indexing; Search problems; Tagging; Transaction databases; Upper bound; Social tagging; index structure; similarity search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Research Challenges in Information Science (RCIS), 2012 Sixth International Conference on
Conference_Location
Valencia
ISSN
2151-1349
Print_ISBN
978-1-4577-1936-3
Electronic_ISBN
2151-1349
Type
conf
DOI
10.1109/RCIS.2012.6240436
Filename
6240436
Link To Document