• DocumentCode
    2617928
  • Title
    A multi-level hierarchical index structure for supporting efficient similarity search on tag sets
  • Author
    Koh, Jia-Ling ; Shongwe, Nonhlanhla ; Cho, Chung-Wen
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei, Taiwan
  • fYear
    2012
  • fDate
    16-18 May 2012
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Social communication websites have emerged as a type of Web service that helps users share their resources. To provide efficient similarity search on tag sets in a social tagging system, we propose a multi-level hierarchical index structure that groups similar tag sets. We provide algorithms for similarity search on tag sets, as well as for deleting and updating tag sets, using the constructed index structure. Furthermore, we define a modified Hamming distance function on tag sets that considers the semantic relatedness of their members when evaluating the similarity of two tag sets. This function is better suited to evaluating the similarity of two tag sets. A systematic performance study verifies the effectiveness and efficiency of the proposed strategies. The experimental results show that the proposed MHIB approach improves on the pruning effect of the previous work, which constructs a two-level index structure. In particular, the MHIB approach scales well with respect to the three parameters when using either the Hamming distance or the modified Hamming distance as the similarity measure. Although the insertion operation of the MHIB approach costs more than the naïve method, with the assistance of the constructed inverted list of clusters it runs faster than the previous work. In addition, deletion with the MHIB approach costs much less than with the other two approaches, as does the update operation.
  • Keywords
    Web services; information retrieval; resource allocation; social networking (online); MHIB approach; Web service; efficient similarity search; modified hamming distance function; multilevel hierarchical index structure; resource sharing; similar tag set grouping; social communication Web sites; social tagging system; tag sets deletion; two-level index structure; Hamming distance; Indexing; Search problems; Tagging; Transaction databases; Upper bound; Social tagging; index structure; similarity search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Research Challenges in Information Science (RCIS), 2012 Sixth International Conference on
  • Conference_Location
    Valencia
  • ISSN
    2151-1349
  • Print_ISBN
    978-1-4577-1936-3
  • Electronic_ISBN
    2151-1349
  • Type
    conf
  • DOI
    10.1109/RCIS.2012.6240436
  • Filename
    6240436