Title :
An Asymmetric Similarity Measure for Tag Clustering on Flickr
Author :
Huang, Xiaochen ; Zhou, Ying
Author_Institution :
Sch. of Inf. Technol., Univ. of Sydney, Sydney, NSW, Australia
Abstract :
Web 2.0 tools and environments have made tagging, the act of assigning keywords to on-line objects, a popular way to annotate shared resources. The success of now-prominent tagging systems makes tagging "the natural way for people to classify objects as well as an attractive way to discover new material". One of the most challenging problems is to harvest the semantics from these systems, which can support a number of applications, including tag clustering and tag recommendation. We conduct detailed studies on different types of tag relations and tag similarity measures, and propose a scalable measure that we call the Reliability Factor Similarity Measure (RFSM). We compare it with two other measures of similar scalability by integrating them into hierarchical clustering methods and performing tag clustering on a subset of Flickr data. The results suggest that RFSM outperforms those two measures when applied to tag clustering. We also present an alternative way of using discovered tag relations to set up tag refining rules that deal with some of the noise in the initial tag sets, which can in turn improve the precision of tag relations.
Keywords :
Web sites; identification technology; pattern clustering; Flickr; Web 2.0 tools; asymmetric similarity measure; hierarchical clustering method; keyword assigning; reliability factor similarity measure; shared resource annotation; tag clustering; tag recommendation; tag relations; tag similarity measures; Australia; Clustering algorithms; Clustering methods; Conducting materials; Costs; Information technology; Performance evaluation; Scalability; Tagging; Vocabulary; Clustering; Folksonomy; Reliability Factor Similarity Measure; Tag; Web 2.0;
Conference_Title :
Web Conference (APWEB), 2010 12th International Asia-Pacific
Conference_Location :
Busan
Print_ISBN :
978-0-7695-4012-2
Electronic_ISBN :
978-1-4244-6600-9
DOI :
10.1109/APWeb.2010.65