Title of article :
A semantic similarity approach to predicting Library of Congress subject headings for social tags
Author/Authors :
Kwan Yi, author
Issue Information :
Monthly, consecutive issue number, year 2010
Pages :
15
From page :
1658
To page :
1672
Abstract :
Social tagging, or collaborative tagging, has become a new trend in the organization, management, and discovery of digital information. The rapid growth of shared information controlled mostly by social tags poses a new challenge for tag-based information organization and retrieval. A plausible approach to this challenge is to link social tags to a controlled vocabulary. As an introductory step toward this approach, this study investigates ways of predicting relevant subject headings for resources from the social tags assigned to those resources. Subject headings were predicted with five different similarity measures: tf–idf, cosine-based similarity (CoS), Jaccard similarity (or Jaccard coefficient; JS), mutual information (MI), and information radius (IRad). The results were compared with those produced by professionals. They show that a CoS measure based on the top five social tags was most effective; including more social tags only degraded performance. The performance of JS is comparable to that of CoS, while tf–idf falls up to 70% short of the best performance. MI and IRad perform worse than the other methods. This study demonstrates the application of similarity measures to the prediction of correct Library of Congress subject headings.
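As an illustration of two of the measures named in the abstract, the following is a minimal sketch of cosine and Jaccard similarity applied to tags and candidate headings. It is not the study's implementation: the function names, the whitespace tokenization of headings, and the choice of "top five" as the five most frequently assigned tags are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two token sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return len(sa & sb) / len(union)


def rank_headings(tags, headings, measure, top_k_tags=5):
    """Rank candidate headings by similarity to a resource's most frequent tags.

    Assumption: "top" tags are the most frequently assigned ones, and each
    heading is tokenized by lowercasing and splitting on whitespace.
    """
    top_tags = [t for t, _ in Counter(tags).most_common(top_k_tags)]
    scored = [(h, measure(top_tags, h.lower().split())) for h in headings]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical tags and candidate headings, for illustration only.
tags = ["web", "design", "web", "css"]
candidates = ["Web design", "Cooking"]
ranking = rank_headings(tags, candidates, jaccard_similarity)
```

Here `ranking` lists the candidate headings from most to least similar under the chosen measure; passing `cosine_similarity` instead swaps the measure without changing the ranking logic.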
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2010
Record number :
994283