Title :
Inferring semantically related software terms and their taxonomy by leveraging collaborative tagging
Author :
Shaowei Wang ; Lo, Daniel ; Lingxiao Jiang
Author_Institution :
Sch. of Inf. Syst., Singapore Manage. Univ., Singapore, Singapore
Abstract :
Many software engineering tasks, such as feature location and duplicate bug report detection, leverage similarities among textual corpora. However, because developers use different words to express the same concept, exact matching of words is insufficient. One document may contain a particular word while another contains a word that is semantically related but not identical. Such word differences may cause inaccuracies in subsequent software engineering tasks. Recently, tagging has impacted the software engineering community. Developers increasingly use tags to describe important features of a software product. Many project hosting sites allow users to tag various projects with their own words. It is becoming increasingly important to understand and relate these tags. Based on the tags available from software project hosting websites, we propose a similarity metric to infer semantically related terms, each of which is a tag, and build a taxonomy that further describes the relationships among these terms. We have built a sample taxonomy from tens of thousands of projects and their tags. Our user studies show that our proposed similarity metric for tags is indeed related to the semantic similarity of the terms, and that the resultant semantic taxonomy among terms is reasonably good.
Keywords :
Web sites; project management; software engineering; software management; collaborative tagging; duplicate bug report detection; feature location; semantic similarity; semantic taxonomy; semantically related software terms; similarity metric; software developers; software engineering community; software engineering tasks; software product; software project hosting Web sites; textual corpora; word matching; Clustering algorithms; Measurement; Semantics; Software; Software engineering; Tagging; Taxonomy;
Conference_Title :
Software Maintenance (ICSM), 2012 28th IEEE International Conference on
Conference_Location :
Trento
Print_ISBN :
978-1-4673-2313-0
DOI :
10.1109/ICSM.2012.6405332