• DocumentCode
    588617
  • Title

    Inferring semantically related software terms and their taxonomy by leveraging collaborative tagging

  • Author

    Shaowei Wang ; Lo, Daniel ; Lingxiao Jiang

  • Author_Institution
    Sch. of Inf. Syst., Singapore Manage. Univ., Singapore, Singapore
  • fYear
    2012
  • fDate
    23-28 Sept. 2012
  • Firstpage
    604
  • Lastpage
    607
  • Abstract
    Many software engineering tasks, such as feature location and duplicate bug report detection, leverage similarities among textual corpora. However, due to the different words used by developers to express the same concept, exact matching of words is insufficient. One document can contain a particular word while the other document may contain another word that is semantically related but is not the same. Such word differences may cause inaccuracies in subsequent software engineering tasks. Recently, tagging has impacted the software engineering community. Developers increasingly use tags to describe important features of a software product. Many project hosting sites allow users to tag various projects with their own words. It becomes increasingly important to understand and relate these tags. Based on the tags available from software project hosting websites, we propose a similarity metric to infer semantically related terms, each of which is a tag, and build a taxonomy that could further describe the relationships among these terms. We have built a sample taxonomy from tens of thousands of projects and their tags. Our user studies show that our proposed similarity metric for tags is indeed related to the semantic similarity of the terms, and the resultant semantic taxonomy among terms is reasonably good.
  • Keywords
    Web sites; project management; software engineering; software management; collaborative tagging; duplicate bug report detection; feature location; semantic similarity; semantic taxonomy; semantically related software terms; similarity metric; software developers; software engineering community; software engineering tasks; software product; software project hosting Web sites; textual corpora; word matching; Clustering algorithms; Measurement; Semantics; Software; Software engineering; Tagging; Taxonomy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance (ICSM), 2012 28th IEEE International Conference on
  • Conference_Location
    Trento
  • ISSN
    1063-6773
  • Print_ISBN
    978-1-4673-2313-0
  • Type

    conf

  • DOI
    10.1109/ICSM.2012.6405332
  • Filename
    6405332