• DocumentCode
    176048
  • Title
    Assessing MCR Discussion Usefulness Using Semantic Similarity
  • Author
    Pangsakulyanont, Thai ; Thongtanunam, Patanamon ; Port, Daniel ; Iida, Hiroyuki
  • Author_Institution
    Kasetsart Univ., Bangkok, Thailand
  • fYear
    2014
  • fDate
    12-13 Nov. 2014
  • Firstpage
    49
  • Lastpage
    54
  • Abstract
    Modern Code Review (MCR) is an informal practice whereby reviewers virtually discuss proposed changes by adding comments through a code review tool or mailing list. It has received much research attention due to its perceived cost-effectiveness and popularity with industrial and OSS projects. Recent studies indicate there is a positive relationship between the number of review comments and code quality. However, little research exists investigating how such discussion impacts software quality. The concern is that the informality of MCR encourages a focus on trivial, tangential, or unrelated issues. Indeed, we have observed that such comments are quite frequent and may even constitute the majority. We conjecture that an effective MCR actually depends on having a substantive quantity of comments that directly impact a proposed change (or are "useful"). To investigate this, a necessary first step requires distinguishing review comments that are useful to a proposed change from those that are not. For a large OSS project such as our Qt case study, manual assessment of the more than 72,000 comments is a daunting task. We propose to utilize semantic similarity as a practical, cost-efficient, and empirically assurable approach for assisting with the manual usefulness assessment of MCR comments. Our case study results indicate that our approach can classify comments with an average F-measure score of 0.73 and reduce comment usefulness assessment effort by about 77%.
  • Keywords
    data mining; semantic networks; software quality; F-measure score; MCR comments; MCR discussion usefulness; OSS projects; code quality; code review tool; comment usefulness assessment; cost effectiveness; mailing list; modern code review; semantic similarity; software quality; Data models; Manuals; Semantics; Software quality; Standards; Training; Training data; Modern Code Review; Software Quality; Text Mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Empirical Software Engineering in Practice (IWESEP), 2014 6th International Workshop on
  • Conference_Location
    Osaka
  • Type
    conf
  • DOI
    10.1109/IWESEP.2014.11
  • Filename
    6976022