• DocumentCode
    731516
  • Title
    Summarizing Complex Development Artifacts by Mining Heterogeneous Data
  • Author
    Ponzanelli, Luca ; Mocci, Andrea ; Lanza, Michele
  • Author_Institution
    REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
  • fYear
    2015
  • fDate
    16-17 May 2015
  • Firstpage
    401
  • Lastpage
    405
  • Abstract
    Summarization is hailed as a promising approach to reduce the amount of information that must be taken in by the person who wants to understand development artifacts, such as pieces of code, bug reports, emails, etc. However, existing approaches treat artifacts as pure textual entities, disregarding the heterogeneous and partially structured nature of most artifacts, which contain intertwined pieces of distinct type, such as source code, diffs, stack traces, human language, etc. We present a novel approach to augment existing summarization techniques (such as LexRank) to deal with the heterogeneous and multidimensional nature of complex artifacts. Our preliminary results on heterogeneous artifacts suggest our approach outperforms the current text-based approaches.
  • Keywords
    data mining; LexRank summarization techniques; complex development artifact summarization techniques; heterogeneous data mining; multidimensional complex artifact; text-based approaches; textual entity; Data mining; Electronic mail; Java; Natural languages; Software; Software engineering; XML; holistic; stack overflow; summarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/MSR.2015.49
  • Filename
    7180103