DocumentCode
731516
Title
Summarizing Complex Development Artifacts by Mining Heterogeneous Data
Author
Ponzanelli, Luca ; Mocci, Andrea ; Lanza, Michele
Author_Institution
REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
fYear
2015
fDate
16-17 May 2015
Firstpage
401
Lastpage
405
Abstract
Summarization is hailed as a promising approach to reduce the amount of information that must be taken in by a person who wants to understand development artifacts, such as pieces of code, bug reports, and emails. However, existing approaches treat artifacts as pure textual entities, disregarding the heterogeneous and partially structured nature of most artifacts, which contain intertwined pieces of distinct types, such as source code, diffs, stack traces, and human language. We present a novel approach to augment existing summarization techniques (such as LexRank) to deal with the heterogeneous and multidimensional nature of complex artifacts. Our preliminary results on heterogeneous artifacts suggest that our approach outperforms the current text-based approaches.
Keywords
data mining; LexRank summarization techniques; complex development artifact summarization techniques; heterogeneous data mining; multidimensional complex artifact; text-based approaches; textual entity; Data mining; Electronic mail; Java; Natural languages; Software; Software engineering; XML; holistic; stack overflow; summarization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/MSR.2015.49
Filename
7180103