DocumentCode :
731516
Title :
Summarizing Complex Development Artifacts by Mining Heterogeneous Data
Author :
Ponzanelli, Luca ; Mocci, Andrea ; Lanza, Michele
Author_Institution :
REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
fYear :
2015
fDate :
16-17 May 2015
Firstpage :
401
Lastpage :
405
Abstract :
Summarization is hailed as a promising approach to reduce the amount of information that must be taken in by the person who wants to understand development artifacts, such as pieces of code, bug reports, emails, etc. However, existing approaches treat artifacts as pure textual entities, disregarding the heterogeneous and partially structured nature of most artifacts, which contain intertwined pieces of distinct type, such as source code, diffs, stack traces, human language, etc. We present a novel approach to augment existing summarization techniques (such as LexRank) to deal with the heterogeneous and multidimensional nature of complex artifacts. Our preliminary results on heterogeneous artifacts suggest our approach outperforms the current text-based approaches.
Keywords :
data mining; LexRank summarization techniques; complex development artifact summarization techniques; heterogeneous data mining; multidimensional complex artifact; text-based approaches; textual entity; Data mining; Electronic mail; Java; Natural languages; Software; Software engineering; XML; holistic; stack overflow; summarization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/MSR.2015.49
Filename :
7180103