DocumentCode :
639830
Title :
The role of artefact corpus in LSI-based traceability recovery
Author :
Bavota, Gabriele ; De Lucia, Andrea ; Oliveto, Rocco ; Panichella, A. ; Ricci, F. ; Tortora, Giuseppe
Author_Institution :
Univ. of Sannio, Benevento, Italy
fYear :
2013
fDate :
19 May 2013
Firstpage :
83
Lastpage :
89
Abstract :
Latent Semantic Indexing (LSI) is an advanced method widely and successfully employed in Information Retrieval (IR). It is an extension of the Vector Space Model (VSM) and outperforms VSM in canonical IR scenarios, where it is applied to very large document repositories. LSI has also been used to semi-automatically generate traceability links between software artefacts. In that scenario, however, LSI does not outperform VSM. This contradictory result is probably due to the different characteristics of software artefact repositories compared with document repositories. In this paper we present a preliminary empirical study that analyzes how the size and the vocabulary of the repository, in terms of number of documents and number of terms, affect retrieval accuracy. Although replications are needed to generalize our findings, the study provides some insights that might be used as guidelines for selecting the most adequate method for traceability recovery depending on the particular application context.
Keywords :
database indexing; document handling; information retrieval; program diagnostics; LSI-based traceability recovery; VSM; artefact corpus; canonical IR scenarios; document repositories; information retrieval; latent semantic indexing; semiautomatically traceability link generation; software artefact repositories; traceability recovery; vector space model; Accuracy; Indexing; Large scale integration; Software; Unified modeling language; Vectors; Vocabulary; Empirical Studies; Latent Semantic Indexing; Traceability recovery; Vector Space Model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Traceability in Emerging Forms of Software Engineering (TEFSE), 2013 International Workshop on
Conference_Location :
San Francisco, CA
Type :
conf
DOI :
10.1109/TEFSE.2013.6620160
Filename :
6620160