DocumentCode :
2704673
Title :
Using latent semantic analysis to identify similarities in source code to support program understanding
Author :
Maletic, Jonathan I. ; Marcus, Andrian
Author_Institution :
Div. of Comput. Sci., Memphis Univ., Memphis, TN, USA
fYear :
2000
fDate :
2000
Firstpage :
46
Lastpage :
53
Abstract :
The paper describes the results of applying Latent Semantic Analysis (LSA), an advanced information retrieval method, to program source code and associated documentation. Latent semantic analysis is a corpus-based statistical method for inducing and representing aspects of the meanings of words and passages (of natural language) reflected in their usage. This methodology is assessed for application to the domain of software components (i.e., source code and its accompanying documentation). Here LSA is used as the basis to cluster software components. This clustering is used to assist in the understanding of a nontrivial software system, namely a version of Mosaic. Applying latent semantic analysis to the domain of source code and internal documentation for the support of program understanding is a new application of this method and a departure from the normal application domain of natural language.
Keywords :
computational linguistics; information retrieval; natural languages; reverse engineering; statistical analysis; system documentation; LSA; Mosaic; corpus based statistical method; information retrieval method; internal documentation; latent semantic analysis; natural language; nontrivial software system; program understanding; software component clustering; software components; source code; source code similarities; Application software; Computer architecture; Computer science; Documentation; Information analysis; Information retrieval; Natural languages; Software maintenance; Software systems; Statistical analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2000. ICTAI 2000. Proceedings. 12th IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1082-3409
Print_ISBN :
0-7695-0909-6
Type :
conf
DOI :
10.1109/TAI.2000.889845
Filename :
889845