DocumentCode :
3235683
Title :
Automatic Derivation of Concepts Based on the Analysis of Source Code Identifiers
Author :
Guerrouj, Latifa
Author_Institution :
DGIGL - SOCCER Lab., Ecole Polytech. de Montreal, Montréal, QC, Canada
fYear :
2010
fDate :
13-16 Oct. 2010
Firstpage :
301
Lastpage :
304
Abstract :
The existing software engineering literature has empirically shown that a proper choice of identifiers influences software understandability and maintainability. Indeed, identifiers are developers' main up-to-date source of information and guide their cognitive processes during program understanding when the high-level documentation is scarce or outdated and when the source code is not sufficiently commented. Deriving domain terms from identifiers using high-level and domain concepts is not an easy task when naming conventions (e.g., Camel Case) are not used or strictly followed and/or when these words have been abbreviated or otherwise transformed. Our thesis is to develop an approach that overcomes the shortcomings of the existing approaches and maps identifiers to domain concepts even in the absence of naming conventions and/or the presence of abbreviations. Our approach uses a thesaurus of words and abbreviations to map terms or transformed words composing identifiers to dictionary words. It relies on an oracle that we manually build for the validation of our results. To evaluate our technique, we apply it to derive concepts from identifiers of different systems and open source projects. We also enrich it with domain knowledge and context-aware dictionaries to analyze how sensitive its performance is to the use of contextual information and specialized knowledge.
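The thesaurus-based mapping described in the abstract can be illustrated with a minimal sketch. This is not the author's algorithm: the dictionary, the abbreviation table, the function names, and the greedy longest-match segmentation are all assumptions chosen for illustration. The sketch splits an identifier on naming conventions where they exist, segments the remaining tokens against known terms, and expands abbreviations to dictionary words.

```python
import re

# Hypothetical toy dictionary and abbreviation thesaurus (illustrative only).
DICTIONARY = {"get", "user", "name", "count", "pointer", "message", "buffer"}
ABBREVIATIONS = {
    "usr": "user",
    "cnt": "count",
    "ptr": "pointer",
    "msg": "message",
    "buf": "buffer",
}


def split_on_conventions(identifier):
    """Split on underscores, digits, and camelCase boundaries; lowercase the result."""
    tokens = []
    for part in re.split(r"[_\d]+", identifier):
        # Acronym runs (e.g. "HTTP") or an optionally capitalized lowercase word.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return [t.lower() for t in tokens]


def segment(token):
    """Greedily split a lowercase token into known terms, longest match first."""
    known = DICTIONARY | set(ABBREVIATIONS)
    pieces = []
    i = 0
    while i < len(token):
        for j in range(len(token), i, -1):
            if token[i:j] in known:
                pieces.append(token[i:j])
                i = j
                break
        else:
            # No known term starts here: keep the remainder unresolved.
            pieces.append(token[i:])
            break
    return pieces


def derive_terms(identifier):
    """Map an identifier to dictionary words, expanding known abbreviations."""
    terms = []
    for token in split_on_conventions(identifier):
        for piece in segment(token):
            terms.append(ABBREVIATIONS.get(piece, piece))
    return terms


if __name__ == "__main__":
    for name in ("getUsrName", "usrcnt", "msg_buf2"):
        print(name, "->", derive_terms(name))
```

Greedy longest-match segmentation is the simplest choice here and can mis-split ambiguous identifiers; a realistic approach would need a far larger thesaurus and some way to rank competing splits, which this sketch does not attempt.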
Keywords :
software maintenance; system documentation; automatic derivation; cognitive processes; domain concepts; high-level documentation; maps identifiers; naming conventions; program understanding; software engineering; software maintainability; software understandability; source code identifiers; Buildings; Conferences; Dictionaries; Presses; Software; Speech recognition; Thesauri; Identifier Splitting; Linguistic Analysis; Program Comprehension; Software Quality;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Reverse Engineering (WCRE), 2010 17th Working Conference on
Conference_Location :
Beverly, MA
ISSN :
1095-1350
Print_ISBN :
978-1-4244-8911-4
Type :
conf
DOI :
10.1109/WCRE.2010.45
Filename :
5645490