DocumentCode :
2333362
Title :
A comparison of stemmers on source code identifiers for software search
Author :
Wiese, Andrew ; Ho, Valerie ; Hill, Emily
Author_Institution :
Dept. of Comput. Sci., Montclair State Univ., Montclair, NJ, USA
fYear :
2011
fDate :
25-30 Sept. 2011
Firstpage :
496
Lastpage :
499
Abstract :
As the popularity of text-based source code analysis grows, the use of stemmers to strip suffixes has increased. Stemmers have been used to more accurately determine relevance between a keyword query and methods in source code for search, exploration, and bug localization. In this paper, we investigate which traditional stemmers perform best on the domain of software, specifically, Java source code. We compare the stemmers using two case studies: a comparative analysis of the unified word classes in terms of accuracy and completeness, as well as an investigation into the effectiveness of stemming for software search. Our results indicate that relative stemmer effectiveness varies with a software engineering tool such as search, justifying further research into this area.
Keywords :
Java; program debugging; software tools; Java source code; bug localization; keyword query; software engineering tool; software search stemming; source code identifiers; suffix stripping; text-based source code analysis; Accuracy; Adders; Context; Humans; Java; Software; Software engineering; source code search; stemming; textual analysis of source code identifiers
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Maintenance (ICSM), 2011 27th IEEE International Conference on
Conference_Location :
Williamsburg, VA
ISSN :
1063-6773
Print_ISBN :
978-1-4577-0663-9
Electronic_ISBN :
1063-6773
Type :
conf
DOI :
10.1109/ICSM.2011.6080817
Filename :
6080817