DocumentCode
2333362
Title
A comparison of stemmers on source code identifiers for software search
Author
Wiese, Andrew ; Ho, Valerie ; Hill, Emily
Author_Institution
Dept. of Comput. Sci., Montclair State Univ., Montclair, NJ, USA
fYear
2011
fDate
25-30 Sept. 2011
Firstpage
496
Lastpage
499
Abstract
As the popularity of text-based source code analysis grows, the use of stemmers to strip suffixes has increased. Stemmers have been used to more accurately determine relevance between a keyword query and methods in source code for search, exploration, and bug localization. In this paper, we investigate which traditional stemmers perform best on the domain of software, specifically, Java source code. We compare the stemmers using two case studies: a comparative analysis of the unified word classes in terms of accuracy and completeness, as well as an investigation into the effectiveness of stemming for software search. Our results indicate that relative stemmer effectiveness varies with a software engineering tool such as search, justifying further research into this area.
Keywords
Java; program debugging; software tools; Java source code; bug localization; keyword query; software engineering tool; software search stemming; source code identifiers; suffix stripping; text-based source code analysis; Accuracy; Adders; Context; Humans; Software; Software engineering; source code search; stemming; textual analysis of source code identifiers
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Maintenance (ICSM), 2011 27th IEEE International Conference on
Conference_Location
Williamsburg, VA
ISSN
1063-6773
Print_ISBN
978-1-4577-0663-9
Electronic_ISBN
1063-6773
Type
conf
DOI
10.1109/ICSM.2011.6080817
Filename
6080817