Title :
A comparison of stemmers on source code identifiers for software search
Author :
Wiese, Andrew ; Ho, Valerie ; Hill, Emily
Author_Institution :
Dept. of Comput. Sci., Montclair State Univ., Montclair, NJ, USA
Abstract :
As the popularity of text-based source code analysis grows, the use of stemmers to strip suffixes has increased. Stemmers have been used to more accurately determine relevance between a keyword query and methods in source code for search, exploration, and bug localization. In this paper, we investigate which traditional stemmers perform best on the domain of software, specifically, Java source code. We compare the stemmers using two case studies: a comparative analysis of the unified word classes in terms of accuracy and completeness, as well as an investigation into the effectiveness of stemming for software search. Our results indicate that relative stemmer effectiveness varies with a software engineering tool such as search, justifying further research into this area.
Keywords :
Java; program debugging; software tools; Java source code; bug localization; keyword query; software engineering tool; software search stemming; source code identifiers; suffix stripping; text-based source code analysis; Accuracy; Adders; Context; Humans; Software; Software engineering; source code search; stemming; textual analysis of source code identifiers
Conference_Title :
Software Maintenance (ICSM), 2011 27th IEEE International Conference on
Conference_Location :
Williamsburg, VA
Print_ISBN :
978-1-4577-0663-9
ISSN :
1063-6773
DOI :
10.1109/ICSM.2011.6080817