• DocumentCode
    2333362
  • Title
    A comparison of stemmers on source code identifiers for software search
  • Author
    Wiese, Andrew ; Ho, Valerie ; Hill, Emily
  • Author_Institution
    Dept. of Comput. Sci., Montclair State Univ., Montclair, NJ, USA
  • fYear
    2011
  • fDate
    25-30 Sept. 2011
  • Firstpage
    496
  • Lastpage
    499
  • Abstract
    As the popularity of text-based source code analysis grows, the use of stemmers to strip suffixes has increased. Stemmers have been used to more accurately determine relevance between a keyword query and methods in source code for search, exploration, and bug localization. In this paper, we investigate which traditional stemmers perform best on the domain of software, specifically, Java source code. We compare the stemmers using two case studies: a comparative analysis of the unified word classes in terms of accuracy and completeness, as well as an investigation into the effectiveness of stemming for software search. Our results indicate that relative stemmer effectiveness varies with a software engineering tool such as search, justifying further research into this area.
  • Keywords
    Java; program debugging; software tools; Java source code; bug localization; keyword query; software engineering tool; software search stemming; source code identifiers; suffix stripping; text-based source code analysis; Accuracy; Adders; Context; Humans; Software; Software engineering; source code search; stemming; textual analysis of source code identifiers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance (ICSM), 2011 27th IEEE International Conference on
  • Conference_Location
    Williamsburg, VA
  • ISSN
    1063-6773
  • Print_ISBN
    978-1-4577-0663-9
  • Electronic_ISBN
    1063-6773
  • Type
    conf
  • DOI
    10.1109/ICSM.2011.6080817
  • Filename
    6080817