• DocumentCode
    3691696
  • Title
    Heuristic-based part-of-speech tagging of source code identifiers and comments
  • Author
    Reem S. Alsuhaibani;Christian D. Newman;Michael L. Collard;Jonathan I. Maletic
  • Author_Institution
    Computer Science, Kent State University, Kent, OH, USA
  • fYear
    2015
  • fDate
    9/1/2015 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    An approach is presented for using heuristics and static program analysis information to mark up part-of-speech for program identifiers. It does not use a natural language part-of-speech tagger for identifiers within the code. Instead, a set of heuristics is defined that mirrors natural language usage, based on how identifiers are used in code. Additionally, automatically derived method stereotype information is used in the tagging process. The approach is built on the srcML infrastructure and adds part-of-speech information directly into the srcML markup.
  • Keywords
    "Speech","Object recognition","Tagging","Natural languages","Conferences","Software","Computational linguistics"
  • Publisher
    ieee
  • Conference_Titel
    Mining Unstructured Data (MUD), 2015 IEEE 5th Workshop on
  • Type
    conf
  • DOI
    10.1109/MUD.2015.7327960
  • Filename
    7327960