• DocumentCode
    2029528
  • Title
    A multi-block scheme for searching source codes
  • Author
    Hsu, Sheng-Kuei; Lin, Shi-Jen
  • Author_Institution
    Dept. of Inf. Manage., Nanya Inst. of Technol., Jhongli, Taiwan
  • fYear
    2010
  • fDate
    16-18 Dec. 2010
  • Firstpage
    608
  • Lastpage
    613
  • Abstract
    The large number of software source code projects available on the Internet or within companies creates new information retrieval challenges. Present-day source code search engines, such as Google Code Search, tend to treat source code as pure text, as they do web pages. However, source code files differ from web pages and pure text files in that each file may contain blocks that express different functions. Developers use API-statements to implement functions within a source file and comments to indicate the meaning of each function. In this paper, we segment source code files into two types of blocks, code-data blocks and metadata blocks, each with its own stemming and stop-word filtering process for building the source code index. We then propose an auto block-specified query processing approach that helps users search for a specific code block within a source file. Experimental results indicate that our approach provides a more flexible source code search mechanism that allows a greater number of relevant items to be found.
  • Keywords
    Internet; information filtering; meta data; search engines; software engineering; text analysis; API-statements; Google Code Search; auto block-specified query processing; code-data block; information retrieval; metadata block; multiblock scheme; software source code searching; source code files; source code search engine; word filtering; Google; Indexing; Search engines; Software; Web pages; code retrieval; code search; multi-block scheme; vector space model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Symposium (ICS), 2010 International
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-7639-8
  • Type
    conf
  • DOI
    10.1109/COMPSYM.2010.5685441
  • Filename
    5685441
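
The two-block idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that comments form the metadata block and all remaining text forms the code-data block, that metadata terms are lowercased, stop-word filtered and stemmed while code identifiers are kept case-sensitive and unstemmed, and that the caller names the block to search (the paper's automatic block selection is not described in the abstract). The stop-word lists, the suffix-stripping stemmer, the raw term-frequency scoring, and every function name below are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word lists; the abstract does not give the actual ones.
METADATA_STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "this"}
CODE_STOP_WORDS = {"public", "private", "static", "void", "return", "new", "int", "class"}

# Matches // line comments and /* ... */ block comments.
# Simplification: comment markers inside string literals are not handled.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def split_blocks(source):
    """Return (metadata_text, code_text) for one source file."""
    comments = COMMENT_RE.findall(source)
    code = COMMENT_RE.sub(" ", source)
    return " ".join(comments), code


def naive_stem(token):
    """Crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def metadata_terms(text):
    """Natural-language pipeline: lowercase, drop stop words, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in METADATA_STOP_WORDS]


def code_terms(text):
    """Code pipeline: keep identifiers as written, drop language keywords."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)
    return [t for t in tokens if t not in CODE_STOP_WORDS]


def build_index(files):
    """Build one inverted index per block type: term -> {file name: count}."""
    index = {"metadata": defaultdict(Counter), "code": defaultdict(Counter)}
    for name, source in files.items():
        meta, code = split_blocks(source)
        for term in metadata_terms(meta):
            index["metadata"][term][name] += 1
        for term in code_terms(code):
            index["code"][term][name] += 1
    return index


def search(index, block, query):
    """Rank files by summed term frequency within the chosen block."""
    terms = metadata_terms(query) if block == "metadata" else code_terms(query)
    scores = Counter()
    for term in terms:
        for name, count in index[block].get(term, {}).items():
            scores[name] += count
    return [name for name, _ in scores.most_common()]
```

With an index built this way, a query such as `search(index, "metadata", "sorting lists")` is matched only against comment text, while `search(index, "code", "readLine")` is matched only against identifiers, so the same word can be searched for in one block without pulling in hits from the other.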