• DocumentCode
    587337
  • Title

    Extracting, identifying and visualisation of the content in software projects

  • Author

    Uhlar, M. ; Polasek, Ivan

  • Author_Institution
    Fac. of Inf. & Inf. Technol., Slovak Univ. of Technol. in Bratislava, Bratislava, Slovakia
  • fYear
    2012
  • fDate
    5-9 Nov. 2012
  • Firstpage
    72
  • Lastpage
    78
  • Abstract
    The paper proposes a method for extracting, identifying and visualising topics in software projects. In addition to standard information retrieval techniques, we use the AST and the WordNet ontology to enrich document vectors extracted from parsed source code, LSI to reduce their dimensionality, and swarm intelligence, in the form of bee-behaviour-inspired algorithms, to cluster the documents. We extract topics from the identified clusters and visualise them in a 3D graph. The goal is to give development participants insight into software projects when analysing and reusing source code.
  • Keywords
    data visualisation; graph theory; information retrieval; ontologies (artificial intelligence); software engineering; source coding; vectors; 3D graph; AST; LSI; WordNet ontology; content extraction; content identification; content visualisation; document vectors; information retrieval; parsed source code; software projects; Clustering algorithms; Indexes; Large scale integration; Software; Software algorithms; Vectors; Visualization; AST; Bee Behaviour Inspired Algorithms; Latent Semantic Indexing; Software Project; Source Code; Swarm Intelligence; Topic Identification and Extraction; Visualisation; WordNet Ontology;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Nature and Biologically Inspired Computing (NaBIC), 2012 Fourth World Congress on
  • Conference_Location
    Mexico City
  • Print_ISBN
    978-1-4673-4767-9
  • Type

    conf

  • DOI
    10.1109/NaBIC.2012.6402242
  • Filename
    6402242