• DocumentCode
    2518221
  • Title

    A Proposal for a Semantic Intelligent Document Repository Architecture

  • Author

    Rodríguez, Alejandro ; Colomo, Ricardo ; Gómez, Juan Miguel ; Alor-Hernandez, Giner ; Posada-Gomez, Ruben ; Juarez-Martinez, Ulises ; Gayo, Jose Emilio Labro ; Vidyasankar, Krishnamurthy

  • Author_Institution
    Comput. Sci. Dept., Univ. Carlos III de Madrid, Leganes, Spain
  • fYear
    2009
  • fDate
    22-25 Sept. 2009
  • Firstpage
    69
  • Lastpage
    75
  • Abstract
    Processing a large number of documents is a highly complex challenge, which becomes even more complicated when the goal is to extract the semantically relevant data within the documents. The large-scale processing of immense repositories of knowledge requires information extraction techniques that facilitate the subsequent classification and indexing of texts. Taking this into account, we propose the use of Dublin Core metadata for the classification of Software Engineering publications. Based on the information obtained from Dublin Core, we present a global repository that is populated automatically and takes the form of an ontology representing the distinct areas of Software Engineering knowledge, inspired by SWEBOK (Software Engineering Body of Knowledge). Finally, the classification of texts within the ontology is carried out in three steps: keyword analysis, processing of the document with a linguistic text classification method, and heuristics. We believe that our proposal, based on the subsequent intersection of the three techniques mentioned, generates more precise search results in response to user queries.
  • Keywords
    database indexing; ontologies (artificial intelligence); software engineering; text analysis; Dublin Core metadata; document processing; global repository; information extraction; keyword analysis; linguistic text classification method; ontology; semantic intelligent document repository architecture; software engineering knowledge; software engineering publication; text indexing; Computer science; Data mining; Information retrieval; Intelligent robots; Internet; Ontologies; Proposals; Software engineering; Support vector machine classification; Support vector machines; semantic Web
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electronics, Robotics and Automotive Mechanics Conference, 2009. CERMA '09.
  • Conference_Location
    Cuernavaca, Morelos
  • Print_ISBN
    978-0-7695-3799-3
  • Type

    conf

  • DOI
    10.1109/CERMA.2009.26
  • Filename
    5342009