DocumentCode
2518221
Title
A Proposal for a Semantic Intelligent Document Repository Architecture
Author
Rodríguez, Alejandro ; Colomo, Ricardo ; Gómez, Juan Miguel ; Alor-Hernandez, Giner ; Posada-Gomez, Ruben ; Juarez-Martinez, Ulises ; Gayo, Jose Emilio Labro ; Vidyasankar, Krishnamurthy
Author_Institution
Comput. Sci. Dept., Univ. Carlos III de Madrid, Leganes, Spain
fYear
2009
fDate
22-25 Sept. 2009
Firstpage
69
Lastpage
75
Abstract
The processing of large numbers of documents is a highly complex challenge, which becomes even more complicated when the goal is to extract the semantically relevant data within the documents. The large-scale processing of immense repositories of knowledge requires techniques that perform information extraction to facilitate the subsequent classification and indexing of texts. Taking this into account, we propose the use of Dublin Core metadata for the classification of Software Engineering publications. Based on the information obtained from Dublin Core, we present a global repository that is populated automatically and takes the form of an ontology representing the distinct areas of Software Engineering knowledge, inspired by SWEBOK (Software Engineering Body of Knowledge). Finally, the classification of texts within the ontology is carried out in three steps: keyword analysis, processing of the document based on a linguistic text classification method, and heuristics. We believe that subsequently intersecting the results of the three techniques generates more precise search results in response to user queries.
Keywords
database indexing; ontologies (artificial intelligence); software engineering; text analysis; Dublin core metadata; document processing; global repository; information extraction; keyword analysis; linguistic text classification method; ontology; semantic intelligent document repository architecture; software engineering knowledge; software engineering publication; text indexing; Computer science; Data mining; Information retrieval; Intelligent robots; Internet; Ontologies; Proposals; Software engineering; Support vector machine classification; Support vector machines; semantic Web
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics, Robotics and Automotive Mechanics Conference, 2009. CERMA '09.
Conference_Location
Cuernavaca, Morelos
Print_ISBN
978-0-7695-3799-3
Type
conf
DOI
10.1109/CERMA.2009.26
Filename
5342009