DocumentCode
1662465
Title
Ontology-Based Feature Extraction
Author
Vicient, Carlos ; Sanchez, Dominick ; Moreno, Antonio
Author_Institution
Dept. d'Eng. Inf. i Mat., Univ. Rovira i Virgili, Tarragona, Spain
Volume
3
fYear
2011
Firstpage
189
Lastpage
192
Abstract
Knowledge-based data mining and classification algorithms require systems that are able to extract textual attributes contained in raw text documents and map them to structured knowledge sources (e.g. ontologies) so that they can be semantically analyzed. The system presented in this paper performs this task automatically, relying on a predefined ontology that states the concepts on which the subsequent data analysis will focus. As features, our system extracts relevant Named Entities from textual resources describing a particular entity. These are evaluated by means of linguistic and Web-based co-occurrence analyses to map them to ontological concepts, thereby discovering relevant features of the object. The system has been preliminarily tested with tourist destinations and Wikipedia textual resources, showing promising results.
Keywords
data analysis; data mining; information retrieval; knowledge based systems; ontologies (artificial intelligence); pattern classification; text analysis; Web-based co-occurrence analysis; Wikipedia textual resources; classification algorithms; data analysis; knowledge sources; knowledge-based data mining; linguistic analysis; named entities; ontology-based feature extraction; text documents; textual attributes extraction; tourist destinations; Data mining; Electronic publishing; Feature extraction; Information services; Internet; Ontologies; Semantics; Information Extraction; Linguistic Patterns; Ontologies; Web-based statistics;
fLanguage
English
Publisher
ieee
Conference_Titel
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location
Lyon
Print_ISBN
978-1-4577-1373-6
Electronic_ISBN
978-0-7695-4513-4
Type
conf
DOI
10.1109/WI-IAT.2011.199
Filename
6040837