DocumentCode :
1662465
Title :
Ontology-Based Feature Extraction
Author :
Vicient, Carlos ; Sanchez, David ; Moreno, Antonio
Author_Institution :
Dept. d'Eng. Inf. i Mat., Univ. Rovira i Virgili, Tarragona, Spain
Volume :
3
fYear :
2011
Firstpage :
189
Lastpage :
192
Abstract :
Knowledge-based data mining and classification algorithms require systems that can extract textual attributes from raw text documents and map them to structured knowledge sources (e.g. ontologies) so that they can be semantically analyzed. The system presented in this paper performs these tasks automatically, relying on a predefined ontology that states the concepts on which the subsequent data analysis will focus. As features, our system extracts relevant Named Entities from textual resources describing a particular entity. These are evaluated by means of linguistic and Web-based co-occurrence analyses to map them to ontological concepts, thereby discovering relevant features of the object. The system has been preliminarily tested with tourist destinations and Wikipedia textual resources, showing promising results.
Keywords :
data analysis; data mining; information retrieval; knowledge based systems; ontologies (artificial intelligence); pattern classification; text analysis; Web-based co-occurrence analysis; Wikipedia textual resources; classification algorithms; data analysis; knowledge sources; knowledge-based data mining; linguistic analysis; named entities; ontology-based feature extraction; text documents; textual attributes extraction; tourist destinations; Data mining; Electronic publishing; Feature extraction; Information services; Internet; Ontologies; Semantics; Information Extraction; Linguistic Patterns; Ontologies; Web-based statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location :
Lyon
Print_ISBN :
978-1-4577-1373-6
Electronic_ISBN :
978-0-7695-4513-4
Type :
conf
DOI :
10.1109/WI-IAT.2011.199
Filename :
6040837