Title :
TaxoLearn: A Semantic Approach to Domain Taxonomy Learning
Author :
Dietz, E. ; Vandic, D. ; Frasincar, Flavius
Author_Institution :
Econometric Inst., Erasmus Univ., Rotterdam, Netherlands
Abstract :
Building domain taxonomies is a crucial task in ontology construction. Domain taxonomy learning is becoming increasingly important as a way to automatically obtain a knowledge representation of a given domain. The alternative, developing domain taxonomies manually, is not trivial: the main issues are the unavailability of a domain knowledge expert and the considerable effort the task requires. This paper proposes TaxoLearn, an approach to the automatic construction of domain taxonomies. TaxoLearn is a new methodology that combines aspects of existing approaches with new steps intended to improve the quality of the resulting domain taxonomy. The contribution of this paper is threefold. First, we employ a word sense disambiguation step when detecting concepts in the text. Second, we show the use of semantics-based hierarchical clustering for taxonomy learning. Third, we propose a novel dynamic labeling procedure for the concept clusters. We evaluate our approach by comparing the machine-generated taxonomy with a manually constructed golden taxonomy. On a corpus of documents in the field of financial economics, TaxoLearn shows high precision for the learned taxonomic concept relationships.
Keywords :
financial management; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); pattern clustering; text analysis; TaxoLearn approach; automatic domain taxonomy construction; concept clusters; document corpus; domain knowledge expert nonavailability; domain taxonomy learning quality improvement; dynamic labeling procedure; financial economics; knowledge representation; machine generated taxonomy; manually constructed golden taxonomy; ontology construction domain; semantic-based hierarchical clustering; text concept detection; word sense disambiguation; concept learning; taxonomy learning
Conference_Title :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-6057-9
DOI :
10.1109/WI-IAT.2012.129