Title :
Extracting Concept Hierarchy Knowledge from the Web Based on Property Inheritance and Aggregation
Author :
Hattori, Shun ; Tanaka, Katsumi
Author_Institution :
Grad. Sch. of Inf., Kyoto Univ., Yoshida-Honmachi, Kyoto
Abstract :
Concept hierarchy knowledge, such as hyponymy and meronymy, is very important for various natural language processing systems. While WordNet and Wikipedia are manually constructed and maintained as lexical ontologies, many researchers have tackled how to extract concept hierarchies automatically from very large corpora of text documents such as the Web. However, their methods are mostly based on lexico-syntactic patterns, which are sufficient but not necessary conditions of hyponymy and meronymy, so they achieve high precision but low recall with stricter patterns, or high recall but low precision with looser patterns. Therefore, we need necessary conditions of hyponymy and meronymy to achieve high recall without low precision. In this paper, we assume that both "Property Inheritance" from a target concept to its hyponyms and "Property Aggregation" from its hyponyms to the target concept are necessary and sufficient conditions of hyponymy, and we propose a method to extract concept hierarchy knowledge from the Web based on property inheritance and property aggregation.
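The abstract does not give the scoring details, so the following is only a minimal sketch of the two ideas under stated assumptions: each concept is represented by a hand-written set of properties (in the paper these would come from the Web), inheritance is approximated by set overlap, and aggregation by a frequency threshold. The function names, the `min_share` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def inheritance_score(target_props, candidate_props):
    """Share of the target concept's properties that the candidate also has.

    Assumption: a true hyponym inherits most properties of its hypernym,
    so a score near 1.0 supports the hyponymy hypothesis.
    """
    if not target_props:
        return 0.0
    return len(target_props & candidate_props) / len(target_props)


def aggregate_properties(hyponym_props, min_share=0.5):
    """Properties held by at least `min_share` of the given hyponyms.

    Assumption: properties common to many hyponyms can be aggregated
    upward as likely properties of the target concept.
    """
    counts = Counter(p for props in hyponym_props for p in props)
    threshold = min_share * len(hyponym_props)
    return {p for p, n in counts.items() if n >= threshold}


# Toy property sets (hypothetical; not data from the paper).
bird = {"wings", "feathers", "beak"}
sparrow = {"wings", "feathers", "beak", "small"}
crow = {"wings", "feathers", "beak", "black"}
car = {"wheels", "engine"}

print(inheritance_score(bird, sparrow))
print(inheritance_score(bird, car))
print(aggregate_properties([sparrow, crow], min_share=1.0))
```

In this toy setting a candidate that shares all of the target's properties scores highest, while an unrelated concept scores zero; a real system would need noisy property extraction and a tuned threshold rather than exact set overlap.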
Keywords :
Internet; computational linguistics; knowledge acquisition; text analysis; vocabulary; Wikipedia; WordNet; World Wide Web; concept hierarchy knowledge extraction; hyponymy condition; lexical ontology; lexico-syntactic pattern; looser pattern; meronymy condition; natural language processing system; property aggregation; property inheritance; text document; very large corpora; Intelligent agent;
Conference_Title :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
DOI :
10.1109/WIIAT.2008.394