DocumentCode :
1618590
Title :
Large scale similarity-based relation expansion
Author :
Tsuchida, Masaaki ; De Saeger, Stijn ; Torisawa, Kentaro ; Murata, Masaki ; Kazama, Jun'ichi ; Kuroda, Kow ; Ohwada, Hayato
Author_Institution :
Language Infrastruct. Group, Nat. Inst. of Inf. & Commun. Technol., Kyoto, Japan
fYear :
2010
Firstpage :
141
Lastpage :
148
Abstract :
Recent advances in automatic knowledge acquisition methods make it possible to construct massive knowledge bases of semantic relations, containing information potentially unknown to their users. However, for certain data mining tasks, such as finding potential causes of a disease or side effects of a drug, where missing a small piece of information can have grave consequences, the coverage of automatically acquired knowledge bases is often insufficient. This paper explores the use of automatic hypothesis generation for expanding a knowledge base of semantic relations, using distributional word similarities obtained from a large Web corpus. If successful, such a method can drastically improve the coverage of automatically acquired semantic relations, at the expense of a slight reduction in accuracy. We show that large scale similarity-based relation expansion works quite well for this purpose. Using a corpus of 100 million Japanese Web pages as input, we were able to generate a substantial number of new semantic relations that were not found in the input corpus but whose validity was confirmed in a much larger Web corpus, i.e., by using a commercial Web search engine.
Keywords :
Internet; knowledge acquisition; Japanese Web page corpus; Web corpus; automatic hypothesis generation; automatic knowledge acquisition methods; commercial Web search engine; large scale similarity based relation expansion; massive knowledge bases; semantic relations; Cleaning; Copper; Filtering; Knowledge engineering; Materials; Semantics; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
Type :
conf
DOI :
10.1109/IUCS.2010.5666758
Filename :
5666758