DocumentCode
3696617
Title
Compilation and evaluation of paraphrase representation list of compound verbs: Toward development of “Control language for action”
Author
Tomoya Shirai;Kyoko Kanzaki;Hirofumi Yabumoto;Hitoshi Isahara
Author_Institution
Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan
fYear
2015
Firstpage
1
Lastpage
5
Abstract
To realize friendly man-machine communication, machines must understand not only the surface expressions of human utterances but also the deeper meanings of human behavior. We began compiling a “paraphrase representation list of compound verbs” as a first step toward investigating and standardizing the lexical items that form part of a “control language for action”. We processed the corpus and vectorized the data using Word2Vec. Using the resulting vectors, we computed the cosine similarity between the compound verbs and the verbs in the corpus, and created a paraphrase representation list. We obtained paraphrase expressions for 1899 of the 3289 compound verbs (including orthographic variants) stored in the compound verb lexicon. This method found words that do not exist in the Japanese WordNet. We investigated the words that appear only in the results of automatic extraction, and found 213 unknown words and 227 new synonymous relationships. The difference of 14 between the new synonymous relationships and the unknown words is worth special mention: it means that we found 14 words that are stored in the Japanese WordNet but are not listed there as synonyms of the word in question. We conclude that the proposed method is useful for expanding paraphrase relationships compiled by human intuition.
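The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors stand in for Word2Vec output, the romanized verbs are illustrative only, and the `top_n` and `threshold` parameters are assumptions that the abstract does not mention.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def paraphrase_candidates(compound_vec, verb_vectors, top_n=3, threshold=0.5):
    """Rank verbs by cosine similarity to one compound verb.

    top_n and threshold are hypothetical cut-offs chosen for this sketch.
    """
    scored = [
        (verb, cosine_similarity(compound_vec, vec))
        for verb, vec in verb_vectors.items()
    ]
    kept = [(verb, score) for verb, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]


# Toy vectors (assumed values, standing in for Word2Vec embeddings).
compound_vectors = {"tobi-dasu": [0.9, 0.1, 0.3]}
verb_vectors = {
    "deru": [0.8, 0.2, 0.3],
    "taberu": [-0.1, 0.9, 0.0],
    "hashiru": [0.6, 0.0, 0.7],
}

# Paraphrase representation list: compound verb -> ranked candidate verbs.
paraphrase_list = {
    compound: paraphrase_candidates(vec, verb_vectors)
    for compound, vec in compound_vectors.items()
}

for compound, candidates in paraphrase_list.items():
    print(compound, [(verb, round(score, 3)) for verb, score in candidates])
```

In a real setting the vectors would come from a Word2Vec model trained on the processed corpus, and candidates absent from an existing thesaurus such as the Japanese WordNet would be the ones of interest.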
Keywords
"Compounds","Data mining","Manuals","Man machine systems","Syntactics","Computer science","Thesauri"
Publisher
ieee
Conference_Titel
Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
Print_ISBN
978-1-4673-8142-0
Type
conf
DOI
10.1109/ICAICTA.2015.7335351
Filename
7335351