DocumentCode :
570176
Title :
Evaluating and enhancing cross-domain rank predictability of textual entailment datasets
Author :
Lee, Cheng-Wei ; Lin, Chuan-Jie ; Shima, Hideki ; Hsu, Wen-Lian
Author_Institution :
Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
fYear :
2012
fDate :
8-10 Aug. 2012
Firstpage :
51
Lastpage :
58
Abstract :
Textual Entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations between a given text pair. The goal of textual entailment research is to develop a core inference component that can be applied to various domains, such as IR or NLP. Since the domain that a TE system applies to may be different from its source domain, it is crucial to develop proper datasets for measuring the cross-domain ability of a TE system. We propose using Kendall's tau to measure a dataset's cross-domain rank predictability. Our analysis shows that incorporating “artificial pairs” into a dataset helps enhance its rank predictability. We also find that the completeness of guidelines has no obvious effect on the rank predictability of a dataset. To validate these findings, more investigation is needed; however, these findings suggest some new directions for the creation of TE datasets in the future.
Keywords :
text analysis; Kendall's tau; TE; core inference component; enhancing cross domain rank predictability; textual entailment datasets; Accuracy; Correlation; Educational institutions; Guidelines; Humans; Standards; Text recognition; Cross-Domain Evaluation; RITE; Rank Predictability; Textual Entailment
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Reuse and Integration (IRI), 2012 IEEE 13th International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4673-2282-9
Electronic_ISBN :
978-1-4673-2283-6
Type :
conf
DOI :
10.1109/IRI.2012.6302990
Filename :
6302990