DocumentCode :
169973
Title :
Using Dominance Chains to Detect Annotation Variants in Parsed Corpora
Author :
Faria, Pedro
Volume :
2
fYear :
2014
fDate :
20-24 Oct. 2014
Firstpage :
25
Lastpage :
32
Abstract :
This paper presents results on detecting annotation variation in parsed corpora, or treebanks. Treebanks are generally built using both automatic tools (i.e., taggers and parsers) and human intervention. In this process, inconsistencies (and thus variation) in the annotation arise from a number of factors, such as disagreement in interpretation or incomplete or unclear annotation guidelines. In this study, the algorithm for automatic detection of variation proposed in [1] is evaluated against the Tycho Brahe Corpus (TBC, [2]) and compared to an alternative implementation in which annotation variants are characterized by means of "dominance chains". Experimental results show that the modified version has better relative precision and recall than the original method.
Keywords :
computational linguistics; database management systems; TBC; Tycho Brahe Corpus; annotation guidelines; annotation variants; automatic tools; dominance chains; human intervention; parsed corpora; tree banks; Accuracy; Complexity theory; Guidelines; Natural languages; Pragmatics; Redundancy; Syntactics; dominance chain; inconsistency detection; syntactic annotation; treebank
fLanguage :
English
Publisher :
IEEE
Conference_Title :
e-Science (e-Science), 2014 IEEE 10th International Conference on
Conference_Location :
São Paulo
Print_ISBN :
978-1-4799-4288-6
Type :
conf
DOI :
10.1109/eScience.2014.17
Filename :
6972092