DocumentCode
653731
Title
Linking book characters toward a corpus encoding relations between entities
Author
Cristea, D.; Ignat, Eugen
Author_Institution
Dept. of Comput. Sci., Alexandru Ioan Cuza Univ. of Iasi, Iasi, Romania
fYear
2013
fDate
16-19 Oct. 2013
Firstpage
1
Lastpage
8
Abstract
What does a novel bring to a reader? What can it bring to a machine? Is there a chance that a machine will decipher the messages a book expresses in free language? Part of the content of a text is encoded in relations between entities. To decode them, algorithms use learning techniques in which training is guided by corpora that make entities and relations explicit. The creation of a gold corpus to be used in training and evaluation is therefore a primary concern. This paper proposes annotation conventions and methodological prerequisites for the creation of a corpus that highlights the characters in a book and the relations mentioned as holding between them, of four types: anaphoric, affective, kinship and social. The language under investigation is Romanian and the type of text used is fiction, but the proposed conventions are intended to be applicable to any language and type of text.
Keywords
learning (artificial intelligence); literature; natural language processing; text analysis; Romanian language; affective type; anaphoric type; annotation convention; book character linking; corpus creation; corpus encoding relations; evidence characters; fiction text; kinship type; learning technique; methodological prerequisites; social type; Gold; Joining processes; Knowledge based systems; Semantics; Syntactics; Training; XML; anaphoric relations; annotated corpora; annotation conventions; content analysis; entity linking; semantic relations; text analytics; text understanding
fLanguage
English
Publisher
IEEE
Conference_Titel
Speech Technology and Human-Computer Dialogue (SpeD), 2013 7th Conference on
Conference_Location
Cluj-Napoca
Type
conf
DOI
10.1109/SpeD.2013.6682658
Filename
6682658