Annotating Indirect Anaphora for Hindi: A Corpus Based Study

Author

Singh, Pardeep; Dutta, Kamlesh

Author_Institution

Comput. Sci. & Eng., Nat. Inst. of Technol., Hamirpur, India

fYear

2014

Firstpage

525

Lastpage

529

Abstract

Natural language processing requires extensive analysis of, and information about, words and sentence segments. Almost all NLP applications, such as machine translation, information extraction, automatic summarization, question answering, and natural language generation, require successful identification and resolution of anaphora. Information about words can be gathered using a POS tagger, a parser, and other tools. Hindi is a free-word-order language compared to English, which imposes additional constraints on various NLP tasks. In this working paper we present an analysis of Hindi genre. We used ten tags from the literature, seven of which were annotated manually using Botley's annotation scheme. We annotated 1540 demonstrative pronouns from twelve files of the EMILLE corpus. The input is an EMILLE file and the output is a fully annotated Unicode file.

Keywords

grammars; natural language processing; Botley annotation scheme; Hindi; NLP application; POS tagger; anaphora resolution; parser; Computational linguistics; Feature extraction; Pragmatics; Semantics; Support vector machines; Syntactics; Tagging; annotation; case marker; semantic category

fLanguage

English

Publisher

ieee

Conference_Titel

Computational Intelligence and Communication Networks (CICN), 2014 International Conference on

Print_ISBN

978-1-4799-6928-9

Type

conf

DOI

10.1109/CICN.2014.120

Filename

7065540