DocumentCode
1910613
Title
Anaphora Resolution in Hindi Documents
Author
Agarwal, Sachin ; Srivastava, Manaj ; Agarwal, Pallavi ; Sanyal, Ratna
Author_Institution
Indian Inst. of Inf. Technol., Allahabad
fYear
2007
fDate
Aug. 30 2007-Sept. 1 2007
Firstpage
452
Lastpage
458
Abstract
This paper presents anaphora resolution as a technique for semantic analysis of text documents written in the Hindi language. The focus is on texts that mainly employ simple sentences, such as children's stories and short essays. The technique works by locating sentences in the text that are semantically related through anaphors, analyzing their semantics, and using that analysis to resolve the referents of the respective anaphors. The approach is based on matching constraints on the grammatical attributes of different words. The algorithm has been tested extensively: accuracy is nearly 96% for simple sentences and on the order of 80% for compound and complex sentences. The causes of the errors are analyzed and possible techniques for improvement are discussed.
Keywords
grammars; knowledge representation; natural languages; pattern matching; text analysis; Hindi language; anaphora resolution; semantic text document analysis; Algorithm design and analysis; Data mining; Genetics; Information retrieval; Information technology; Performance analysis; Speech; Tellurium; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-1611-0
Electronic_ISBN
978-1-4244-1611-0
Type
conf
DOI
10.1109/NLPKE.2007.4368070
Filename
4368070