Title :
Evaluating effect of context window size, stemming and stop word removal on Hindi word sense disambiguation
Author :
Singh, Sushil ; Siddiqui, Tanveer J.
Author_Institution :
Dept. of Electron. & Commun., Univ. of Allahabad, Allahabad, India
Abstract :
This paper investigates the effects of stemming, stop word removal and context window size on Hindi word sense disambiguation. The evaluation was carried out on a manually created sense-tagged corpus of Hindi nouns. The sense definitions were obtained from Hindi WordNet, an important lexical resource for the Hindi language developed at IIT Bombay. The maximum observed precision, 54.81% on 1248 test instances, was obtained when both stemming and stop word elimination were performed. The improvements in precision and recall over the baseline are 9.24% and 12.68%, respectively.
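As an illustration of the three factors studied in the abstract, the following is a minimal sketch of a simplified Lesk-style (gloss overlap) disambiguator with a configurable context window, optional stop word removal and optional stemming. It is not the authors' implementation. The stop word list, the sense glosses in `SENSES`, the crude `stem` function and all function names are hypothetical stand-ins, and toy English data is used in place of Hindi WordNet glosses and a Hindi stemmer.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy resources (assumption): the paper itself uses Hindi WordNet
# sense definitions; these English glosses only illustrate the mechanics.
STOP_WORDS: Set[str] = {"the", "a", "an", "of", "in", "on", "to", "and",
                        "is", "that", "by", "at", "for"}

SENSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank_finance": "an institution that accepts deposits of money and lends loans",
        "bank_river": "the sloping land at the side of a river or stream",
    }
}


def stem(token: str) -> str:
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tokens: List[str], use_stemming: bool,
               remove_stop_words: bool) -> Set[str]:
    """Apply the optional stop word filter and stemmer, return a term set."""
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [stem(t) for t in tokens]
    return set(tokens)


def context_window(tokens: List[str], index: int, size: int) -> List[str]:
    """Return up to `size` tokens on each side of the target, excluding it."""
    left = tokens[max(0, index - size): index]
    right = tokens[index + 1: index + 1 + size]
    return left + right


def disambiguate(tokens: List[str], index: int, window_size: int,
                 use_stemming: bool = True,
                 remove_stop_words: bool = True) -> Optional[str]:
    """Pick the sense whose gloss shares the most terms with the context."""
    context = preprocess(context_window(tokens, index, window_size),
                         use_stemming, remove_stop_words)
    best_sense, best_overlap = None, 0
    for sense_id, gloss in SENSES.get(tokens[index], {}).items():
        gloss_terms = preprocess(gloss.split(), use_stemming, remove_stop_words)
        overlap = len(context & gloss_terms)
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense  # None when no gloss overlaps the context


if __name__ == "__main__":
    river = "he sat on the bank of the river".split()
    # A window of 1 sees only stop words; a window of 3 reaches "river".
    print(disambiguate(river, 4, window_size=1))
    print(disambiguate(river, 4, window_size=3))
```

In this toy setting, a wider window lets the context reach an informative word, and stemming lets "deposit" in the context match "deposits" in a gloss. Whether either change helps on real data is an empirical question, which is what the paper evaluates.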
Keywords :
natural language processing; Hindi WordNet; Hindi language; Hindi word sense disambiguation; IIT Bombay; context window size; context window stemming; effect evaluation; lexical resource; stop word removal; stop words elimination; Accuracy; Context; Dictionaries; Semantics; Silicon; Vectors; Dictionary-based disambiguation; Lesk-based Hindi WSD; Word sense disambiguation
Conference_Title :
Information Retrieval & Knowledge Management (CAMP), 2012 International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4673-1091-8
DOI :
10.1109/InfRKM.2012.6204972