DocumentCode :
2484199
Title :
Part of Speech Tagging for Hindi Corpus
Author :
Mishra, Nidhi ; Mishra, Amit
Author_Institution :
Buddha Inst. of Technol., Gorakhpur, India
fYear :
2011
fDate :
3-5 June 2011
Firstpage :
554
Lastpage :
558
Abstract :
The widespread use of the Internet for information search has increased the use of computational linguistics, because most search systems use a bag-of-words model, which causes retrieval problems due to polysemy, homonymy, and synonymy [9][3]. This has led to a shift in the accepted boundary between the kinds of query information submitted by humans and the kinds of further interpretation, in the form of annotation of query information, that can be done to get better results [1][4]. In this regard, the objective of this paper is the process of annotating the words in a text according to their part of speech. Furthermore, POS tagging is much harder than making a list of words and their parts of speech, as most words can take more than one part of speech in different contexts, and some parts of speech of these words are rather complex or unspoken [5][6]. A large number of POS taggers are available for English with satisfactory performance, but they cannot be applied to Hindi due to structural differences [8]. This paper aims at part-of-speech tagging for a Hindi corpus, as the number of Hindi documents on the Internet is growing.
Keywords :
computational linguistics; document handling; query processing; Hindi corpus; Hindi document; Internet; POS tagging; bag-of-words; homonymy; information search; part-of-speech tagging; polysemy; query information; synonymy; Software; Speech; Speech processing; Tagging; Testing; corpus; dictionary look up; lexicon
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communication Systems and Network Technologies (CSNT), 2011 International Conference on
Conference_Location :
Katra, Jammu
Print_ISBN :
978-1-4577-0543-4
Electronic_ISBN :
978-0-7695-4437-3
Type :
conf
DOI :
10.1109/CSNT.2011.118
Filename :
5966508
Link To Document :