DocumentCode :
3079840
Title :
Preprocessors in NLP applications: In the context of English to Malayalam Machine Translation
Author :
Sunil, R. ; Jayan, V. ; Bhadran, V.K.
Author_Institution :
Language Technol. Centre, Centre for Dev. of Adv. Comput. (C-DAC), Trivandrum, India
fYear :
2012
fDate :
7-9 Dec. 2012
Firstpage :
221
Lastpage :
226
Abstract :
Preprocessing the input text is an essential component of a Natural Language Processing (NLP) system. We discuss the relevance of preprocessors in the context of a Machine Translation system that we developed based on AnglaBharati technology. Text submitted for translation often contains special formats, and translating them appropriately is a difficult task. Such formats may lack a definite grammatical structure, so they cannot always be handled by a language rule. This paper presents a strategy for identifying special formats in English text, such as dates, currency, numbers, times, quotes, acronyms, and parentheses, for a rule-based English-Malayalam Machine Aided Translation system. AnglaBharati is a pattern-directed, rule-based system with a context-free-grammar-like structure for English, which generates a pseudo-target for a group of Indian languages. The preprocessor is one of the main modules in this translation system: it manipulates the English input text to produce input better suited for the engine to generate an appropriate translation. Extensive research has been carried out in this area to disambiguate and process the input text in order to obtain more suitable output from the translation engine.
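As a rough illustration of the kind of special-format identification the abstract describes, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the actual rules, so the regular expressions, the label names, and the function name tag_special_formats are all hypothetical, and only a few of the listed formats (date, time, currency, acronym, number) are covered.

```python
import re

# Hypothetical patterns for illustration only; the paper's real rules
# are not stated in the abstract. Order matters: more specific formats
# (date, time, currency) are tried before the generic number pattern.
PATTERNS = [
    ("DATE", r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),
    ("TIME", r"\b\d{1,2}:\d{2}(?:\s?[AaPp]\.?[Mm]\.?)?"),
    ("CURRENCY", r"(?:Rs\.?\s?|\$)\d[\d,]*(?:\.\d+)?"),
    ("ACRONYM", r"\b(?:[A-Z]\.){2,}|\b[A-Z]{2,}\b"),
    ("NUMBER", r"\b\d[\d,]*(?:\.\d+)?\b"),
]

# One combined regex with a named group per format, so that each match
# reports which format it belongs to via match.lastgroup.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS))


def tag_special_formats(text):
    """Return (label, matched_text) pairs in left-to-right order."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]


if __name__ == "__main__":
    sample = "The meeting on 12/05/2012 at 10:30 AM cost Rs. 1,500 per NLP seat for 3 days."
    for label, span in tag_special_formats(sample):
        print(label, repr(span))
```

A preprocessor built along these lines could replace each tagged span with a placeholder before the text reaches the translation engine and restore a suitably formatted target-language form afterwards; that design is an assumption here, not something the abstract specifies.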
Keywords :
language translation; AnglaBharati technology; English; Malayalam; NLP applications; Natural Language Processing system; machine translation; preprocessors; pseudo target; Context; Data preprocessing; Engines; Helium; Knowledge based systems; Natural language processing; Terminology; AnglaBharati; machine translation; preprocessor;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
India Conference (INDICON), 2012 Annual IEEE
Conference_Location :
Kochi
Print_ISBN :
978-1-4673-2270-6
Type :
conf
DOI :
10.1109/INDCON.2012.6420619
Filename :
6420619