Title :
Categorizing sentence structures for phrase level morphological analyzer for English to Hindi RBMT
Author :
Shukla, Seema ; Sinha, Usha
Author_Institution :
Dept. of Comput. Sci. & Eng., JSS Acad. of Tech. Educ., Noida, India
Abstract :
Algorithms for morphological analyzers have evolved mainly around words. Because writing styles are changing as languages influence one another, more capable morphological analyzers are needed for NLP systems such as Machine Translation, Knowledge Extraction, and Information Retrieval. Word-level morphological analyzers typically rely on language grammars and on a knowledge set covering GNP (gender, number, person) and a dictionary. Some algorithms also use phrasal dictionaries. However, the influence of languages on each other changes the GNP, grammatical, and phrasal usage of words, and general morphological analysis algorithms cannot handle such usage of words or phrases. A new generation of morphological analyzers is therefore needed to handle cross-lingual impact. This paper proposes a methodology for an English-language morphological analyzer that interprets phrases and groups of words to derive knowledge in Hindi for the tourism domain. The methodology, although general, is oriented towards Machine Translation. It is based on creating a knowledge base for morphological analyzers using formulations of finite state transducers (FST) and recursive transition networks (RTN). Using this methodology, ten categories of phrasal structures in sentences have been identified; when used in the morphological analyzer (MA) of a rule-based machine translation (RBMT) system, they would improve the functional efficiency of MT in producing correct translations.
Keywords :
grammars; language translation; natural language processing; English RBMT; FST; GNP; Hindi RBMT; NLP systems; RTN; cross lingual impact; finite state transducers; information retrieval; knowledge base; knowledge extraction; language grammars; phrasal dictionary; phrasal structures; phrase level morphological analyzer; recursive transition networks; rule based machine translation; sentence structure categorization; tourism domain; Cities and towns; Dictionaries; Economic indicators; Grammar; Knowledge based systems; Noise; Finite State Transducers (FST); Knowledge Base; Morphological Analyzer (MA); NLP processes; Recursive Transition Networks (RTN); Rule Based Machine Translation (RBMT); bilingual corpus; noise;
Conference_Title :
Cognitive Computing and Information Processing (CCIP), 2015 International Conference on
Conference_Location :
Noida
DOI :
10.1109/CCIP.2015.7100741