  • DocumentCode
    707677
  • Title
    Categorizing sentence structures for phrase level morphological analyzer for English to Hindi RBMT
  • Author
    Shukla, Seema ; Sinha, Usha
  • Author_Institution
    Dept. of Comput. Sci. & Eng., JSS Acad. of Tech. Educ., Noida, India
  • fYear
    2015
  • fDate
    3-4 March 2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Algorithms for morphological analyzers have evolved mainly around words. Because writing styles are changing as languages influence one another, more advanced morphological analyzers are needed for NLP systems such as Machine Translation, Knowledge Extraction, and Information Retrieval. Word-level morphological analyzers typically adhere to language grammars and to knowledge sets pertaining to GNP and the dictionary; some algorithms also use phrasal dictionaries. However, the influence of languages on one another changes the GNP, grammatical, and phrasal usage of words, and general morphological algorithms cannot handle such usage of words or phrases. A new generation of morphological analyzers is therefore needed to handle this cross-lingual impact. This paper proposes a methodology for an English-language morphological analyzer that interprets phrases and groups of words to derive knowledge in Hindi for the tourism domain. The methodology, although general, is oriented towards Machine Translation. It is based on creating a knowledge base for morphological analyzers using formulations of Finite State Transducers (FST) and Recursive Transition Networks (RTN). Using this methodology, ten categories of phrasal structures in sentences have been identified which, when used in the Morphological Analyzer (MA) of a Rule Based Machine Translation (RBMT) system, would improve the functional efficiency of MT in producing correct translations.
  • Keywords
    grammars; language translation; natural language processing; English RBMT; FST; GNP; Hindi RBMT; NLP systems; RTN; cross lingual impact; finite state transducers; information retrieval; knowledge base; knowledge extraction; language grammars; phrasal dictionary; phrasal structures; phrase level morphological analyzer; recursive transition networks; rule based machine translation; sentence structure categorization; tourism domain; Cities and towns; Dictionaries; Economic indicators; Grammar; Knowledge based systems; Noise; Finite State Transducers (FST); Knowledge Base; Morphological Analyzer (MA); NLP processes; Recursive Transition Networks (RTN); Rule Based Machine Translation (RBMT); bilingual corpus; noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cognitive Computing and Information Processing (CCIP), 2015 International Conference on
  • Conference_Location
    Noida
  • Type
    conf
  • DOI
    10.1109/CCIP.2015.7100741
  • Filename
    7100741