• DocumentCode
    515405
  • Title
    A statistical approach on Persian word sense disambiguation
  • Author
    Soltani, Mahmood ; Faili, Heshaam
  • Author_Institution
    Dept. of ECE, Univ. of Tehran, Tehran, Iran
  • fYear
    2010
  • fDate
    28-30 March 2010
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    This article studies different aspects of a new approach for resolving lexical ambiguities using statistical information gained from a monolingual corpus. The proposed approach addresses the problem of target word selection in a machine translation system. The method is an unsupervised graph-based approach that uses a bilingual dictionary to find all possible translations of each ambiguous word in the source sentence (English) and then chooses the most appropriate alternative according to statistical information gathered from target-language (Persian) corpora. Two new methods for measuring semantic similarity, based on source- and target-language corpora, are also introduced. The experiments show that the unsupervised graph-based WSD, which uses the proposed semantic similarity measures in the dependency graph, significantly outperforms all other methods on WSD for translating English words into Persian.
  • Keywords
    language translation; natural language processing; statistical analysis; Persian word sense disambiguation; bilingual dictionary; machine translation system; monolingual corpus; semantic similarity; statistical approach; target language corpora; unsupervised graph-based approach; Bioinformatics; Computational linguistics; Dictionaries; Humans; Information retrieval; Mutual information; Natural language processing; Natural languages; Semantic Web; Text mining; Centrality Algorithms; Mutual Information; Persian Language; Word Sense Disambiguation; graph;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatics and Systems (INFOS), 2010 The 7th International Conference on
  • Conference_Location
    Cairo
  • Print_ISBN
    978-1-4244-5828-8
  • Type
    conf
  • Filename
    5461799