• DocumentCode
    442049
  • Title
    An unsupervised & statistical word sense tagging using bilingual sources
  • Author
    Oliveira, Francisco ; Wong, Fai ; Li, Yi-ping
  • Author_Institution
    Fac. of Sci. & Technol., Univ. of Macau, Macau
  • Volume
    6
  • fYear
    2005
  • fDate
    18-21 Aug. 2005
  • Firstpage
    3749
  • Abstract
    This paper presents an approach for choosing the correct translation of an ambiguous word in a given sentence. Unsupervised learning is applied, and a non-aligned Portuguese-to-Chinese bilingual corpus is used to disambiguate word senses. Relationships between words are identified by considering each word's surrounding words and their relative distances, in order to handle syntactic relationships. All related words are then translated into the target language to find the correct senses of the ambiguous words. Selection is based on a statistical and mathematical model that assigns a score to each of the senses identified previously. After all the senses have been discovered, their semantic and syntactic information is converted into a set of rules and stored in a database for later use in the disambiguation process. Preliminary experimental results show that the proposed method achieves an improvement of 6% over the baseline method in assigning the correct translation.
  • Keywords
    dictionaries; language translation; linguistics; statistical analysis; unsupervised learning; word processing; ambiguous word translation; bilingual dictionary; machine translation; nonaligned bilingual Portuguese-to-Chinese bilingual corpus; statistical word sense tagging; Costs; Databases; Dictionaries; Flip-flops; Labeling; Mathematical model; Natural language processing; Natural languages; Tagging; Unsupervised learning; Machine Translation; Word Sense Tagging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
  • Conference_Location
    Guangzhou, China
  • Print_ISBN
    0-7803-9091-1
  • Type
    conf
  • DOI
    10.1109/ICMLC.2005.1527592
  • Filename
    1527592