• DocumentCode
    3730289
  • Title

    Building a semantic role labelling system for Vietnamese

  • Author

    Thai-Hoang Pham;Xuan-Khoai Pham;Phuong Le-Hong

  • Author_Institution
    FPT University, Vietnam
  • fYear
    2015
  • Firstpage
    77
  • Lastpage
    84
  • Abstract
    Semantic role labelling (SRL) is a natural language processing task that detects and classifies the semantic arguments associated with the predicates of a sentence. It is an important step towards understanding the meaning of natural language. SRL systems exist for well-studied languages such as English, Chinese and Japanese, but no such system exists for Vietnamese. In this paper, we present the first SRL system for Vietnamese, with encouraging accuracy. We first demonstrate that a simple application of SRL techniques developed for English does not give good accuracy for Vietnamese. We then introduce a new algorithm for extracting candidate syntactic constituents, which is much more accurate than the node-mapping algorithm commonly used in the identification step. Finally, in the classification step, in addition to the common linguistic features, we propose novel and useful features for SRL. Our SRL system achieves an F1 score of 73.53% on the Vietnamese PropBank corpus. The system, including software and corpus, is available as an open source project, and we believe that it is a good baseline for the development of future Vietnamese SRL systems.
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management (ICDIM), 2015 Tenth International Conference on
  • Type

    conf

  • DOI
    10.1109/ICDIM.2015.7381877
  • Filename
    7381877