• DocumentCode
    3453290
  • Title

    A comparison of statistical measures for the automatic identification of Persian light verb constructions

  • Author

    Askarian, Narjes ; Fazly, Afsaneh ; Hamzeh, Ali

  • Author_Institution
    Dept. of Comput. Eng., Univ. of Shiraz, Shiraz, Iran
  • fYear
    2012
  • fDate
    2-3 May 2012
  • Firstpage
    479
  • Lastpage
    483
  • Abstract
    A multiword expression (MWE) is a combination of words with a meaning beyond the compositional combination of the part meanings. Light verb constructions (LVCs) are a type of MWE that are widely used in many languages, including English, Spanish, French, Japanese, Chinese, Urdu, and Persian, among others. An LVC consists of a semantically-light basic verb - such as take in English and gozâshtan (meaning 'to put') in Persian - combined with another word that can be an adjective, a prepositional phrase, or a noun. Examples of LVCs are take a walk in English, and ehteram gozâshtan in Persian (lit. put respect, meaning 'to respect'). In particular, most verbs in Persian are of the form of LVCs, and thus many linguistic studies have examined their properties. There is, however, not much computational work on the automatic identification and processing of Persian LVCs, despite its importance for the development of natural language processing systems, such as summarization and machine translation. In this study, we focus on the most common form of LVCs in Persian, in which a noun is combined with one of five commonly-used light verbs to form an LVC. Two standard measures of association are used as features of candidates as well as some linguistically-informed measures. We also propose a position-based fixedness measure and some translation-based measures based on the special properties of Persian LVCs and their translation to English. Our results show the good performance of the measures for identifying Persian LVCs.
  • Keywords
    language translation; natural language processing; statistical analysis; Chinese; English; French; Japanese; LVC; MWE; Persian; Spanish; Urdu; automatic Persian light verb construction identification; ehteram gozashtan; linguistically-informed measures; machine translation; multiword expression; natural language processing systems; position-based fixedness measure; semantically-light basic verb; statistical measures; take a walk; translation-based measures; Computational linguistics; Conferences; Educational institutions; Frequency measurement; Gravity; Pragmatics; Syntactics; Corpus-based statistical measures; Multiword expressions; Natural language processing; Persian Light verb constructions;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Intelligence and Signal Processing (AISP), 2012 16th CSI International Symposium on
  • Conference_Location
    Shiraz, Fars
  • Print_ISBN
    978-1-4673-1478-7
  • Type

    conf

  • DOI
    10.1109/AISP.2012.6313795
  • Filename
    6313795