• DocumentCode
    3756099
  • Title

    Building a Corpus for Arabic Dialects Using Games with a Purpose

  • Author

    Maya Osman;Caroline Sabty;Nada Sharaf;Slim Abdennadher

  • Author_Institution
    German Univ. in Cairo, Cairo, Egypt
  • fYear
    2015
  • fDate
    4/1/2015 12:00:00 AM
  • Firstpage
    21
  • Lastpage
    25
  • Abstract
    There is a wide gap between the written form of Arabic, Modern Standard Arabic (MSA), and the spoken Arabic dialects, owing to the large number of dialects. In addition, most Arabic datasets are built for MSA content. Traditional ways of identifying the dialect of a text are time-consuming and costly. Moreover, due to the morphological complexity of Arabic, the gender of the speaker may change the structure of an Arabic sentence. Thus, dialects hold rich information (such as the origin of the speaker and the gender of the addressee). A Game With A Purpose (GWAP) called "3ammeya" is implemented to identify the dialects of Arabic sentences along with their MSA translations. Through the game, the gender of the speaker and of the addressee is also classified. The collected data will help construct an expandable and cheap corpus for dialect identification and translation to MSA.
  • Keywords
    "Games","Pragmatics","Standards","Computers","Africa","Bridges","Crowdsourcing"
  • Publisher
    IEEE
  • Conference_Titel
    Arabic Computational Linguistics (ACLing), 2015 First International Conference on
  • Print_ISBN
    978-1-4673-9154-2
  • Type

    conf

  • DOI
    10.1109/ACLing.2015.10
  • Filename
    7422275