DocumentCode
3756099
Title
Building a Corpus for Arabic Dialects Using Games with a Purpose
Author
Maya Osman;Caroline Sabty;Nada Sharaf;Slim Abdennadher
Author_Institution
German Univ. in Cairo, Cairo, Egypt
fYear
2015
fDate
4/1/2015 12:00:00 AM
Firstpage
21
Lastpage
25
Abstract
There is a large gap between the written form of Arabic, Modern Standard Arabic (MSA), and the spoken Arabic dialects, owing to the large number of dialects. In addition, most Arabic datasets are built for MSA content. Traditional ways of identifying the dialect of a text are time-consuming and costly. Furthermore, due to the morphological complexity of Arabic, the gender of the speaker may change the structure of an Arabic sentence. Thus, dialects hold rich information (such as the origin of the speaker and the gender of the addressee). A Game With A Purpose (GWAP) called "3ammeya" is implemented to identify the dialects of Arabic sentences along with their MSA translations. Moreover, through the game, the genders of the speaker and the addressee are classified. The collected data will help construct an expandable and inexpensive corpus for dialect identification and translation to MSA.
Keywords
"Games","Pragmatics","Standards","Computers","Africa","Bridges","Crowdsourcing"
Publisher
IEEE
Conference_Titel
Arabic Computational Linguistics (ACLing), 2015 First International Conference on
Print_ISBN
978-1-4673-9154-2
Type
conf
DOI
10.1109/ACLing.2015.10
Filename
7422275