DocumentCode :
3721175
Title :
Language tweet characteristics of Indonesian citizens
Author :
Ahmad Fathan Hidayatullah
Author_Institution :
Department of Informatics, Universitas Islam Indonesia (UII), Yogyakarta, Indonesia
fYear :
2015
Firstpage :
397
Lastpage :
401
Abstract :
Indonesia is a large country with thousands of islands and hundreds of languages and dialects. This diversity shapes people's habits and behaviour, including their activity on social media. Twitter and other social media platforms impose no language rules on users, so people write freely, without any constraints, when posting tweets. Broadly, five types of writing appear in the dataset: tweets written in the standard form of Bahasa, tweets mixing Bahasa with a local language, tweets mixing Bahasa with a foreign language, tweets containing abbreviations, and tweets containing slang words. Moreover, this investigation found sixteen characteristics of Indonesian tweets, some of which are combinations of the five writing styles. Based on an understanding of these writing-style characteristics in Twitter messages, we propose an algorithm for the pre-processing step that converts non-standard words into their standard form in Bahasa Indonesia.
Keywords :
"Twitter","Media","Standards","Pragmatics","Dictionaries","Speech"
Publisher :
IEEE
Conference_Title :
Science and Technology (TICST), 2015 International Conference on
Type :
conf
DOI :
10.1109/TICST.2015.7369393
Filename :
7369393