DocumentCode :
172512
Title :
NormAPI: An API for normalizing Filipino shortcut texts
Author :
Nocon, Nicco ; Cuevas, Gems ; Magat, Darwin ; Suministrado, Peter ; Cheng, C.-C.
Author_Institution :
Coll. of Comput. Studies, De La Salle Univ. Manila, Manila, Philippines
fYear :
2014
fDate :
20-22 Oct. 2014
Firstpage :
207
Lastpage :
210
Abstract :
As the number of Internet and mobile phone users grows, texting and chatting have become popular means of communication. The extensive use of cellphones and the Internet has led to the creation of a new language, in which words are transformed and shortened using various styles. Shortcut texting is used in informal venues such as SMS, online chat rooms, forums, and posts on social networks. Huge amounts of data originating from these informal sources can be utilized for various tasks in machine learning and data analytics. Because these data may be written in shortcut forms, text normalization is necessary before NLP tasks such as information extraction, data mining, text summarization, opinion classification, and even bilingual translation can be fully achieved; it acts as a preprocessing stage that transforms informal texts back to their original, more understandable forms. This paper presents NormAPI, an API for normalizing Filipino shortcut texts. NormAPI is primarily intended to be used as a preprocessing system that corrects informalities in shortcut texts before they are handed off for complete data processing.
Keywords :
application program interfaces; learning (artificial intelligence); natural language processing; pattern classification; text analysis; Filipino shortcut texts; NLP action; NormAPI; application program interface; bilingual translation; chatting communication; data analytics; data mining; information extraction; machine learning; natural language processing; opinion classification; social networks; text normalization; text summarization; texting communication; Computational linguistics; Context; Dictionaries; Educational institutions; Face; Internet; Training; Dictionary Substitution; Filipino; Normalization; Preprocessing; Shortcut texts; Statistical Machine Translation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asian Language Processing (IALP), 2014 International Conference on
Conference_Location :
Kuching
Type :
conf
DOI :
10.1109/IALP.2014.6973494
Filename :
6973494