Regional Information Center for Science and Technology - Sensibility estimation method for youth slang by using sensibility co-occurrence feature vector obtained from microblog

DocumentCode :

3734078

Title :

Sensibility estimation method for youth slang by using sensibility co-occurrence feature vector obtained from microblog

Author :

Kazuyuki Matsumoto;Minoru Yoshida;Kenji Kita

Author_Institution :

Faculty of Engineering, Tokushima University, Tokushima city, Japan

Year :

2015

Firstpage :

473

Lastpage :

478

Abstract :

Social networking sites such as Twitter give people more opportunities to express what they think or intend in short text. Short text often contains abbreviations such as "ASAP" or "joinus" as well as emoticons. Because these expressions are not registered in existing dictionaries, they are analyzed as unknown expressions, which can be a bottleneck for improving the accuracy of reputation analysis in text mining. Using context to cluster unknown words is a common approach; however, it usually requires a word segmentation step and is vulnerable to split errors on unknown expressions such as youth slang. In this paper, we propose a method that obtains the appropriate context even when unknown expressions cause split errors, and that estimates the sensibility expressed in the text. Because the obtained context vector has an enormous number of dimensions, we also propose a method that creates a feature vector based on the co-occurrence of sensibility words as a simple, low-dimensional representation. In an evaluation experiment, the proposed method showed a certain level of accuracy even with a small amount of training data.

Keywords :

"Dictionaries","Twitter","Context","Estimation","Feature extraction","Training data","Thesauri"

Publisher :

IEEE

Conference_Title :

Computer and Communications (ICCC), 2015 IEEE International Conference on

Print_ISBN :

978-1-4673-8125-3

Type :

conf

DOI :

10.1109/CompComm.2015.7387618

Filename :

7387618

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3734078