Title :
Partially supervised learning for radical opinion identification in hate group web forums
Author :
Yang, Ming ; Chen, Hsinchun
Author_Institution :
Dept. Manage. Sci. & Eng., Harbin Inst. of Technol., Harbin, China
Abstract :
Web forums are frequently used as platforms for the exchange of information and opinions, as well as for propaganda dissemination. Online content can be misused, however, when the information being distributed, such as radical opinions, is unsolicited or inappropriate. Radical opinion is well hidden and scattered across Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time-consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content from Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high-quality positive and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.
Keywords :
information dissemination; learning (artificial intelligence); pattern classification; social networking (online); classification systems training; hate group Web forums; nonradical content; propaganda dissemination; radical opinion identification; topic-sensitive partially supervised learning approach; unlabeled datasets; FCC; Feature extraction; Labeling; Supervised learning; Support vector machines; Text categorization; Training; Web forum; document classification; opinion mining; partially supervised learning
Conference_Title :
Intelligence and Security Informatics (ISI), 2012 IEEE International Conference on
Conference_Location :
Arlington, VA
Print_ISBN :
978-1-4673-2105-1
DOI :
10.1109/ISI.2012.6284099