Author :
Ando, Satoshi ; Fujii, Yutaro ; Ito, Takayuki
Abstract :
BBSs (Bulletin Board Systems) and SNSs (Social Network Services) have been increasing in recent years. In such systems, users can easily upload and share their own information via personal computers and cellular phones. However, some information, such as adult content, is not appropriate for all users, notably children. Many companies that provide SNSs and BBSs have been trying to monitor and check harmful information posted by their users. At present, these companies manually check users' sentences before they are published. For these companies, even partial automation of such monitoring tasks would reduce the large labor cost. Motivated by this background, this paper focuses on filtering harmful text information. We have built a program that calculates three-word co-occurrences, and we demonstrate experimentally that three-word co-occurrence is more useful than two-word co-occurrence.
Keywords :
Internet; information filtering; social networking (online); BBS; SNS; bulletin board systems; filtering harmful sentences; harmful text information filtering; multiple word co-occurrence; social network services; Bayesian methods; Companies; Context; Databases; Filtering; Postal services; Software; Classification; Co-occurrence; Text Filtering; Text mining