DocumentCode
2402909
Title
Filtering Harmful Sentences Based on Multiple Word Co-occurrence
Author
Ando, Satoshi ; Fujii, Yutaro ; Ito, Takayuki
fYear
2010
fDate
18-20 Aug. 2010
Firstpage
581
Lastpage
586
Abstract
Bulletin board systems (BBSs) and social network services (SNSs) have been increasing in recent years. In such systems, users can easily upload and share their own information via personal computers and cellular phones. However, some information, such as adult content, is not appropriate for all users, notably children. Many companies that provide SNSs and BBSs have been trying to monitor and check harmful information posted by their users. At present, these companies manually check users' sentences before they are published. For these companies, even partial automation of such monitoring tasks would reduce the large labor cost. Motivated by this background, in this paper we focus on filtering harmful text information. We have built a program that calculates three-word co-occurrences, and we demonstrate experimentally that three-word co-occurrence is more useful than two-word co-occurrence.
Keywords
Internet; information filtering; social networking (online); BBS; SNS; bulletin board systems; filtering harmful sentences; harmful text information filtering; multiple word cooccurrence; social network services; Bayesian methods; Companies; Context; Databases; Filtering; Postal services; Software; Classification; Co-occurrence; Text Filtering; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer and Information Science (ICIS), 2010 IEEE/ACIS 9th International Conference on
Conference_Location
Yamagata
Print_ISBN
978-1-4244-8198-9
Type
conf
DOI
10.1109/ICIS.2010.96
Filename
5590998