A strategy for automatic moderation of a large data set of users comments

Author

Rodrigues Saude, Marcos ; de Medeiros Soares, Marcelo ; Gomes Basoni, Henrique ; Ciarelli, Patrick Marques ; Oliveira, Eunice

Author_Institution

Programa de Pos-Grad. em Inf. (PPGI), Univ. Fed. do Espirito Santo (UFES), Vitoria, Brazil

fYear

2014

fDate

15-19 Sept. 2014

Firstpage

1

Lastpage

7

Abstract

The increasing use of social media and Web 2.0 is drawing more people every day to participate and express their points of view on a variety of subjects. However, a large number of comments are offensive and sometimes politically incorrect, and so must be kept from appearing online. This is pushing service providers to be more careful with the content they publish in order to avoid legal claims. This work proposes the use of automatic classification techniques to identify harmless comments and allow only those to go online. We applied various data processing techniques, such as term weighting and dimensionality reduction. All these techniques were studied in order to design algorithms that closely mimic human decisions about the comments. The results indicate that we are able to mimic the experts' decisions 96.78% of the time on the data set used.

Keywords

pattern classification; social networking (online); Web 2.0; automatic classification techniques; data processing; data set moderation; dimensionality reduction; social media; term weighting; users comments; Equations; Genetic algorithms; Mathematical model; Measurement; Sociology; Statistics; Vectors; automatic moderation; feature selection; genetic algorithms

fLanguage

English

Publisher

IEEE

Conference_Titel

Computing Conference (CLEI), 2014 XL Latin American

Conference_Location

Montevideo

Type

conf

DOI

10.1109/CLEI.2014.6965181

Filename

6965181