DocumentCode :
3025787
Title :
Text Clustering by 2D Cellular Automata Based on the N-Grams
Author :
Hamou, Reda Mohamed ; Lehireche, Ahmed ; Lokbani, Ahmed Chaouki ; Rahmani, Mohamed
Author_Institution :
EEDIS Lab., Evolutionary Eng. & Distrib. Inf. Syst. Lab., Univ. Dr Tahar MOULAY of Saida, Saida, Algeria
fYear :
2010
fDate :
23-24 Oct. 2010
Firstpage :
271
Lastpage :
277
Abstract :
In this article we present a 2D cellular automaton (Class_AC) for a text mining problem, namely unsupervised classification (clustering). Before experimenting with the cellular automaton, we vectorized our data by indexing text documents from the Reuters-21578 database using the N-gram approach. The cellular automaton proposed in this paper is a grid of cells with a planar neighborhood arising from this structure. Three transition functions were used to update the automaton, in which each cell takes one of four states. The results show that the parallel-computing virtual machine (Class_AC) effectively groups similar documents to within a threshold. Section 1 gives an introduction, Section 2 presents the representation of texts based on N-grams, Section 3 describes the cellular automaton for clustering, Section 4 presents the experiments and comparative results, and Section 5 gives conclusions and perspectives.
Keywords :
cellular automata; indexing; parallel processing; pattern clustering; text analysis; 2D cellular automata; N-grams; data indexing; text clustering; textual documents; unsupervised classification; virtual machine parallel computing; Automata; Biological system modeling; Classification algorithms; Entropy; Laboratories; Support vector machine classification; Text mining; Cellular Automata; Data classification; biomimetic methods; clustering and segmentation; data mining; unsupervised classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cryptography and Network Security, Data Mining and Knowledge Discovery, E-Commerce & Its Applications and Embedded Systems (CDEE), 2010 First ACIS International Symposium on
Conference_Location :
Qinhuangdao
Print_ISBN :
978-1-4244-9595-5
Type :
conf
DOI :
10.1109/CDEE.2010.60
Filename :
5759331