DocumentCode :
3362954
Title :
A Parallel Comparator of Documents
Author :
Ksouri, Sonia Alouane ; Hidri, Minyar Sassi ; Barkaoui, Kamel
Author_Institution :
Ecole Nat. d'Ingénieurs de Tunis, Univ. Tunis El Manar, Tunis, Tunisia
fYear :
2013
fDate :
26-30 Aug. 2013
Firstpage :
48
Lastpage :
52
Abstract :
Document, sentence and word clustering are well-studied problems. Most existing algorithms cluster documents, sentences and words separately rather than simultaneously. However, when analyzing large textual corpora, the amount of data that can be processed on a single machine is usually limited by the available main memory, and growth in the data to be analyzed increases the computational workload. In this paper we present a parallel fuzzy triadic similarity measure, called PFT-Sim, which calculates fuzzy memberships in the context of document co-clustering on a parallel programming architecture. It simultaneously computes fuzzy co-similarity matrices between documents/sentences and sentences/words, each built on the basis of the others. The PFT-Sim model provides a parallel data analysis strategy and divides the similarity computing task into parallel sub-tasks to tackle efficiency and scalability problems.
Keywords :
data analysis; fuzzy set theory; matrix algebra; parallel programming; pattern clustering; text analysis; PFT-Sim; document co-clustering; fuzzy co-similarity matrices; fuzzy memberships; large textual corpuses; parallel comparator; parallel data analysis strategy; parallel fuzzy triadic similarity measure; parallel programming architecture; parallel sub-tasks; sentence clustering; similarity computing task; words clustering; Clustering algorithms; Complexity theory; Computational modeling; Computer architecture; Data models; Parallel processing; Text mining; Document co-clustering; Fuzzy sets; Parallel computing; Text Mining; Three-partite graph; multi-thread architecture;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Database and Expert Systems Applications (DEXA), 2013 24th International Workshop on
Conference_Location :
Los Alamitos, CA
ISSN :
1529-4188
Print_ISBN :
978-0-7695-5070-1
Type :
conf
DOI :
10.1109/DEXA.2013.13
Filename :
6621344