Title :
Big data clustering validity
Author :
Tlili, Mania ; Hamdani, Tarek M.
Author_Institution :
REGIM Lab.: Res. Groups on Intell. Machines, Univ. of Sfax, Sfax, Tunisia
Abstract :
We now communicate in a digital universe in which the amount of data, both structured and unstructured, is exploding; this is what we call Big Data. Such voluminous data are in most cases noisy and overlapping, which makes clustering them a critical challenge, and validating the resulting partitions is a serious problem in its own right. In this paper we present a new fuzzy validity index that identifies the best partition produced by Big Data clustering. Called Fuzzy Validity Index with Noise-Overlap Separation (FVINOS), this technique provides a sufficient interpretation of the properties of Big Data by detecting the overall geometric structure within and between clusters. The main contribution of FVINOS is to define a validation measure for both crisp and fuzzy clustering that takes into account the structure of Big Data sets.
Keywords :
Big Data; fuzzy set theory; pattern clustering; Big Data clustering validity; FVINOS; crisp-fuzzy clustering validation; fuzzy validity index-with-noise-overlap separation; geometric structure detection; noisy-overlapping data; structured data; unstructured data; Classification algorithms; Clustering algorithms; Indexes; Noise; Optimized production technology; Partitioning algorithms; Clustering; Overlap; Validity indexes
Conference_Title :
Soft Computing and Pattern Recognition (SoCPaR), 2014 6th International Conference of
Conference_Location :
Tunis
DOI :
10.1109/SOCPAR.2014.7008031