Title :
Compression and data mining
Author :
Simovici, Dan A. ; Ping Chen ; Tong Wang ; Pletea, Dan
Author_Institution :
Univ. of Massachusetts Boston, Boston, MA, USA
Abstract :
Data compression plays an important role in data mining, both in assessing the minability of data and as a modality for evaluating similarities between complex objects. We focus on the compressibility of strings of symbols and on using compression to compute similarity in text corpora; we also propose a novel approach for assessing the quality of text summarization.
Keywords :
data compression; data mining; string matching; text analysis; text corpora; text summarization quality assessment; Abstracts; Compression algorithms; Correlation; Data mining; Sorting; Vectors; Vocabulary; Thue-Morse sequence; compression ratio; lemmatizing; lossless compression; stemming
Conference_Title :
Computing, Networking and Communications (ICNC), 2015 International Conference on
Conference_Location :
Garden Grove, CA
DOI :
10.1109/ICCNC.2015.7069404