DocumentCode
2971341
Title
Analyzing the performance differences between pattern matching and compressed pattern matching on texts
Author
Erdogan, Can ; Nusret Bulus, H. ; Diri, B.
fYear
2013
fDate
7-9 Nov. 2013
Firstpage
135
Lastpage
138
Abstract
This study compares the statistics of pattern matching on text data with those of compressed pattern matching on the compressed form of the same text. A new application was developed to count the number of character matches in compressed and uncompressed texts separately. The paper also presents a new text compression algorithm that allows compressed pattern matching with classical pattern matching algorithms, used without any modification. The presented algorithm, based on digram and trigram substitution, achieves a compression factor of about 30-35%, and compressed pattern matching on the compressed text takes less time than pattern matching on the uncompressed text. The number of character comparisons made during compressed pattern matching is likewise confirmed to be lower than the number made on the uncompressed text. The aim of the developed compression algorithm is thus to highlight the difference in text processing between compressed and uncompressed text and to inform other applications.
Keywords
data compression; pattern matching; statistical analysis; text analysis; character matching numbers; classical pattern matching algorithms; compressed pattern matching; compression factor; digram substitution; statistics; text compression algorithm; text data; text processing; trigram substitution; uncompressed texts; Compression algorithms; Dictionaries; Encoding; Force; Indexes; Pattern Substitution
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics, Computer and Computation (ICECCO), 2013 International Conference on
Conference_Location
Ankara
Type
conf
DOI
10.1109/ICECCO.2013.6718247
Filename
6718247