Title : 
Evolving alphabet using genetic algorithms
         
        
            Author : 
Platos, Jan ; Kromer, Pavel
         
        
            Author_Institution : 
Dept. of Comput. Sci., VSB-Tech. Univ. of Ostrava, Ostrava Poruba, Czech Republic
         
        
        
        
        
        
            Abstract : 
Data compression algorithms are usually designed to process data symbol by symbol. Symbols are usually characters or bytes, but other choices are possible. The best-known alternative is to use syllables or words as symbols. Another approach is to take 2-grams, 3-grams, or general n-grams as symbols. All of these approaches have pros and cons, but none of them is the best for every file. This paper describes an approach for evolving an alphabet from characters and 2-grams that is optimal for the text file being compressed. The efficiency of the approach is tested on three compression algorithms.
         
        
            Keywords : 
data compression; formal languages; genetic algorithms; text analysis; alphabet evolution; compressed text files; data compression algorithms; data processing; n-grams; syllables; Biological cells; Compression algorithms; Dictionaries; Encoding; Image coding; Burrows-Wheeler transformation; Huffman encoding; LZW; alphabet optimization; genetic algorithm
         
        
        
        
            Conference_Titel : 
Nature and Biologically Inspired Computing (NaBIC), 2011 Third World Congress on
         
        
            Conference_Location : 
Salamanca
         
        
            Print_ISBN : 
978-1-4577-1122-0
         
        
        
            DOI : 
10.1109/NaBIC.2011.6089652