Title : 
A fast lossless compression algorithm for Arabic textual images
         
        
        
            Author_Institution : 
Image Process. & Multimedia Lab., Univ. of Northern British Columbia, Prince George, BC, Canada
         
        
        
        
        
        
            Abstract : 
In recent years, an unparalleled volume of textual information has been transported over the Internet via email, chatting, blogging, twittering, digital libraries, and information retrieval systems. As text data has come to exceed 40% of total Internet traffic, compressing textual data has become imperative. Many algorithms have been introduced and employed for this purpose, including Huffman encoding, arithmetic encoding, the Ziv-Lempel family, Dynamic Markov Compression, and the Burrows-Wheeler Transform. In this paper, a novel algorithm for compressing textual images is presented. The algorithm comprises two parts: (i) a fixed-to-variable codebook; and (ii) a row and column reduction coding scheme (RCRC). Simulation results on a large number of Arabic textual images show that this algorithm achieves a compression ratio of approximately 87%, which exceeds published results, including those of JBIG2.
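The abstract names a row and column reduction step but does not spell out its details. The sketch below shows one plausible reading, offered as an assumption rather than as the authors' method: in a binary block, any row identical to the row directly above it is dropped and recorded in a row reference vector, and the same is then done for columns. The function names (`reduce_rows`, `expand_rows`, `rcrc_encode`, `rcrc_decode`) are hypothetical, and the fixed-to-variable codebook stage is not modeled.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def reduce_rows(block: Matrix) -> Tuple[List[int], Matrix]:
    """Drop each row that is identical to the row directly above it.

    Returns (ref, kept): ref[i] == 1 means row i was kept,
    ref[i] == 0 means row i repeats the row above it.
    ASSUMPTION: this rule is an illustrative guess, not taken from the paper.
    """
    ref: List[int] = []
    kept: Matrix = []
    for i, row in enumerate(block):
        if i > 0 and row == block[i - 1]:
            ref.append(0)
        else:
            ref.append(1)
            kept.append(list(row))
    return ref, kept


def expand_rows(ref: List[int], kept: Matrix) -> Matrix:
    """Invert reduce_rows by repeating the previous row wherever ref is 0."""
    out: Matrix = []
    remaining = iter(kept)
    for flag in ref:
        if flag == 1:
            out.append(list(next(remaining)))
        else:
            out.append(list(out[-1]))
    return out


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns of a rectangular matrix."""
    return [list(col) for col in zip(*m)]


def rcrc_encode(block: Matrix) -> Tuple[List[int], List[int], Matrix]:
    """Reduce rows first, then columns (hypothetical helper)."""
    row_ref, rows_kept = reduce_rows(block)
    col_ref, cols_kept = reduce_rows(transpose(rows_kept))
    return row_ref, col_ref, transpose(cols_kept)


def rcrc_decode(row_ref: List[int], col_ref: List[int], reduced: Matrix) -> Matrix:
    """Rebuild the original block: restore columns, then rows."""
    rows_kept = transpose(expand_rows(col_ref, transpose(reduced)))
    return expand_rows(row_ref, rows_kept)


block = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
row_ref, col_ref, reduced = rcrc_encode(block)
restored = rcrc_decode(row_ref, col_ref, reduced)
```

Under this reading, the 4x4 block above shrinks to a 2x2 block plus two 4-bit reference vectors, and decoding restores it exactly, which is the lossless property the abstract claims. Textual images tend to contain long runs of repeated rows and columns (margins, line gaps), which is why a scheme of this kind could pay off.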
         
        
            Keywords : 
Huffman codes; Internet; Markov processes; arithmetic codes; data compression; digital libraries; document image processing; electronic mail; information retrieval systems; natural language processing; social networking (online); text analysis; Arabic textual images; Burrows-Wheeler transform; Huffman encoding; Internet traffic; JBIG2; RCRC; Ziv-Lempel family; arithmetic encoding; blogging; chatting; column reduction coding scheme; compression ratio; digital library; dynamic Markov compression; email; fixed-to-variable codebook; information retrieval systems; lossless compression algorithm; row reduction coding scheme; text data; textual data compression; textual image compression; textual information; twittering; unparalleled volume; Algorithm design and analysis; Conferences; Entropy; Image coding; Matrix converters; Morphology; Vectors; Arabic text compression; binary image compression; entropy; written text compression;
         
        
        
        
            Conference_Title : 
Signal and Image Processing Applications (ICSIPA), 2011 IEEE International Conference on
         
        
            Conference_Location : 
Kuala Lumpur
         
        
            Print_ISBN : 
978-1-4577-0243-3
         
        
        
            DOI : 
10.1109/ICSIPA.2011.6144069