Title :
Data compression using encrypted text
Author :
Franceschini, Robert ; Mukherjee, Amar
Author_Institution :
Dept. of Comput. Sci., Univ. of Central Florida, Orlando, FL, USA
Abstract :
We present an algorithm for text compression. The basic idea is to define a unique encryption, or signature, for each word in the dictionary by replacing certain characters in the word with the special character “*” while retaining a few characters so that the word is still retrievable. In any text encrypted this way, the most frequently used character is “*”, and standard compression algorithms can exploit this redundancy effectively. We advocate the following compression paradigm: given a compression algorithm A and a text T, we apply the same algorithm A to the encrypted text *T and retrieve the original text via a dictionary that maps the decompressed text *T back to the original text T. We report better results for the most widely used compression algorithms, such as Huffman, LZW, arithmetic coding, Unix compress, and GNU zip, with respect to a text corpus. The compression rates obtained with these algorithms are much better than those of the dictionary-based methods reported in the literature. One basic assumption of our algorithm is that the system has access to a dictionary of the words used in all the texts, along with a corresponding “cryptic” dictionary. The cost of this dictionary is amortized over the compression savings for all the text files handled by the organization. If two organizations wish to exchange information using our compression algorithm, they must share a common dictionary. We compare our methods with other dictionary-based methods and present future research problems.
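The paradigm described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how signatures are chosen, so the scheme below is an assumption made for illustration only: each word keeps as few of its own characters as are needed to make its same-length signature unique, and every other character becomes “*”. The function names (`build_star_dictionary`, `star_encode`, `star_decode`) are hypothetical, `zlib` stands in for the compression algorithm A, and the text is assumed to contain no literal “*”.

```python
import re
import zlib
from itertools import combinations


def _signature(word, used):
    """Return the first unused signature for `word`, keeping as few letters as possible.

    Assumed scheme (not taken from the paper): try 0 retained letters,
    then 1, then 2, ... until the signature is not already taken.
    """
    n = len(word)
    for keep in range(n + 1):
        for positions in combinations(range(n), keep):
            sig = "".join(c if i in positions else "*" for i, c in enumerate(word))
            if sig not in used:
                return sig
    raise ValueError(f"no free signature for {word!r}")


def build_star_dictionary(words):
    """Map each distinct word to a unique signature of the same length."""
    encode_map, used = {}, set()
    for word in words:
        if word not in encode_map:
            sig = _signature(word, used)
            encode_map[word] = sig
            used.add(sig)
    return encode_map


def star_encode(text, encode_map):
    """Replace dictionary words by their signatures; other tokens pass through."""
    return re.sub(r"[A-Za-z]+", lambda m: encode_map.get(m.group(0), m.group(0)), text)


def star_decode(text, decode_map):
    """Invert star_encode using the shared reverse dictionary."""
    return re.sub(r"[A-Za-z*]+", lambda m: decode_map.get(m.group(0), m.group(0)), text)


# Shared dictionary: both sender and receiver must hold the same mapping.
encode_map = build_star_dictionary(["the", "cat", "sat", "on", "mat"])
decode_map = {sig: word for word, sig in encode_map.items()}

text = "the cat sat on the mat"
encoded = star_encode(text, encode_map)      # "*** c** s** ** *** m**"

# Paradigm from the abstract: run compressor A on *T, then undo both steps.
compressed = zlib.compress(encoded.encode("utf-8"))
restored = star_decode(zlib.decompress(compressed).decode("utf-8"), decode_map)
```

In this sketch earlier dictionary entries receive signatures with more “*” characters, so placing frequent words first would increase the share of “*” in the encoded text. Whether that actually improves the compressed size depends on the compressor and the corpus, and the sketch makes no claim about it.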
Keywords :
cryptography; data compression; glossaries; word processing; common dictionary; compression rates; decompressed text; dictionary based methods; encrypted text; redundancy; special character; standard compression algorithms; text compression; text corpus; text files; Arithmetic; Bandwidth; Compression algorithms; Computer science; Costs; Cryptography; Data compression; Dictionaries; Explosions; Memory
Conference_Title :
Digital Libraries, 1996. ADL '96., Proceedings of the Third Forum on Research and Technology Advances in
Conference_Location :
Washington, DC
Print_ISBN :
0-8186-7403-2
DOI :
10.1109/ADL.1996.502523