DocumentCode :
1801289
Title :
Data compression using encrypted text
Author :
Franceschini, Robert ; Mukherjee, Amar
Author_Institution :
Dept. of Comput. Sci., Univ. of Central Florida, Orlando, FL, USA
fYear :
1996
fDate :
13-15 May 1996
Firstpage :
130
Lastpage :
138
Abstract :
We present an algorithm for text compression. The basic idea is to define a unique encryption, or signature, for each word in the dictionary by replacing certain characters in the word with the special character “*” while retaining just enough characters that the word remains retrievable. In any encrypted text the most frequent character is “*”, and standard compression algorithms can exploit this redundancy effectively. We advocate the following compression paradigm: given a compression algorithm A and a text T, we apply A to the encrypted text *T and recover the original by using a dictionary that maps the decompressed text *T back to T. We report better results for the most widely used compression algorithms, such as Huffman, LZW, arithmetic coding, Unix compress, and GNU zip, on a text corpus. The compression rates obtained with these algorithms are much better than those of the dictionary-based methods reported in the literature. A basic assumption of our algorithm is that the system has access to a dictionary of the words used in all the texts, along with a corresponding “cryptic” dictionary. The cost of this dictionary is amortized over the compression savings for all the text files handled by the organization. If two organizations wish to exchange information using our compression algorithm, they must share a common dictionary. We compare our methods with other dictionary-based methods and present future research problems.
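The paradigm described in the abstract (encrypt T to *T, compress *T with algorithm A, then invert both steps through a shared dictionary) can be illustrated with a minimal Python sketch. The abstract does not say how signatures are chosen, so the greedy rule below (keep as few original characters as possible while staying unique) is an assumption made for illustration, not the authors' scheme. The function names are hypothetical, `zlib` merely stands in for algorithm A, and the input text is assumed to contain no literal “*”.

```python
import itertools
import re
import zlib


def build_cryptic_dictionary(words):
    """Map each word to a unique same-length signature made mostly of '*'.

    Assumed rule (not from the paper): try keeping 0 original characters,
    then 1, then 2, ... and take the first signature not already in use.
    """
    used = set()
    encode = {}
    for word in dict.fromkeys(words):  # drop duplicates, keep order
        n = len(word)
        for keep in range(n + 1):
            for positions in itertools.combinations(range(n), keep):
                sig = "".join(
                    c if i in positions else "*" for i, c in enumerate(word)
                )
                if sig not in used:
                    break  # free signature found
            else:
                continue  # nothing free at this 'keep' level, retain more
            break
        used.add(sig)
        encode[word] = sig
    decode = {sig: word for word, sig in encode.items()}
    return encode, decode


def star_encode(text, encode):
    """Replace every dictionary word in the text by its signature."""
    return re.sub(
        r"[A-Za-z]+", lambda m: encode.get(m.group(0), m.group(0)), text
    )


def star_decode(starred, decode):
    """Map signatures back to the original words."""
    return re.sub(
        r"[A-Za-z*]+", lambda m: decode.get(m.group(0), m.group(0)), starred
    )


def compress(text, encode):
    """Encrypt the text, then apply the stand-in compressor (zlib)."""
    return zlib.compress(star_encode(text, encode).encode("utf-8"))


def decompress(blob, decode):
    """Decompress, then recover the original text via the dictionary."""
    return star_decode(zlib.decompress(blob).decode("utf-8"), decode)
```

Both parties would need the same `encode`/`decode` tables, which mirrors the shared-dictionary requirement stated in the abstract. Words missing from the dictionary pass through unchanged in this sketch, whereas the paper assumes the dictionary covers every word in the texts.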
Keywords :
cryptography; data compression; glossaries; word processing; common dictionary; compression rates; data compression; decompressed text; dictionary based methods; encrypted text; redundancy; special character; standard compression algorithms; text compression; text corpus; text files; Arithmetic; Bandwidth; Compression algorithms; Computer science; Costs; Cryptography; Data compression; Dictionaries; Explosions; Memory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Libraries, 1996. ADL '96., Proceedings of the Third Forum on Research and Technology Advances in
Conference_Location :
Washington, DC
Print_ISBN :
0-8186-7403-2
Type :
conf
DOI :
10.1109/ADL.1996.502523
Filename :
502523