DocumentCode :
2061564
Title :
Dynamic word based text compression
Author :
Ng, K.S. ; Cheng, L.M. ; Wong, C.H.
Author_Institution :
Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
Volume :
1
fYear :
1997
fDate :
18-20 Aug 1997
Firstpage :
412
Abstract :
We propose a dynamic text compression technique with a back-searching algorithm and a new storage protocol. The codes produced by the encoder are divided into three types: copy, literal, and hybrid codes. Multiple dictionaries are adopted, each with a linked sub-dictionary. Each dictionary holds a portion of predefined words, i.e. the most frequent words, and the remaining entries depend on the message. A hashing function developed by Pearson (1990) is adopted for two purposes: first, to initialize the dictionary; second, to search quickly for a particular word. With this scheme, the spaces between words need not be encoded. On the decoding side, a space character is appended after each word is decoded, so the redundancy of spaces is also compressed. The results show that the original message is not expanded even with a poor dictionary design.
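The hashing and space-handling ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup table is a seeded shuffle rather than the table from Pearson's paper, the single-level bucket layout and the function names (`build_dictionary`, `encode`, `decode`) are hypothetical, and hybrid codes, sub-dictionaries, and the back-searching step are left out.

```python
import random

# Assumption: any permutation of 0..255 can serve as the lookup table.
# This one is a seeded shuffle, not the table published by Pearson.
_rng = random.Random(1997)
TABLE = list(range(256))
_rng.shuffle(TABLE)


def pearson_hash(word: str) -> int:
    """Return an 8-bit Pearson hash of the word's UTF-8 bytes."""
    h = 0
    for byte in word.encode("utf-8"):
        h = TABLE[h ^ byte]
    return h


def build_dictionary(words):
    """Place each word in the bucket selected by its hash (hypothetical layout)."""
    buckets = [[] for _ in range(256)]
    for w in words:
        bucket = buckets[pearson_hash(w)]
        if w not in bucket:
            bucket.append(w)
    return buckets


def encode(text, buckets):
    """Turn space-separated words into copy or literal codes; spaces are dropped."""
    codes = []
    for w in text.split(" "):
        h = pearson_hash(w)
        if w in buckets[h]:
            # Copy code: the word is identified by its bucket and position.
            codes.append(("copy", h, buckets[h].index(w)))
        else:
            # Literal code: the word itself is carried in the output.
            codes.append(("literal", w))
    return codes


def decode(codes, buckets):
    """Rebuild the text, appending a space after every decoded word."""
    out = []
    for code in codes:
        word = buckets[code[1]][code[2]] if code[0] == "copy" else code[1]
        out.append(word + " ")
    return "".join(out)
```

Because the decoder appends a space after every word, the output of this sketch ends with one trailing space; a real scheme would need some convention for the final word and for runs of multiple spaces.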
Keywords :
backtracking; data compression; document image processing; file organisation; glossaries; image coding; memory protocols; search problems; back searching algorithm; copy codes; decoding; dictionaries; dynamic word based text compression; encoding; hashing function; hybrid codes; literal codes; message; redundancy; space character; storage protocol; Data compression; Decoding; Dictionaries; Probability; Protocols; Road transportation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
Type :
conf
DOI :
10.1109/ICDAR.1997.619880
Filename :
619880