DocumentCode :
3460957
Title :
An efficient data compression scheme based on semi-adaptive Huffman coding for moderately large Chinese text files
Author :
Ong, Ghim Hwee ; Huang, Shell Ying
Author_Institution :
Dept. of Inf. Syst. & Comput. Sci., Nat. Univ. of Singapore, Singapore
fYear :
1995
fDate :
3-7 Jul 1995
Firstpage :
332
Lastpage :
336
Abstract :
This paper presents a data compression scheme for Chinese text files. Because the frequency distribution of Chinese ideograms is skewed, the Huffman coding method is adopted. By storing the Huffman tree in the coding table and representing the tree as a Zaks sequence, the algorithm significantly improves the compression results. The proposed method is evaluated by comparing its performance with that of three well-known compression algorithms and of an algorithm specially designed to compress the coding table. The algorithm should also be applicable to other ideogram-based or oriental language texts, and it has the potential to reduce the dictionary size in a bigram- or trigram-based semi-adaptive compression scheme for English texts.
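The idea of representing a Huffman tree as a Zaks sequence can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_huffman`, `zaks_encode`, `zaks_decode`) are hypothetical, and the bit convention used here (preorder traversal, 1 for an internal node, 0 for a leaf) is an assumption about the convention rather than something stated in the abstract. Under that convention, a Huffman tree with k leaves needs 2k - 1 bits for its shape, plus the list of symbols in leaf order.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman(text):
    """Build a Huffman tree from symbol frequencies in `text`.

    A leaf is a symbol (str); an internal node is a (left, right) tuple.
    """
    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(f, next(tiebreak), sym) for sym, f in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    return heap[0][2]


def zaks_encode(tree):
    """Return (bits, leaves): the preorder shape bits and the symbols in leaf order.

    Assumed convention: '1' marks an internal node, '0' marks a leaf.
    """
    bits, leaves = [], []

    def walk(node):
        if isinstance(node, tuple):
            bits.append("1")
            walk(node[0])
            walk(node[1])
        else:
            bits.append("0")
            leaves.append(node)

    walk(tree)
    return "".join(bits), leaves


def zaks_decode(bits, leaves):
    """Rebuild the tree from the shape bits and the symbols in leaf order."""
    bit_iter, leaf_iter = iter(bits), iter(leaves)

    def build():
        if next(bit_iter) == "1":
            left = build()
            right = build()
            return (left, right)
        return next(leaf_iter)

    return build()
```

In this sketch a decoder needs only the bit string and the ordered symbol list to recover every code word, so explicit codes or frequencies do not have to be stored in the coding table.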
Keywords :
Huffman codes; adaptive codes; data compression; Chinese ideograms; Chinese text files; Huffman tree; Zaks sequence; binary tree coding; data compression scheme; ideogram-based texts; oriental language texts; semi-adaptive Huffman coding; Algorithm design and analysis; Compression algorithms; Computer science; Data compression; Dictionaries; Encoding; Frequency; Huffman coding; Information systems; Natural languages
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Networks, 1995. Theme: Electrotechnology 2000: Communications and Networks. [in conjunction with the] International Conference on Information Engineering., Proceedings of IEEE Singapore International
Print_ISBN :
0-7803-2579-6
Type :
conf
DOI :
10.1109/SICON.1995.526073
Filename :
526073