Title :
Efficient lossless compression of trees and graphs
Author :
Chen, Shenfeng ; Reif, John H.
Author_Institution :
Dept. of Comput. Sci., Duke Univ., Durham, NC, USA
Abstract :
Summary form only given, as follows. Data compression algorithms have been widely used in many areas to meet the demand of storage and transfer of large size data. Most of the data compression algorithms regard the input as a sequence of binary numbers and represent the compressed data also as a binary sequence. However, in many areas such as programming languages (e.g. LISP and C) and compiler design, it is more desirable to have a compression algorithm which compresses a data structure while keeping a similar structure of the original data in the compressed form. In addition to reducing storage space, such compression also has the benefit of efficiently executing various operations (e.g. searching) in the compressed form. We study the problem of compressing a non-binary data structure (e.g. tree, undirected and directed graphs) in an efficient way while keeping a similar structure in the compressed form. To date, there has been no proven optimal algorithm for this problem. We use the idea of building LZW tree in LZW compression to compress a binary tree generated by a stationary ergodic source. The tree is parsed into subtrees using breadth first search and an optimal dictionary is constructed with each index pointing to a distinct subtree. We replace the parsed subtrees by dictionary indices to formed the compressed tree. We also extend our tree compression algorithm to compress undirected and directed acyclic graphs
Keywords :
data compression; directed graphs; tree data structures; tree searching; LZW compression; LZW tree; acyclic graphs; binary numbers; binary sequence; binary tree; breadth first search; compiler design; data compression algorithms; data storage; data structure; data transfer; dictionary indices; directed graphs; nonbinary data structure compression; optimal dictionary; parsed subtrees; programming languages; stationary ergodic source; undirected graphs; Algorithm design and analysis; Binary sequences; Buildings; Compression algorithms; Computer languages; Data compression; Data structures; Dictionaries; Tree data structures; Tree graphs;
Conference_Titel :
Data Compression Conference, 1996. DCC '96. Proceedings
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-7358-3
DOI :
10.1109/DCC.1996.488356