DocumentCode :
3049100
Title :
Compressing the graph structure of the Web
Author :
Suel, Torsten ; Yuan, Jun
Author_Institution :
CIS Dept., Polytech. Univ., Brooklyn, NY, USA
fYear :
2001
fDate :
2001
Firstpage :
213
Lastpage :
222
Abstract :
A large amount of research has recently focused on the graph structure (or link structure) of the World Wide Web. This structure has proven to be extremely useful for improving the performance of search engines and other tools for navigating the Web. However, since the graphs in these scenarios involve hundreds of millions of nodes and even more edges, highly space-efficient data structures are needed to fit the data in memory. A first step in this direction was done by the DEC connectivity server, which stores the graph in compressed form. We describe techniques for compressing the graph structure of the Web, and give experimental results of a prototype implementation. We attempt to exploit a variety of different sources of compressibility of these graphs and of the associated set of URLs in order to obtain good compression performance on a large Web graph.
Keywords :
data compression; data structures; graph theory; information resources; network servers; search engines; DEC connectivity server; URL; World Wide Web; compression performance; graph structure compression; memory; prototype implementation; search engine performance; space-efficient data structures; Computational Intelligence Society; Data structures; Large-scale systems; Navigation; Prototypes; Search engines; Uniform resource locators; Web sites; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 2001. Proceedings. DCC 2001.
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
0-7695-1031-0
Type :
conf
DOI :
10.1109/DCC.2001.917152
Filename :
917152