DocumentCode :
2941322
Title :
Random access from compressed datasets with perfect value hashing
Author :
Miller, John M.
Author_Institution :
Microsoft Corp., Redmond, WA, USA
fYear :
1995
fDate :
17-22 Sep 1995
Firstpage :
454
Abstract :
A representation technique is presented allowing for quick access of individual records from a static compressed dataset. Given a collection of key-record pairs, the representation allows the appropriate short record to be returned for any given key. The approach is a generalization of perfect address hashing. The new approach, called perfect value hashing, uses a carefully chosen pseudo-random number generator to directly produce the correct record for any key in the dataset. This contrasts with address hashing where the random number provides an address which is then used to recover the record from a separate table. Value hashing doesn´t have the theoretical limitations of address hashing, and in practice is more space efficient for records of size less than 36 bits. Value hashing has the added benefit (important when the records are encoded for compression) that variable length records can be represented without an increase in the size of the encoded records. This new technique was used to provide random access from a highly compressed spelling dictionary
Keywords :
data compression; data structures; encoding; random number generation; random processes; compressed datasets; compressed spelling dictionary; encoded records; individual records access; key-record pairs; perfect address hashing; perfect value hashing; pseudorandom number generator; random access; representation technique; variable length records; Books; Broadcasting; Cost function; Databases; Dictionaries;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Information Theory, 1995. Proceedings., 1995 IEEE International Symposium on
Conference_Location :
Whistler, BC
Print_ISBN :
0-7803-2453-6
Type :
conf
DOI :
10.1109/ISIT.1995.550441
Filename :
550441
Link To Document :
بازگشت