DocumentCode :
2936755
Title :
An Efficient Approach for Building Compressed Full-Text Index for Structured Data
Author :
Liang, Jun ; Xiao, Lin ; Zhang, Di
Author_Institution :
Training Center of Electron. Inf., Beijing Union Univ., Beijing, China
fYear :
2009
fDate :
24-26 Nov. 2009
Firstpage :
59
Lastpage :
63
Abstract :
The self-index is a highly compressed, self-contained full-text index. It is designed for indexing plain text, both to reduce permanent storage and to improve search performance. Text, however, is usually more than a sequence of characters: it often has a specific internal structure. The data record, as a basic model of structured data, is therefore widely used to represent and organize such data. In this paper, we design and implement an approach to building a self-index for data records using text as the medium. Our approach indexes the data records through an intermediate text that accommodates aligned record fields by inserting delimiters between them. Through theoretical analysis, we give upper bounds on the permanent space required by our approach in the worst case. In addition, we report a series of experimental results that validate the correctness and efficiency of the proposed approach.
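The intermediate-text idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the delimiter characters, the function names `build_text` and `locate`, and the use of plain substring search in place of a real compressed self-index are all assumptions made for illustration, and field alignment is not modeled.

```python
# Hypothetical delimiters: ASCII unit and record separators,
# assumed never to occur inside a field value.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"


def build_text(records):
    """Flatten records into one text, with delimiters between fields and records."""
    for rec in records:
        for value in rec:
            if FIELD_SEP in value or RECORD_SEP in value:
                raise ValueError("field value contains a delimiter")
    # Each record is terminated by RECORD_SEP; fields are joined by FIELD_SEP.
    return "".join(FIELD_SEP.join(rec) + RECORD_SEP for rec in records)


def locate(text, pattern):
    """Return (record_no, field_no) for every occurrence of pattern in text.

    Plain str.find stands in for the self-index query; a real self-index
    would answer the same query without storing the text separately.
    """
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        prefix = text[:pos]
        # Records fully preceding the hit give the record number.
        record_no = prefix.count(RECORD_SEP)
        # Field delimiters since the last record boundary give the field number.
        record_start = prefix.rfind(RECORD_SEP) + 1
        field_no = prefix[record_start:].count(FIELD_SEP)
        hits.append((record_no, field_no))
        pos = text.find(pattern, pos + 1)
    return hits


records = [("Liang", "Beijing"), ("Xiao", "Seoul")]
text = build_text(records)
print(locate(text, "Seoul"))
```

Because the delimiters mark field and record boundaries inside the text, a position reported by a text search can be mapped back to a record and a field, which is what lets a text-only index serve structured data in this sketch.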
Keywords :
data structures; indexing; text analysis; compressed full-text index; data records; plain texts; self-contained full-text index; self-index; structured data; Buildings; Data analysis; Educational institutions; Entropy; Information technology; Performance analysis; Performance evaluation; Relational databases; Upper bound; data compression
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Sciences and Convergence Information Technology, 2009. ICCIT '09. Fourth International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-5244-6
Electronic_ISBN :
978-0-7695-3896-9
Type :
conf
DOI :
10.1109/ICCIT.2009.42
Filename :
5370552