DocumentCode
259116
Title
Efficient data transfer scheme using word-pair-encoding-based compression for large-scale text-data processing
Author
Waidyasooriya, Hasitha Muthumala ; Ono, Daisuke ; Hariyama, Masanori ; Kameyama, Michitaka
Author_Institution
Grad. Sch. of Inf. Sci., Tohoku Univ., Sendai, Japan
fYear
2014
fDate
17-20 Nov. 2014
Firstpage
639
Lastpage
642
Abstract
Large-scale data processing is common in many fields, such as data mining and genome mapping. To accelerate such processing, graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) are used. However, the long data transfer time between the accelerator and the host computer is a major performance bottleneck. In this paper, we use a word-pair-encoding method to compress the data down to 25% of its original size. The encoded data can be decoded from any position without decoding the whole data file. For some algorithms, the encoded data can be processed without decoding. Using text search based on the Burrows-Wheeler algorithm, we show that the data volume and transfer time can be reduced by over 70%.
Keywords
data compression; data mining; encoding; field programmable gate arrays; graphics processing units; text analysis; Burrows-Wheeler algorithm based text search; FPGA; GPU; data transfer scheme; data-mining; encoded data; field-programmable gate-array; genome mapping; graphic accelerator units; large-scale text-data processing; performance bottleneck; word-pair-encoding-based compression; Arrays; Bioinformatics; Data compression; Data transfer; Encoding; Genomics; Graphics processing units; Succinct data structures; big data
fLanguage
English
Publisher
ieee
Conference_Titel
Circuits and Systems (APCCAS), 2014 IEEE Asia Pacific Conference on
Conference_Location
Ishigaki
Type
conf
DOI
10.1109/APCCAS.2014.7032862
Filename
7032862