Title :
GPU-Based PostgreSQL Extensions for Scalable High-Throughput Pattern Matching
Author :
Scott, G. ; England, M. ; Melkowski, K. ; Fields, Z. ; Anderson, D.T.
Author_Institution :
Center for Geospatial Intell., Univ. of Missouri, Columbia, MO, USA
Abstract :
Numerous fields require large-scale pattern matching to achieve a variety of computational goals. Herein, we present novel graphics processing unit (GPU) extensions that facilitate high-throughput pattern matching in a PostgreSQL database. We have developed an extension framework that performs data block processing of large pattern data sets, using a stream processing design that yields global k-nearest neighbor matches. The framework was specifically designed to support pattern matching on the GPU from within the database environment. This approach avoids the need to store an entire data set on GPU hardware, which facilitates significant scale-up of pattern databases. It also makes it possible to incorporate or exploit auxiliary (meta)data as part of the pattern matching process and to pipeline the results into traditional relational algebra expressions. By pipelining pattern matching results into a relational expression, the power of the database can be leveraged to build result sets based on various parameterized correlations between the query pattern(s) and the results. In this preliminary work, we have integrated GPU-based high-throughput p-norm metric functions into the database server. This allows one to design heterogeneous data processing techniques that combine large-scale content-based image retrieval (CBIR) with traditional data processing capabilities of the database, such as relational, spatial, or text search. We present timing characteristics for various pattern sizes and metric combinations, and we address the balancing of database and GPU parameterization. Our feature vector data sets range from 18 to 85 GB in database table storage size, reaching 100 million 128-dimensional vectors. We are able to efficiently execute global top-k searches from within the database.
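The block-streaming idea in the abstract can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the authors' GPU or PostgreSQL implementation: the names p_norm_distance and streaming_top_k are hypothetical, each block is assumed to be a list of (row_id, vector) pairs, and the per-block distance computation stands in for the step the paper runs on the GPU. The point it shows is that a global top-k can be maintained while only one block is held at a time.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[int, Sequence[float]]  # (row_id, feature vector); hypothetical layout


def p_norm_distance(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski (p-norm) distance between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def streaming_top_k(
    query: Sequence[float],
    blocks: Iterable[List[Row]],
    k: int,
    p: float = 2.0,
) -> List[Tuple[float, int]]:
    """Return the k nearest (distance, row_id) pairs over all blocks.

    Blocks are consumed one at a time, so the full data set never has to be
    resident at once. Only the running best-k list is kept between blocks.
    """
    best: List[Tuple[float, int]] = []
    for block in blocks:
        # Local top-k for this block (the part assumed to run on the GPU).
        local = heapq.nsmallest(
            k, ((p_norm_distance(query, vec, p), row_id) for row_id, vec in block)
        )
        # Merge the local candidates into the global top-k.
        best = heapq.nsmallest(k, best + local)
    return best


if __name__ == "__main__":
    demo_blocks = [
        [(1, (0.0, 0.0)), (2, (3.0, 4.0))],
        [(3, (1.0, 0.0)), (4, (10.0, 10.0))],
    ]
    print(streaming_top_k((0.0, 0.0), demo_blocks, k=2))
```

Because each block contributes at most k candidates, the state carried between blocks stays at k entries regardless of table size, which is what allows the table to grow beyond device memory.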
Keywords :
SQL; content-based retrieval; graphics processing units; image matching; image retrieval; visual databases; CBIR; GPU hardware; GPU-based PostgreSQL extensions; GPU-based high-throughput p-norm metric functions; PostgreSQL database; data block processing; database server; database table storage size; global k-nearest neighbor matches; graphics processing unit; heterogeneous data processing techniques; large pattern data sets; large-scale content-based image retrieval; large-scale pattern matching; parameterized correlations; pattern databases; pipelining pattern matching; query pattern; relational algebra expressions; scalable high-throughput pattern matching; storage capacity 18 GB to 85 GB; stream processing design; Databases; Graphics processing units; Hardware; Kernel; Pattern matching; Timing; Vectors; PostgreSQL; graphics processing unit (GPU); heterogeneous data; high-performance computing (HPC); high-throughput computing (HTC)
Conference_Title :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
DOI :
10.1109/ICPR.2014.329