GPU Accelerated Databases

Subtitle:
Database Driven OpenCL Programming

by
Tim Child

MRSC 2011
Speaker Biography
• Tim Child
• 35 years of experience in software development
• Formerly
– VP Oracle Corporation
– VP BEA Systems Inc.
– VP Informix
– Leader at Illustra, Autodesk, Navteq, Intuit, …
• 30+ years of experience in 3D, CAD, GIS and DBMS

Outline
• Speaker Biography
• Outline
• OpenCL Programming Challenge
• Review of GPU Accelerated Databases
• Swiss Army Knife of Data
• OpenCL Bindings to PostgreSQL
• Challenges
• Example Use Cases
• Benefits of the Approach
• Q&A

OpenCL Programming Problem
• Write an OpenCL application that:
– Reads data from DBMS or File
– Publishes Results as Web Pages
– Handles Frequent Data Updates
– Data Size >> System RAM >> GPU RAM

• Possible Solutions
– C/C++ Binding using Web CGI
– Java/Perl/Python Bindings in App Server
– ???
or Database Driven GPU Programming

GPU Co-Process Architecture
[Diagram: DBMS Client → (TCP/IP) → DBMS Server → (IPC / RPC) → GPU Language Co-Process → (PCI Bus) → GPGPU DRAM; Data Tables]

Examples
• 2004 Bandi, Sun, et al.
• Many others

GPU Hosted Data Architecture
[Diagram: DBMS Client → (TCP/IP) → DBMS Server + GPU Host → (PCI Bus) → GPGPU DRAM; Data Tables, with a Copy held on the GPGPU]

Examples
• 2008 Bakkum, Skadron
• 2010 Palo OLAP
• 2010 ParStream

PostgreSQL
Swiss Army Knife of Data
• SQL (Declarative Language, Set Operations)
• Extensible Types
• Extensible Procedural Languages (Java, Perl, …)
• Rules System
• Extensible Indices
• Open Source
• Vibrant Community
• Native APIs
• Remote Data Access
• PostGIS (Vector, Raster)
• OpenCL

Procedural Language Architecture
[Diagram: DBMS Client → (TCP/IP) → DBMS Server + GPGPU Host → (PCI Bus) → GPGPU DRAM; Queries in, Results out; 10 GB RAM Cache; 10 TB Data Tables]

Examples
• 1995 Illustra/Intel
• 2010 3DMashUp

Comparison Table
                 Co-Process         GPU Hosted          Procedural
Functionality    Spatial Joins      OLAP                Generic
Queries          Specific Queries   Select statements   SQL
Data Types       SQL                SQL                 SQL + OpenCL
Performance      IPC/RPC            Direct              Direct
Scalability      Large              Limited             Large
Data Transfer    Per Query          ETL database        Per Query

OpenCL SQL Binding
CREATE OR REPLACE FUNCTION VectorAdd(IN Id int[], IN a real[], IN b real[], OUT c real[])
AS $BODY$

__kernel void VectorAdd( __global int *id,
                         __global float *a,
                         __global float *b,
                         __global float *c)
{
    int i = get_global_id(0); /* Query OpenCL for the array subscript */

    c[i] = a[i] + b[i];
}
$BODY$
LANGUAGE PgOpenCL;

SELECT VectorAdd(Id, a, b) FROM Vectors;
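
To make the per-element behaviour of the kernel concrete, here is a minimal plain-Python sketch of what each work item computes. It is an emulation only, not PgOpenCL or OpenCL code: run_kernel is a hypothetical serial stand-in for a kernel launch, and the global_id argument plays the role of get_global_id(0).

```python
def vector_add_kernel(global_id, a, b, c):
    """One work item: mirrors c[i] = a[i] + b[i] from the kernel body."""
    i = global_id  # stands in for get_global_id(0)
    c[i] = a[i] + b[i]


def run_kernel(kernel, global_size, *buffers):
    """Hypothetical serial stand-in for a launch: one call per work item."""
    for gid in range(global_size):
        kernel(gid, *buffers)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * len(a)          # output buffer, same length as the inputs

run_kernel(vector_add_kernel, len(a), a, b, c)
print(c)
```

On a real device the work items run in parallel rather than in a loop; the sketch only shows that each one touches exactly one array subscript.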

Database Driven OpenCL
[Diagram: Table (A, B) → Select Table to Array → Copy → xPU runs VectorAdd(A, B) as 100's - 1000's of Threads (Kernels), returns C = A + B → Copy → Unnest Array to Table → Table (C)]
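
The data flow on this slide can be sketched in plain Python, assuming a small in-memory list of rows in place of a database table. The helper names (table_to_arrays, vector_add, unnest_to_table) are hypothetical and chosen for illustration; they are not part of PgOpenCL.

```python
# Rows of a hypothetical Vectors table: (id, a, b)
rows = [(1, 1.0, 10.0), (2, 2.0, 20.0), (3, 3.0, 30.0)]


def table_to_arrays(rows):
    """'Select Table to Array': turn row-wise data into one array per column."""
    ids, a, b = (list(col) for col in zip(*rows))
    return ids, a, b


def vector_add(a, b):
    """Stand-in for the kernel launch: C = A + B, element by element."""
    return [x + y for x, y in zip(a, b)]


def unnest_to_table(ids, c):
    """'Unnest Array to Table': turn the result array back into rows."""
    return list(zip(ids, c))


ids, a, b = table_to_arrays(rows)
c = vector_add(a, b)
result_table = unnest_to_table(ids, c)
print(result_table)
```

The two copy steps in the diagram (host to device and back) have no counterpart here, since everything stays in host memory.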

Challenges
• Type Mapping
– Extend SQL Types with
√ • OpenCL Vector Types
√ • OpenCL Image Types
• Setup –> Runtime
√ • Caching kernel info
• Data Transfer
– CPU ↔ GPU
= • Still present
– SQL Queries
X • + ∆ Overhead ( < 4µs )
X • Map – Array
– Bulk Data Loaders
X • New Task
• Problem Size
– DBMS Table Size >> GPU RAM
– # Work Groups / # Work Items
√ • Runtime Partitioning
• Device Management
– CPU vs. GPU
√ • Runtime Selection
• Concurrency
– No Pre-emptive Multi-Tasking
X • Time-out Long Queries
X • Partitioning / Scheduling
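
As an illustration of the Problem Size items, the following Python sketch shows one possible way to split a table that exceeds GPU RAM into chunks and to derive work-group counts for each chunk. The function names and the sizes used are assumptions for illustration; the slides do not describe how PgOpenCL actually partitions at runtime.

```python
def partition(n_rows, bytes_per_row, device_bytes):
    """Split the row range [0, n_rows) into chunks that each fit in device memory."""
    rows_per_chunk = device_bytes // bytes_per_row
    if rows_per_chunk < 1:
        raise ValueError("a single row does not fit in device memory")
    return [(start, min(start + rows_per_chunk, n_rows))
            for start in range(0, n_rows, rows_per_chunk)]


def launch_dims(n_items, work_group_size):
    """Return (# work groups, padded global size) covering n_items work items."""
    groups = -(-n_items // work_group_size)   # ceiling division
    return groups, groups * work_group_size


# Hypothetical sizes: 10 rows of 12 bytes each, 48 bytes of device memory.
chunks = partition(n_rows=10, bytes_per_row=12, device_bytes=48)
dims = [launch_dims(end - start, work_group_size=4) for start, end in chunks]
print(chunks, dims)
```

When the global size is padded beyond the number of rows, as in the last chunk here, the kernel needs a bounds check so the extra work items do nothing.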

Some Example Use Cases
• 3D Content Management / GIS
– Spatial Selections
– Coordinate Transformations
– Image Processing
• Bioinformatics
– DNA & Protein Sequence Matching
• Database Internal Operations
– Joins
– Sorting
– Query Planning

Benefits of the Approach
• PostgreSQL
• OpenCL
• Database Internal Operations
• Open Source Release

Questions?
More Information
• PgOpenCL
– Twitter @3DMashUp
– Blog www.scribd.com/3dmashup
– Paper at AMD Fusion Summit, Jun 13 – 16, Seattle, USA
• OpenCL
– www.khronos.org/opencl/
– www.amd.com/us/products/technologies/stream-technology/opencl/
– http://software.intel.com/en-us/articles/intel-opencl-sdk
– http://www.nvidia.com/object/cuda_opencl_new.html
