Chapter 2

Parallel Approaches to the String Matching Problem on the GPU¹

2.1 Introduction
Classic PRAM algorithms are proving to be a fertile source of ideas for GPU programs that solve
fundamental computational problems. In practice, however, only certain PRAM algorithms actually
perform well on GPUs, because the PRAM model and the GPU programming model differ in
important ways. Three considerations are particularly important. They favor algorithms in which
parallel threads perform uniform computation without branching or waiting (“uniformity”); in
which neighboring threads access neighboring memory locations (“coalescing”); and in which
computation and memory accesses can take advantage of the computational and memory
hierarchy of the modern GPU (“hierarchy”). Although these considerations are usually foremost
in GPU implementations, as in Schatz and Trapnell [86] and Lin et al. [62], they rarely play an
important role in the process of algorithm design. In other words, they are treated as
optimization opportunities for programmers rather than as desirable features of the
algorithm itself.
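To make the coalescing consideration concrete, the following sketch counts how many memory segments a single warp touches under two access patterns. It is a simplified model rather than a measurement: the warp size of 32 threads, the 4-byte element size, and the 128-byte segment size are assumptions chosen to resemble common NVIDIA hardware, and the name `segments_touched` is hypothetical.

```python
WARP_SIZE = 32        # assumed number of threads per warp
ELEM_BYTES = 4        # assumed element size (a 32-bit word)
SEGMENT_BYTES = 128   # assumed granularity of one memory transaction


def segments_touched(stride):
    """Count distinct memory segments hit when thread t reads element t * stride."""
    addresses = [t * stride * ELEM_BYTES for t in range(WARP_SIZE)]
    return len({addr // SEGMENT_BYTES for addr in addresses})


# Neighboring threads read neighboring elements: one segment serves the warp.
coalesced = segments_touched(1)

# Each thread reads the start of its own 32-element chunk: one segment per thread.
strided = segments_touched(32)
```

Under these assumptions, the strided pattern, which arises naturally when each thread scans its own contiguous chunk of the text, touches 32 times as many segments as the coalesced one.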
In this chapter we address this issue through a case study. We consider the exact string
matching problem, which is interesting in that it admits several different styles of parallelization.
We focus on a classical PRAM algorithm of Karp and Rabin [49]. Forty years of research
¹ This chapter substantially appeared as “Parallel Approaches to the String Matching Problem on the GPU”,
published at SPAA 2016 [5], for which I was the first author and responsible for most of the research and writing.
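As background for the Karp and Rabin algorithm named above, here is a minimal sequential sketch of the rolling-fingerprint idea. It is not the parallel GPU formulation studied in this chapter. The function name `rabin_karp`, the base of 256, and the modulus are illustrative assumptions.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return all start indices where pattern occurs in text (sequential sketch)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(base, m - 1, mod)  # weight of the window's leading character

    # Fingerprints of the pattern and of the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    matches = []
    for s in range(n - m + 1):
        # Fingerprints can collide, so confirm each candidate by direct comparison.
        if p_hash == t_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:
            # Slide the window: drop text[s], then append text[s + m].
            t_hash = ((t_hash - ord(text[s]) * high) * base + ord(text[s + m])) % mod
    return matches
```

For example, `rabin_karp("abracadabra", "abra")` returns `[0, 7]`. Each window's fingerprint is derived from the previous one in constant time, which is what makes fingerprint comparison cheaper than comparing every window character by character.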
