With these irregular tasks in mind, we propose using different granularities for work assignment and for processing. We use per-thread work assignment followed by per-warp processing: each thread is still responsible for an independent task, but all threads within a warp now cooperate to perform these tasks together, one at a time, until all have been processed. Using this strategy, we design a dynamic hash table for the GPU, the slab hash, which is a fully concurrent data structure supporting asynchronous updates and search queries: threads may have different operations to perform, and each operation might require an unknown amount of time to complete. By following our warp-cooperative strategy, all threads help each other perform these operations, yielding much higher warp efficiency than traditional schemes that conflate work assignment and processing at the same granularity.
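The warp-cooperative pattern described above can be illustrated with a minimal CUDA sketch. This is not the slab hash itself: as a simplifying assumption, the shared data structure is a flat array of keys, and the only operation is a search. The names `warp_cooperative_search` and `search_on_gpu` are hypothetical and chosen for illustration. Each thread holds one query (per-thread assignment); the warp then uses a ballot to build a queue of pending tasks, broadcasts one task at a time with a shuffle, and has all 32 lanes inspect the array together (per-warp processing).

```cuda
#include <cassert>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

constexpr unsigned kFullMask = 0xFFFFFFFFu;
constexpr int kWarpSize = 32;
constexpr int kNotFound = -1;

// Hypothetical kernel: one query per thread, processed one at a time by the whole warp.
__global__ void warp_cooperative_search(const uint32_t* table, int table_size,
                                        const uint32_t* queries, int* results,
                                        int num_queries) {
  assert(blockDim.x % kWarpSize == 0);  // the sketch assumes full warps
  const int tid = blockIdx.x * blockDim.x + threadIdx.x;
  const int lane = threadIdx.x & (kWarpSize - 1);

  // Per-thread work assignment: each thread owns at most one independent task.
  // Threads without a task do not exit; they stay to help the rest of the warp.
  bool has_work = tid < num_queries;
  const uint32_t my_key = has_work ? queries[tid] : 0u;
  int my_result = kNotFound;

  // Per-warp processing: the ballot forms a bitmask "work queue" of pending lanes.
  unsigned work_queue;
  while ((work_queue = __ballot_sync(kFullMask, has_work)) != 0u) {
    const int src_lane = __ffs(work_queue) - 1;  // next pending task
    const uint32_t key = __shfl_sync(kFullMask, my_key, src_lane);

    // All lanes cooperate on this one task: each lane checks one element of
    // every 32-element chunk, so memory accesses are coalesced.
    int found = kNotFound;
    for (int base = 0; base < table_size && found == kNotFound; base += kWarpSize) {
      const int idx = base + lane;
      const bool hit = idx < table_size && table[idx] == key;
      const unsigned hits = __ballot_sync(kFullMask, hit);
      if (hits != 0u) found = base + __ffs(hits) - 1;  // same value on every lane
    }

    // Only the owner of the task records the answer and leaves the queue.
    if (lane == src_lane) {
      my_result = found;
      has_work = false;
    }
  }
  if (tid < num_queries) results[tid] = my_result;
}

// Hypothetical host helper; CUDA error checking is omitted for brevity.
std::vector<int> search_on_gpu(const std::vector<uint32_t>& table,
                               const std::vector<uint32_t>& queries) {
  const int n = static_cast<int>(queries.size());
  std::vector<int> results(n, kNotFound);
  if (n == 0) return results;

  uint32_t* d_table = nullptr;
  uint32_t* d_queries = nullptr;
  int* d_results = nullptr;
  cudaMalloc(&d_table, table.size() * sizeof(uint32_t));
  cudaMalloc(&d_queries, n * sizeof(uint32_t));
  cudaMalloc(&d_results, n * sizeof(int));
  cudaMemcpy(d_table, table.data(), table.size() * sizeof(uint32_t),
             cudaMemcpyHostToDevice);
  cudaMemcpy(d_queries, queries.data(), n * sizeof(uint32_t),
             cudaMemcpyHostToDevice);

  const int threads = 128;  // a multiple of the warp size
  const int blocks = (n + threads - 1) / threads;
  warp_cooperative_search<<<blocks, threads>>>(
      d_table, static_cast<int>(table.size()), d_queries, d_results, n);

  // This blocking copy also waits for the kernel to finish.
  cudaMemcpy(results.data(), d_results, n * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_table);
  cudaFree(d_queries);
  cudaFree(d_results);
  return results;
}
```

Calling `search_on_gpu(table, queries)` returns, for each query, the index of its first match in `table`, or `-1` if the key is absent. The point of the sketch is the control flow: because the loop condition and the chunk scan depend only on warp-uniform values, no lane diverges while a task is being processed, which is the source of the warp-efficiency gain claimed above.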