Table Of Contents

What Is This Document?
Who Should Read This Guide?
1.1 Differences Between Host and Device
1.2 What Runs on an OpenCL-Enabled Device?
1.3 Maximum Performance Benefit
2.1.1 Using CPU Timers
2.1.2 Using OpenCL GPU Timers
2.2.1 Theoretical Bandwidth Calculation
2.2.2 Effective Bandwidth Calculation
3.1 Data Transfer Between Host and Device
3.1.1 Pinned Memory
3.2.1.1 A Simple Access Pattern
3.2.1.2 A Sequential but Misaligned Access Pattern
3.2.1.3 Effects of Misaligned Accesses
3.2.1.4 Strided Accesses
3.2.2.1 Shared Memory and Memory Banks
3.2.2.2 Shared Memory in Matrix Multiplication (C = AB)
3.2.2.3 Shared Memory in Matrix Multiplication (C = AA^T)
3.2.2.4 Shared Memory Use by Kernel Arguments
3.2.3 Local Memory
3.2.4.1 Textured Fetch vs. Global Memory Read
3.2.6.1 Register Pressure
4.2 Calculating Occupancy
4.3 Hiding Register Dependencies
4.4 Thread and Block Heuristics
4.5 Effects of Shared Memory
5.1 Arithmetic Instructions
5.1.1 Division and Modulo Operations
5.1.2 Reciprocal Square Root
5.1.4 Math Libraries
5.2 Memory Instructions
6.2 Branch Predication
A.2 High-Priority Recommendations
A.3 Medium-Priority Recommendations
A.4 Low-Priority Recommendations
NVIDIA_OpenCL_BestPracticesGuide

Published by: targezzedd on Jun 27, 2011
Copyright: Attribution Non-commercial
