CUDA Basics for Engineering Students

Faculty of Engineering & Technology

High Performance Computing Laboratory


(203105430)
B. Tech CSE 4th Year 7th Semester

PRACTICAL: 09

AIM: Write a simple CUDA program to print “Hello World!”

 What is CUDA programming?

CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) model created by NVIDIA. It allows developers to harness the computational power of NVIDIA GPUs (Graphics Processing Units) for general-purpose processing, beyond just graphics rendering.

 Logical architecture of GPU

1. Grids :
A grid is the highest-level grouping of threads scheduled for execution on
the GPU device. It represents the entire set of parallel work that needs to
be processed by the GPU.


2. Blocks :
A block is a group of threads that execute concurrently on a streaming
multiprocessor (SM). Threads within the same block can cooperate with
each other through shared memory and synchronization mechanisms.

3. Warps :
A warp is the smallest scheduling unit in CUDA. It consists of 32
consecutive threads that are executed in lockstep on an SM, meaning that
all 32 threads within a warp execute the same instruction at the same
time.

4. Threads :
A thread is the basic unit of execution in CUDA. Threads are organized
into groups called thread blocks, and multiple thread blocks are
organized into a grid, as the sketch after this list illustrates.
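Inside a kernel, this hierarchy is visible through built-in variables: threadIdx gives a thread's index within its block, blockIdx gives the block's index within the grid, and blockDim gives the number of threads per block. The short sketch below (the file name and launch configuration are illustrative assumptions, not part of this practical) prints these indices and the global thread index computed from them:

// index_demo.cu -- illustrative sketch of the grid/block/thread hierarchy.
#include <stdio.h>

__global__ void show_ids() {
    // threadIdx.x : index of this thread within its block
    // blockIdx.x  : index of this block within the grid
    // blockDim.x  : number of threads per block
    int global_id = blockIdx.x * blockDim.x + threadIdx.x;
    printf("block %d, thread %d -> global thread %d\n",
           blockIdx.x, threadIdx.x, global_id);
}

int main() {
    show_ids<<<2, 4>>>();      // grid of 2 blocks, 4 threads each (8 threads total)
    cudaDeviceSynchronize();   // wait for the kernel so its printf output appears
    return 0;
}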

 CUDA program execution flow.


Steps :

1. Copy input data from CPU (host) memory to GPU (device) memory.
2. Launch the kernel and execute it on the GPU.
3. Copy the results back from GPU memory to CPU memory.
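As a minimal sketch of these three steps (the kernel, array size, and file name are illustrative assumptions, not part of this practical), the program below copies an array to the GPU, doubles each element there, and copies the result back to the CPU:

// flow_demo.cu -- minimal sketch of the CPU -> GPU -> CPU execution flow.
#include <stdio.h>

__global__ void double_elements(int *d_arr, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d_arr[i] *= 2;    // each thread doubles one element
}

int main() {
    const int n = 8;
    int h_arr[n] = {1, 2, 3, 4, 5, 6, 7, 8};   // host (CPU) data
    int *d_arr;

    cudaMalloc((void**)&d_arr, n * sizeof(int));                       // allocate GPU memory
    cudaMemcpy(d_arr, h_arr, n * sizeof(int),
               cudaMemcpyHostToDevice);                                // step 1: CPU -> GPU

    double_elements<<<1, n>>>(d_arr, n);                               // step 2: execute on GPU

    cudaMemcpy(h_arr, d_arr, n * sizeof(int),
               cudaMemcpyDeviceToHost);                                // step 3: GPU -> CPU
    cudaFree(d_arr);

    for (int i = 0; i < n; i++) printf("%d ", h_arr[i]);               // prints 2 4 6 ... 16
    printf("\n");
    return 0;
}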

 CUDA program to print hello world.

%%writefile p1.cu

#include <stdio.h>

// Kernel: runs on the GPU; every launched thread prints the message once.
__global__ void cuda_hello() {
    printf("Hello World!\n");
}

int main() {
    cuda_hello<<<1, 5>>>();    // launch 1 block of 5 threads
    cudaDeviceSynchronize();   // wait for the kernel so its output is flushed
    return 0;
}
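Assuming the file is written from a notebook cell with %%writefile (as above) and the environment provides the NVIDIA compiler nvcc, it can be compiled and run with:

!nvcc p1.cu -o p1
!./p1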

 Output: "Hello World!" is printed five times, once by each of the 5 threads in the single block.


 "Hello World" from different numbers of blocks.

%%writefile p2.cu

#include <stdio.h>

// Kernel: each launched thread prints the greeting once.
__global__ void cuda_hello() {
    printf("Good morning PU\n");
}

int main() {
    cuda_hello<<<2, 5>>>();    // launch 2 blocks of 5 threads each (10 threads total)
    cudaDeviceSynchronize();   // wait for the kernel so its output is flushed
    return 0;
}

 Output: "Good morning PU" is printed ten times, once by each of the 10 threads (2 blocks of 5 threads).
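To see which block and thread each message comes from, the kernel can be extended to print the built-in indices. The sketch below (the file name p3.cu and the message format are illustrative assumptions, not part of the original practical) keeps the same <<<2,5>>> launch:

%%writefile p3.cu

#include <stdio.h>

// Kernel: each thread reports the block and thread it belongs to.
__global__ void cuda_hello_ids() {
    printf("Good morning PU from block %d, thread %d\n",
           blockIdx.x, threadIdx.x);
}

int main() {
    cuda_hello_ids<<<2, 5>>>();   // 2 blocks x 5 threads = 10 messages
    cudaDeviceSynchronize();      // flush device-side printf before exiting
    return 0;
}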

