Welcome to Scribd!

Gpu Cuda Part1

Uploaded by

0% found this document useful (0 votes)

6 views27 pages

This document provides an introduction to GPUs and CUDA programming. It discusses the evolution of GPU microarchitectures from early graphics accelerators to modern GPUs with programmable hardware. GPUs have thousands of cores devoted to highly parallel computation and follow a single-program multiple-data processing model. The document outlines GPU terminology and components like streaming multiprocessors and execution cores. It also compares CPU and GPU architectures and provides examples of NVIDIA GPU microarchitectures from Fermi to Volta over several generations.

Original Description:

Original Title

Gpu Cuda Part1 (1)

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

6 views27 pages

Gpu Cuda Part1

Uploaded by

Raghav Ganesh

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 27

Search inside document

IT301: INTRODUCTION TO

CUDA
By,
Ms. Thanmayee
Adhoc Faculty,
Department of IT,
NITK, Surathkal
OUTLINE
● Introduction to GPU
● Evolution of GPU microarchitectures
● General Purpose GPU
● Introduction to CUDA
● CUDA Execution Model
● CUDA Memory Model
● Steps in GPU Execution
● Hello World Program
● CUDA Device Variables
● CUDA Programming examples
Let's learn about GPU..

● A little history:
− The first GPUs were designed as graphics accelerators
● supported only specific fixed-function pipelines.
− In the late 1990s, the hardware became increasingly programmable
● Culminating in NVIDIA's first GPU in 1999.
Graphics Processing Unit

● Has thousands of cores and ALUs.

● They can handle billions of repetitive low level tasks.

● GPU is specialized for compute-intensive, highly parallel computation.

● They are devoted to data processing rather than data caching and flow
control.

● Follows SPMD processing model.

CPU versus GPU
More closer look at GPU:
GPU Terms:
● Stream processing -- Term used to denote processing of a stream of
instructions operating in a data parallel fashion.
● Stream Processors (SPs) – the execution cores that will execute the
stream. Each stream processor has compute resources such as register
file, instruction scheduler
● Streaming multiprocessors (SMs) -- groups of streaming processors
that shares control logic and cache.
GPU Microarchitectures
Fermi Architecture (2010)

https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
What each SMs
have..

● Memory operations are handled by a set of 16 load-store units in

each SM.
● A set of four Special Function Units (SFUs) is also available to
handle transcendental and other special operations such as sin,
cos, exp, and rcp (reciprocal)
● Along with the group of 16 load-store units and the four SFUs,
there are four execution blocks per SM.
○ 16 + 16 Cores (2 blocks)
○ LD/ST
○ SFU
A total of 32 instructions from one or two warps can be dispatched in
each cycle to any two of the four execution blocks within a Fermi SM
Kepler (2012)
Fermi Versus
Kepler
Maxwell (2014)
Pascal (2016)
Fermi : GTX480
GTX580

Kepler : GTX 680

GTX 780
GTX 780 Ti

Maxwell: GTX 980

GTX 980 Ti

Pascal : GTX 1080

Source : https://www.techspot.com/article/1191-nvidia-geforce-six-generations-tested/
Volta micro architecture (2017)
NVIDIA
Tesla GPUs
THANK YOU

Sega Saturn Architecture: Architecture of Consoles: A Practical Analysis, #5
From Everand
Sega Saturn Architecture: Architecture of Consoles: A Practical Analysis, #5
Rodrigo Copetti
No ratings yet
Dreamcast Architecture: Architecture of Consoles: A Practical Analysis, #9
From Everand
Dreamcast Architecture: Architecture of Consoles: A Practical Analysis, #9
Rodrigo Copetti
No ratings yet
1 Cuda
Document173 pages
1 Cuda
Diego Canales Aguilera
No ratings yet
GPU Programming: Dr. Florian Ferreira
Document101 pages
GPU Programming: Dr. Florian Ferreira
Slal Opza
No ratings yet
2021 08 26 High Performance GPU Tensor CoreCode Generation For Matmul Using MLIR
Document57 pages
2021 08 26 High Performance GPU Tensor CoreCode Generation For Matmul Using MLIR
aniwats
No ratings yet
Introduction To GP-GPU and CUDA: High Performance Computing Center Hanoi University of Science & Technology
Document43 pages
Introduction To GP-GPU and CUDA: High Performance Computing Center Hanoi University of Science & Technology
Mato Nguyễn
No ratings yet
Lecture 0: Cpus and Gpus: Prof. Mike Giles
Document36 pages
Lecture 0: Cpus and Gpus: Prof. Mike Giles
Aashish
No ratings yet
Gpu Programming
Document96 pages
Gpu Programming
Jino Goju Stark
100% (2)
Architectural Details of Tesla GPU Microarchitecture
Document9 pages
Architectural Details of Tesla GPU Microarchitecture
shreya mehta
No ratings yet
Programming Gpus With Cuda: John Mellor-Crummey
Document42 pages
Programming Gpus With Cuda: John Mellor-Crummey
askbilladdmicrosoft
No ratings yet
Lecture 1: An Introduction To CUDA: Mike Giles
Document247 pages
Lecture 1: An Introduction To CUDA: Mike Giles
Stanislav Spatari
No ratings yet
Lecture 1: An Introduction To CUDA: Mike Giles
Document40 pages
Lecture 1: An Introduction To CUDA: Mike Giles
sdancer75
No ratings yet
L 3 GPU
Document33 pages
L 3 GPU
fdfs
No ratings yet
SaM2 2008
Document6 pages
SaM2 2008
Umer Ali
No ratings yet
Gpu Cuda Part2
Document15 pages
Gpu Cuda Part2
Raghav Ganesh
No ratings yet
Lec 1
Document27 pages
Lec 1
foof faaf
No ratings yet
Gpu1 - GPU Introduction
Document20 pages
Gpu1 - GPU Introduction
Richik Dutta
No ratings yet
Introduction To Gpu Programming With Cuda and Openacc
Document40 pages
Introduction To Gpu Programming With Cuda and Openacc
plop
No ratings yet
Lecture - 01 - CUDA Programming
Document52 pages
Lecture - 01 - CUDA Programming
Denis
No ratings yet
Chap6 Heter Computing
Document22 pages
Chap6 Heter Computing
Michael Shi
No ratings yet
Gpgpu Final
Document124 pages
Gpgpu Final
Sibghat Rehman
No ratings yet
CUDA Execution Model
Document67 pages
CUDA Execution Model
AbiMughal
No ratings yet
GPU Introduction
Document52 pages
GPU Introduction
spark1122
No ratings yet
Parallel Processing With Cuda
Document25 pages
Parallel Processing With Cuda
Sudip Adhikari
No ratings yet
GPU in Supercomputer
Document7 pages
GPU in Supercomputer
Chinmai Panibathe
No ratings yet
CUDA Programming On Nvidia Gpus: Mike Giles
Document21 pages
CUDA Programming On Nvidia Gpus: Mike Giles
proxymo1
No ratings yet
Side-Channel Power Analysis of A GPU AES Implementation: Chao Luo, Yunsi Fei, Pei Luo, Saoni Mukherjee, David Kaeli
Document8 pages
Side-Channel Power Analysis of A GPU AES Implementation: Chao Luo, Yunsi Fei, Pei Luo, Saoni Mukherjee, David Kaeli
Anupam Das
No ratings yet
Accelerating Large Graph Algorithms On The GPU Using CUDA
Document12 pages
Accelerating Large Graph Algorithms On The GPU Using CUDA
Maks Mržek
No ratings yet
TFM - Unfinished
Document17 pages
TFM - Unfinished
obssidian
No ratings yet
GPU Basics
Document93 pages
GPU Basics
Maheshkumar Amula
No ratings yet
NNSW Tools
Document39 pages
NNSW Tools
Fernando Cisneros
No ratings yet
Data-Level Parallelism in Vector, SIMD, And: GPU Architectures
Document29 pages
Data-Level Parallelism in Vector, SIMD, And: GPU Architectures
Marcelo Araujo
No ratings yet
2 - Introduction To The GPU
Document3 pages
2 - Introduction To The GPU
olia.92
No ratings yet
Nvidia Cuda
Document26 pages
Nvidia Cuda
Arpit Vijayvergia
No ratings yet
GPUProgramming Talk
Document18 pages
GPUProgramming Talk
Ramu
No ratings yet
GPUs From NVIDIA Have Many Processors
Document1 page
GPUs From NVIDIA Have Many Processors
Shruti Arora
No ratings yet
Knowledge Management Notetaking Community Websites Intranets Ward Cunningham Wikiwikiweb Wiki ( Wiki) Hawaiian
Document2 pages
Knowledge Management Notetaking Community Websites Intranets Ward Cunningham Wikiwikiweb Wiki ( Wiki) Hawaiian
meridas
No ratings yet
1 Introduction
Document45 pages
1 Introduction
R
No ratings yet
CPUs GPUs Accelerators
Document22 pages
CPUs GPUs Accelerators
Kevin William Daniels
No ratings yet
Why GPU?: CS8803SC Software and Hardware Cooperative Computing
Document14 pages
Why GPU?: CS8803SC Software and Hardware Cooperative Computing
Sohei La
No ratings yet
From CPU To GPU With CUDA C Language: Michele Tuttafesta Dottorato Di Ricerca in Fisica 25 Ciclo
Document71 pages
From CPU To GPU With CUDA C Language: Michele Tuttafesta Dottorato Di Ricerca in Fisica 25 Ciclo
Andreina Dell'olio
No ratings yet
A Brief Overview of The Graphics Pipeline: Cedric Lee
Document33 pages
A Brief Overview of The Graphics Pipeline: Cedric Lee
jkonduru
No ratings yet
Product Availability Update: Processamento Paralelo em GPU's Na Arquitetura Fermi
Document44 pages
Product Availability Update: Processamento Paralelo em GPU's Na Arquitetura Fermi
Vinícius Honda
100% (1)
4 - Key Concepts
Document2 pages
4 - Key Concepts
olia.92
No ratings yet
GCN Architecture Whitepaper
Document18 pages
GCN Architecture Whitepaper
Илија Петровић
No ratings yet
Multiprocessors and Linux: Krzysztof Lichota Lichota@mimuw - Edu.pl
Document30 pages
Multiprocessors and Linux: Krzysztof Lichota Lichota@mimuw - Edu.pl
vinayaka29
No ratings yet
Teslapersonalsupercomputer 160201192005
Document16 pages
Teslapersonalsupercomputer 160201192005
Naveen kumar
No ratings yet
CUDA Wikipedia
Document10 pages
CUDA Wikipedia
Muktikanta Sahu
No ratings yet
NVIDIA GPU Computing - A Journey From PC Gaming To Deep Learning
Document91 pages
NVIDIA GPU Computing - A Journey From PC Gaming To Deep Learning
Aldo Chella
No ratings yet
CST GPU Flyer PDF
Document2 pages
CST GPU Flyer PDF
NIELS DIMAS MELGAR BUENDIA
No ratings yet
Parallel & Distributed Computing Report
Document4 pages
Parallel & Distributed Computing Report
Umer Ali
No ratings yet
Cuda PDF
Document18 pages
Cuda PDF
Xemen
No ratings yet
Elc2009 Qemu Cris
Document43 pages
Elc2009 Qemu Cris
ma haijun
No ratings yet
Whitepaper NVIDIA's Next Generation CUDA Compute Architecture
Document21 pages
Whitepaper NVIDIA's Next Generation CUDA Compute Architecture
Centrale3D
No ratings yet
Accelerating Large Graph Algorithms On The GPU Using Cuda
Document12 pages
Accelerating Large Graph Algorithms On The GPU Using Cuda
abc010
No ratings yet
GPU Cluster4
Document31 pages
GPU Cluster4
Ismar Santos
No ratings yet
Dept. of Electrical Engineering COMSATS University Islamabad Fall 2018
Document26 pages
Dept. of Electrical Engineering COMSATS University Islamabad Fall 2018
Muhammad shahzaib
No ratings yet
CUDA Cuts Fast Graph Cuts On The GPU
Document8 pages
CUDA Cuts Fast Graph Cuts On The GPU
as.dcdvfd
No ratings yet
PLC: Programmable Logic Controller – Arktika.: EXPERIMENTAL PRODUCT BASED ON CPLD.
From Everand
PLC: Programmable Logic Controller – Arktika.: EXPERIMENTAL PRODUCT BASED ON CPLD.
Franco Mario
No ratings yet
Nintendo 64 Architecture: Architecture of Consoles: A Practical Analysis, #8
From Everand
Nintendo 64 Architecture: Architecture of Consoles: A Practical Analysis, #8
Rodrigo Copetti
No ratings yet
ME352 Lecture 4 Cams
Document43 pages
ME352 Lecture 4 Cams
Raghav Ganesh
No ratings yet
Kinematic and Dynamic Analysis of Cam and Follower
Document12 pages
Kinematic and Dynamic Analysis of Cam and Follower
Raghav Ganesh
No ratings yet
Gpu Cuda Part2
Document15 pages
Gpu Cuda Part2
Raghav Ganesh
No ratings yet
It301: Introduction To Cuda: By, Ms. Thanmayee Adhoc Faculty, Department of IT, NITK, Surathkal
Document13 pages
It301: Introduction To Cuda: By, Ms. Thanmayee Adhoc Faculty, Department of IT, NITK, Surathkal
Raghav Ganesh
No ratings yet
Cam Kinematics - Cam Profile
Document12 pages
Cam Kinematics - Cam Profile
Raghav Ganesh
No ratings yet
Academic Calendar Jan July 2020 Revised
Document2 pages
Academic Calendar Jan July 2020 Revised
Raghav Ganesh
No ratings yet
Details of Minor Course For 2019 Batch
Document9 pages
Details of Minor Course For 2019 Batch
Raghav Ganesh
No ratings yet
Single and Three Phase Inverters: World Class Power Solutions
Document8 pages
Single and Three Phase Inverters: World Class Power Solutions
Michael Besa
No ratings yet
Dell 13321-1
Document104 pages
Dell 13321-1
Carlos Alberto Miranda Perez
No ratings yet
Z490 Gaming X Ax Z490 Gaming X: User's Manual
Document52 pages
Z490 Gaming X Ax Z490 Gaming X: User's Manual
HeberDiaz
No ratings yet
5.4 UAIS DEBEG 3400, Setting-to-Work and Configuration: Section 5.4.1 5.4.2 5.4.3
Document4 pages
5.4 UAIS DEBEG 3400, Setting-to-Work and Configuration: Section 5.4.1 5.4.2 5.4.3
Balasaheb Adhalrao
No ratings yet
INVENTEC KRUG (KRUG 14,15 - UMA) WS BUILD 6050A2296601 REV X01 - Dell Latitude E5420 PDF
Document106 pages
INVENTEC KRUG (KRUG 14,15 - UMA) WS BUILD 6050A2296601 REV X01 - Dell Latitude E5420 PDF
Bala Murugan
No ratings yet
TravelMate 8372 Series Inventec BAP30 - BXP30
Document41 pages
TravelMate 8372 Series Inventec BAP30 - BXP30
Micropc Technology Solution
No ratings yet
Toshiba INVENTEC DAKAR10ABX CS Dk10ABXG 6050A2509701 A01 Schematic Diagram
Document57 pages
Toshiba INVENTEC DAKAR10ABX CS Dk10ABXG 6050A2509701 A01 Schematic Diagram
Service Info
No ratings yet
Category ID Product Quantity Unit Price Total
Document2 pages
Category ID Product Quantity Unit Price Total
Mohammad Ruman
No ratings yet
Steganography: Watermarking by
Document38 pages
Steganography: Watermarking by
luong tran
No ratings yet
Pantalla DMG12800T070 40WTC Datasheet1
Document6 pages
Pantalla DMG12800T070 40WTC Datasheet1
EAKIT SPAIN
No ratings yet
Single Mode Family: Data Sheet 09/97
Document12 pages
Single Mode Family: Data Sheet 09/97
Сергей Колосов
No ratings yet
Effect of Die Bonding On The Performance of High Power Semiconductor Laser Diode
Document4 pages
Effect of Die Bonding On The Performance of High Power Semiconductor Laser Diode
cjpanchal
No ratings yet
Pdf&rendition 1
Document4 pages
Pdf&rendition 1
Rakesh Sharma
No ratings yet
CCNP Switch, Ch.6 NOTES
Document5 pages
CCNP Switch, Ch.6 NOTES
Alexander Solano De la cruz
No ratings yet
Esktop: Pulse Oximeter
Document3 pages
Esktop: Pulse Oximeter
Brevas Cucho
No ratings yet
CF Program No 1, 8
Document16 pages
CF Program No 1, 8
Kuldeep Batra
No ratings yet
Inverter Motor Trainer: - Mechatronics
Document1 page
Inverter Motor Trainer: - Mechatronics
atif
No ratings yet
Fractal Design
Document4 pages
Fractal Design
Crash Zix
No ratings yet
® Low-Loss Foam-Dielectric Coaxial Cable: 1/2" Cellflex
Document2 pages
® Low-Loss Foam-Dielectric Coaxial Cable: 1/2" Cellflex
Mateus Lucas de Campos e Silva
No ratings yet
H5 Gantry Theory
Document52 pages
H5 Gantry Theory
Emmanuel Virtudazo
No ratings yet
EE-222 Microprocessor Systems: DE-41 (EE) Syn C
Document112 pages
EE-222 Microprocessor Systems: DE-41 (EE) Syn C
Osama Ikram
No ratings yet
MR-1CP-NWIM Student Guide
Document502 pages
MR-1CP-NWIM Student Guide
Akram Khan
100% (1)
Activity, Class, Sequence, Collaboration, Deployment Diagram
Document5 pages
Activity, Class, Sequence, Collaboration, Deployment Diagram
Patrick D Cerna
No ratings yet
Ex 602
Document2 pages
Ex 602
dinesh pratapsingh
No ratings yet
Open Silicon - Factiva
Document46 pages
Open Silicon - Factiva
Abdelwahed
No ratings yet
NFS-320C: Intelligent Addressable Fire Alarm System
Document8 pages
NFS-320C: Intelligent Addressable Fire Alarm System
eduardo gonzalezav
No ratings yet
SB1620CT - SB16100CT: 16A Dual Schottky Barrier Rectifier
Document4 pages
SB1620CT - SB16100CT: 16A Dual Schottky Barrier Rectifier
Jean Gonzalez
No ratings yet
2023 11 15 18 44 49 ORDENADOR-PC Log
Document404 pages
2023 11 15 18 44 49 ORDENADOR-PC Log
Triust XD
No ratings yet
Celsius Trobleshooting
Document32 pages
Celsius Trobleshooting
Eduardo Saul Mendoza
No ratings yet
Dokumen - Tips - Rectifier Hariff Apr 3g Sys 48 Volt Tampilan Menu
Document16 pages
Dokumen - Tips - Rectifier Hariff Apr 3g Sys 48 Volt Tampilan Menu
Donking Thotosik
100% (1)