
Interfacing and Peripheral Techniques (Teknik Antarmuka dan Periferal)

Session 7
TEI302
What is a GPU?
• A GPU is a Graphics Processing Unit
• First introduced in 1999
• First non-graphical applications appeared in 2003
• Early GPUs were awkward to use for non-graphical applications
– i.e. the programmer needed knowledge of the graphics API and
the GPU architecture
• Rendering is the generation of an image from a model by a computer
program
• The image model is a 3D object described in a structured data format
or language, containing geometry, viewpoint, texture, lighting and
shading information
Texture Mapping
• A process that layers a specific surface image onto a 3D model
• The resulting 3D image looks more realistic and lively
• Fundamentally, a texture consists of 2D images arranged in a
specific pattern
• The texture wraps around the 3D object to form a new object
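The idea of wrapping a 2D image around a surface can be sketched as a lookup: each surface point carries a pair of normalised coordinates (u, v) that selects one texel. This is a minimal sketch only; the function name `sample_texture`, the `CHECKER` data and the nearest-neighbour rule are assumptions for illustration, not something taken from the slides.

```python
def sample_texture(texture, u, v):
    """Hypothetical helper: nearest-neighbour lookup at (u, v) in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    # Clamp so coordinates slightly outside [0, 1] still hit a valid texel.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


# A 2x2 checker pattern standing in for the 2D texture image.
CHECKER = [
    ["black", "white"],
    ["white", "black"],
]

# The four corners of a textured quad, each with its own (u, v) pair.
corner_colors = [
    sample_texture(CHECKER, u, v)
    for (u, v) in [(0, 0), (1, 0), (0, 1), (1, 1)]
]
```

Real GPUs usually filter between neighbouring texels (for example bilinear filtering) rather than picking a single one; nearest-neighbour is used here only because it is the shortest correct form of the lookup.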
Texture Mapping Example
Shader
• A program that defines the properties of a 3D surface
– Light absorption
– Diffusion
– Reflection
– Shading effect
• Written in a ‘shading language’
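As a rough illustration of what such a surface program computes, the sketch below implements Lambertian (diffuse) reflection in plain Python. Real shaders are written in a shading language and run on the GPU; the names `normalize` and `diffuse_shade` and the `reflectance` parameter are assumptions made for this example.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def diffuse_shade(normal, light_dir, base_color, reflectance=1.0):
    """Hypothetical diffuse shader: brightness follows the cosine of the
    angle between the surface normal and the direction to the light."""
    n = normalize(normal)
    l = normalize(light_dir)
    # Surfaces facing away from the light receive no diffuse light.
    intensity = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(reflectance * intensity * c for c in base_color)


facing = diffuse_shade((0, 0, 1), (0, 0, 1), (1.0, 0.5, 0.0))
away = diffuse_shade((0, 0, 1), (0, 0, -1), (1.0, 0.5, 0.0))
```

A surface that faces the light keeps its full base colour, and one that faces away goes black; absorption and specular reflection would be further terms added to the same function.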
Rendering
• The final process that displays a completed 3D object with all its properties,
e.g. lighting effects, to produce a realistic image
• Commonly used in video games, simulators, movies,
special effects and architectural design visualization
Graphics Card

• An add-on card in a computer that produces the pixels for displaying
the images resulting from CPU computation
• Components shown on the card:
– GPU
– Video memory
– Cooler (heatsink + fan)
– Interface: ISA/PCI/PCI-X/AGP/PCIe
GPU: Graphics Pipeline
[Figure: fixed-function graphics pipeline, controlled by the graphics state]
Stages: Application (CPU) → Transform & Light → Assemble Primitives → Rasterize → Shade → Video Memory (Textures) (all GPU), plus a render-to-texture path
Data between stages: Vertices (3D) → Transformed, Lit Vertices (2D) → Screenspace triangles (2D) → Fragments (pre-pixels) → Final Pixels (Color, Depth)

The Development of GPU: Modern Graphics Pipeline
[Figure: the same pipeline with two stages made programmable]
Stages: Application (CPU) → Vertex Processor (in place of Transform & Light) → Assemble Primitives → Rasterize → Fragment Processor (in place of Shade) → Video Memory (Textures) (all GPU), plus a render-to-texture path
Data between stages: Vertices (3D) → Transformed, Lit Vertices (2D) → Screenspace triangles (2D) → Fragments (pre-pixels) → Final Pixels (Color, Depth)
• Programmable vertex processor!
• Programmable pixel processor!
Advancement of GPU: Modern Graphics Pipeline
[Figure: the same pipeline with a programmable geometry stage]
Stages: Application (CPU) → Vertex Processor → Geometry Processor (in place of Assemble Primitives) → Rasterize → Fragment Processor → Video Memory (Textures) (all GPU), plus a render-to-texture path
Data between stages: Vertices (3D) → Transformed, Lit Vertices (2D) → Screenspace triangles (2D) → Fragments (pre-pixels) → Final Pixels (Color, Depth)
• Programmable primitive assembly!
• More flexible memory access!
GPU : Computation Power
GPU: Computation Capacity
Why do GPUs keep getting faster?
• By design, it is easier for a GPU to devote additional transistors to
computation
• The economics: the graphics market, especially video gaming,
is huge

Graphics Characteristics
• Requires massive computation
• Highly parallel
• The graphics pipeline is designed to reduce interdependent
operations
• GPUs accommodate massive parallelism by providing many ALUs for
arithmetic, and are built to stream the huge quantities of data
involved
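The lack of interdependence is what makes graphics work easy to parallelise. The sketch below, with hypothetical function names and a CPU thread pool standing in for GPU ALUs, shows a per-pixel operation that gives the same result whether the pixels are processed one by one or concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel, gain=2):
    """Per-pixel work: depends only on its own input, never on a neighbour."""
    return min(255, pixel * gain)


def process_serial(pixels):
    """Process the pixels one after another."""
    return [brighten(p) for p in pixels]


def process_parallel(pixels, workers=4):
    """Process the pixels concurrently; map() still returns them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten, pixels))


pixels = [0, 10, 100, 200]
serial_result = process_serial(pixels)
parallel_result = process_parallel(pixels)
```

Because no pixel reads another pixel's result, no locking or ordering is needed. This sketch shows only the independence property; Python threads will not deliver GPU-like speedups.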
General Purpose CPU
GPU
GPU Structure
GPU Structure
GPU vs CPU
GPGPU (GPU Computing)
• Applications
– Molecular dynamics
– Electromagnetic and acoustic waves
– Computer vision
– Computational statistics
– Computational finance
– Bioinformatics
CUDA
• CUDA - Compute Unified Device Architecture
• Changed the GPU architecture to better suit general-purpose programming
• CUDA is both a software and a hardware architecture
• Supports programming in C
• Replaced the separate pixel and vertex pipelines with a single unified pipeline
• Added SIMT (single instruction, multiple threads)
• Programmers still asked for more
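SIMT means that a group of threads (a warp) executes one instruction at a time in lockstep, each thread on its own data. When threads take different sides of a branch, both sides are issued and a mask decides which threads take part in each. The simulation below is a sketch of that idea only; `run_warp`, the instruction names in the trace, and the reduced warp size are assumptions for illustration (NVIDIA warps have 32 threads).

```python
WARP_SIZE = 4  # assumption for readability; NVIDIA warps have 32 threads


def run_warp(data):
    """Simulate one warp running: y = x * 2 if x is even else x + 1."""
    if len(data) != WARP_SIZE:
        raise ValueError("one value per thread in the warp is required")
    regs = list(data)
    trace = []

    # Instruction 1: every thread evaluates the branch condition.
    mask = [x % 2 == 0 for x in regs]
    trace.append(("test_even", [True] * WARP_SIZE))

    # Instruction 2: the 'if' side is issued; only masked-in threads execute.
    regs = [x * 2 if active else x for x, active in zip(regs, mask)]
    trace.append(("mul2", mask))

    # Instruction 3: the 'else' side is issued with the inverted mask.
    inverted = [not active for active in mask]
    regs = [x + 1 if active else x for x, active in zip(regs, inverted)]
    trace.append(("add1", inverted))

    return regs, trace


result, trace = run_warp([1, 2, 3, 4])
```

The trace shows that the warp spends three instruction slots even though each thread only needed two, which is why heavily divergent branches cost performance on SIMT hardware.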
Tessellation GPU
Manages the datasets of polygons (called vertex sets) that
represent objects in a scene and divides them into structures
suitable for rendering, usually triangles.
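Dividing a polygon into triangles can be sketched with the simplest scheme, a triangle fan. This is an assumption chosen for brevity: it only works for convex polygons, and hardware tessellation units subdivide surface patches in more elaborate ways. The name `fan_triangulate` is hypothetical.

```python
def fan_triangulate(polygon):
    """Split a convex polygon (vertices listed in order) into triangles
    that all share the first vertex."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three vertices")
    anchor = polygon[0]
    return [
        (anchor, polygon[i], polygon[i + 1])
        for i in range(1, len(polygon) - 1)
    ]


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangles = fan_triangulate(square)
```

A polygon with n vertices always yields n - 2 triangles under this scheme, so the square above becomes two triangles that share the diagonal from (0, 0) to (1, 1).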
Next Gen Code: “Fermi”
• Improved double precision performance
• ECC Support (error correction code)
• True Cache Hierarchy
• More Shared Memory
• Faster Context Switching
• Faster Atomic Operations
“Fermi” Architecture
• 16 SMs (streaming multiprocessors)
• Each SM contains 32 CUDA cores
• Totaling 512 CUDA cores
• Has six 64-bit memory partitions (a 384-bit memory interface)
• Supports up to 6 GB GDDR5 DRAM
• Connects to a host CPU via PCI Express
Programming Model
• Parallel functions run on the GPU
• Non-parallel code runs on the host CPU
• Parallel functions are organized into threads, thread
blocks, and arrays of thread blocks (grids)
• Each thread has its own private memory
• The threads in a block share that block's memory
• Arrays of blocks share per-application global memory
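The thread, block and grid hierarchy can be sketched by simulating a kernel launch on the CPU. Each simulated thread derives a global index from its block index, the block size and its thread index, which mirrors the usual `blockIdx.x * blockDim.x + threadIdx.x` pattern in CUDA C. The functions `launch`, `vector_add` and `add_on_device` are hypothetical stand-ins, and the loops run sequentially where a GPU would run the threads in parallel.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Call `kernel` once per simulated thread with CUDA-style indices."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    """Kernel body: each thread adds exactly one pair of elements."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < len(out):  # the last block may have spare threads
        out[i] = a[i] + b[i]


def add_on_device(a, b, block_dim=4):
    """Host side: size the grid, launch the kernel, return the result."""
    out = [0] * len(a)
    grid_dim = (len(a) + block_dim - 1) // block_dim  # round up
    launch(vector_add, grid_dim, block_dim, a, b, out)
    return out


total = add_on_device([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
```

With five elements and four threads per block, the host launches two blocks; the bounds check in the kernel keeps the three spare threads of the second block from writing past the end of the array.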
Threads and SMs
Development Tools
Programming environments
• NVIDIA CUDA C
• CUDA Fortran Compiler
• AMD Stream
• BrookGPU / Brook+
• RapidMind Platform
• OpenCL (Apple, Intel)
• DirectCompute – Supported by DirectX 10 and DirectX 11
• NVIDIA Nexus (later named Nsight)
OpenCL
• Open Computing Language (OpenCL) is a framework for
writing programs that execute across heterogeneous
platforms consisting of central processing units (CPUs),
graphics processing units (GPUs), digital signal processors
(DSPs), field-programmable gate arrays (FPGAs) and other
processors or hardware accelerators.
• OpenCL specifies programming languages (based on C99 and
C++11) for programming these devices and application
programming interfaces (APIs) to control the platform and
execute programs on the compute devices.
• OpenCL provides a standard interface for parallel computing
using task- and data-based parallelism.
Further Information
• Intel has now started pounding the marketing drums on something long
predicted: integration of Intel’s graphics onto the same die as its next
generation “Sandy Bridge” processor chip, due out in mid-2011.
• Probably not coincidentally, mid-2011 is when AMD’s Llano processor will
see daylight. It incorporates enough graphics-related processing to be an
apparently decent DX11 GPU, although the architecture hasn’t been disclosed
in detail.
• Animated movies (Pixar's Wall-E)
– $180 million
• DreamWorks Animation Shrek 3 (2007)
– 1000 Linux desktops, 3000 CPUs, 20 million render hours
• Blender - open-source rendering software
• Commercial 3D software
– Maya, V-Ray for Maya, 3ds Max, V-Ray for 3ds Max, C4D, V-Ray for C4D,
Lightwave, XSI, V-Ray Standalone, Modo, Cinema 4D and Brazil
• Commercial cloud-based render farms
– Many still use ‘traditional’ or non-GPU machines
– Extra Large (XL-CPU), 8 virtual cores (20 EC2 CUs), 7 GB RAM, 50 render
nodes, 200 hours = $9742.42 (beta users = $6843.03)
– Discount for >= 60 hours: 30 %
– $0.08 per core hour
• Build-your-own High Performance Computer (HPC)
– 2 GFlops for $500
• Anyone interested in becoming a GPU specialist programmer? Or a Computer
Graphics (CG) artist?
