
NUST COLLEGE

OF ELECTRICAL
AND MECHANICAL ENGINEERING
Computer Organization
Assignment – 2

GRAPHICS PROCESSING
UNIT

Submitted by:
Warda Ahmed
NS – 7800
D-CE-37
Syndicate B


Contents
Table of Figures
List of Acronyms
Abstract
Processors
Graphics Processing Unit (GPU)
I. What is a GPU?
II. Uses
III. GPU Manufacturers
IV. Evolution of GPUs
    1. Video Shifters (1970-1998)
    2. 1980's
    3. 1990s
    4. The First GPU (1999)
V. Features
    1. Memory Features
VI. GPU Architecture
    1. Graphics pipeline
    2. Evolution of the Graphics Pipeline
    3. CPU VERSUS GPU
VII. Types
    1. Dedicated graphics cards
    2. Integrated graphics solutions
    3. Hybrid solutions
    4. Stream Processing and General Purpose GPUs (GPGPU)
    5. External GPU (eGPU)
VIII. GPU Accelerated Computing
IX. Advantages
Conclusion
References


Table of Figures
Figure 1: GeForce 6600GT (NV43) GPU
Figure 2: Transistor count trends for GPU cores
Figure 3: Atari 2600 released in September 1977
Figure 4: Contrast between VGA and 3Dfx Graphics
Figure 4.5: GPU graphics pipeline 1
Figure 4.6: Fixed-Function rendering pipelines (FFP)
Figure 4.7: Programmable Shader 1
Figure 4.8: Non-unified vs. Unified Shader Architecture
Figure 5(a): GPU core vs. CPU core
Figure 5(b): CPU core vs. GPU core
Figure 6: GPU acceleration
Figure 7: Multi-GPU performance
Figure 8: Programmable shading: Soap bubble effect
Figure 9: Demonstration of realistic graphics with NVIDIA Geforce 8800 GTX

List of Acronyms
1. CPU – Central Processing Unit
2. GPU – Graphics Processing Unit
3. VPU – Visual Processing Unit
4. RAM – Random Access Memory
5. GIS – Geographic Information System
6. CAD – Computer Aided Design
7. AGP – Accelerated Graphics Port
8. PCI – Peripheral Component Interconnect
9. TIGA – Texas Instruments Graphics Architecture
10. PGA – Professional Graphics Controller
11. SGI – Silicon Graphics Inc.
12. API – Application Programming Interface
13. MDA/CDA – Monochrome and Color Display Adapter
14. iDCT – Inverse discrete cosine transform
15. iMDCT – Inverse modified discrete cosine transform
16. IQ – Inverse quantization
17. VLD – Variable-length decoding
18. IGP – Integrated graphics processors

Abstract
Processors are the central component of computers that carry out all the computer's work. One essential type of processor in the majority of modern computers and electronic devices today is the graphics processing unit. It accelerates computing and is a necessity for optimum computer performance. It is mainly used for graphics processing and 3D design. NVIDIA developed the first GPU. Leading GPU manufacturers are NVIDIA and AMD. A GPU has a lot more cores than a CPU. Modern GPU processors are parallel and fully programmable. GPU architecture shifted from a fixed graphics pipeline to a programmable pipeline and then to a unified shader model. The different types of GPU are dedicated, integrated and general purpose.

Processors
The word 'processor' is short for microprocessor or central processing unit (CPU). A processor is a small chip made of silicon and is a central component of computers and other electronic devices. It is the logic circuitry that responds to and processes the basic instructions that drive a computer. The processor in a personal computer or embedded in small devices is often called a microprocessor. The main job of a processor is to receive input through input devices, analyze and process the input commands, and produce an appropriate output. Modern processors can perform trillions of calculations per second.

Graphics Processing Unit (GPU)
I. What is a GPU?
Aside from the CPU, many laptops and desktop computers also include a GPU, which stands for 'Graphics Processing Unit'. Also called a visual processing unit (VPU), it is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display. It is a single-chip processor similar to the CPU. It is specifically designed for rendering graphics that are displayed as output on monitor screens.
A GPU is used for 3-D applications and functions like 3D motion. It creates lighting effects and transforms objects every time a 3D scene is redrawn. These mathematically intensive tasks would overburden the CPU. Lifting this load from the CPU frees up cycles that can then be used for other jobs. By using a CPU for system processing and a separate GPU for graphics processing, the CPU is not overloaded and the computer can run graphics-intensive applications more efficiently.
Modern GPUs are efficient at image processing and computer graphics. Their highly parallel structure is more effective than general-purpose CPUs for algorithms where the processing of big blocks of visual data is done in parallel.

Figure 1: GeForce 6600GT (NV43) GPU

Some terms that need to be explained to understand GPUs:
• Rendering: the process of generating an image from a model
• Vertex: the corner of a polygon (usually that polygon is a triangle)
• Pixel: smallest addressable screen element

II. Uses
GPUs are found in a wide range of systems, including embedded systems, cell phones, personal computers, workstations, game consoles and supercomputers. A GPU is placed on a video card in desktop computers and integrated into the motherboard of mobile devices. Most GPUs use their transistors for 3-D computer graphics. However, some have accelerated memory for mapping vertices, such as geographic information system (GIS) applications. Some of the more modern GPU technology supports programmable shaders implementing textures, mathematical vertices and accurate color formats. Applications such as computer-aided design (CAD) can process over 200 billion operations per second and deliver up to 17 million polygons per second. Many scientists and engineers use GPUs for more in-depth calculated studies utilizing vector and matrix features. [1]

III. GPU Manufacturers
Many companies have produced GPUs under a number of brand names. The most commonly used GPUs are developed by the companies AMD, NVIDIA, Intel, Via, Matrox Graphics, PowerVR and SiS. In 2009, Intel, Nvidia and AMD/ATI were the market share leaders, with 49.4%, 27.8% and 20.6% market share respectively. However, those numbers include Intel's integrated graphics solutions as GPUs. Not counting those numbers, Nvidia and ATI control nearly 100% of the market as of 2008. [2]

IV. Evolution of GPUs
The evolution of GPU hardware architecture has gone from a specific single core, fixed function hardware pipeline implementation made solely for graphics, to a set of highly parallel and programmable cores for more general purpose computation. The trend in GPU technology has added more programmability and parallelism to a GPU core architecture that is ever evolving towards a general purpose, more CPU-like core. Modern GPU processors are massively parallel and fully programmable. The parallel floating point computing power found in a modern GPU is a lot greater than that of a CPU. Future GPU generations will look more and more like wide-vector general purpose CPUs, and eventually both will be seamlessly combined as one. [3]

The transistor count trends for some GPUs are shown in the following figure:

Figure 2: Transistor count trends for GPU cores

The history of GPUs is as follows:

1. Video Shifters (1970-1998)
3D graphics started with early display controllers, known as video shifters and video address generators. They acted as a pass-through between the main processor and the display. The incoming data stream was converted into serial bitmapped video output such as luminance, color, as well as vertical and horizontal composite sync, which kept the line of pixels in a display generation and synchronized each successive line along with the blanking interval (the time between ending one scan line and starting the next). A flurry of designs arrived in the latter half of the 1970s, laying the foundation for 3D graphics as we know them.
Arcade system boards have been using specialized graphics chips since the 1970s. In early video game hardware the RAM for frame buffers was too expensive, so video chips composited data together as the display was being scanned out on the monitor. [4] Fujitsu's MB14241 video shifter was used to accelerate the drawing of sprite graphics for various 1970s arcade games from Taito and Midway, such as Gun Fight (1975), Sea Wolf (1976) and Space Invaders (1978). [5] The Namco Galaxian arcade system in 1979 used specialized graphics hardware supporting RGB color, multi-colored sprites and tilemap backgrounds. The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega and Taito.

RCA's "Pixie" video chip (CDP1861) in 1976 could output an NTSC compatible video signal at 62x128 resolution, or 64x32 for the RCA Studio II console. In 1978, Motorola unveiled the MC6845 video address generator. This became the basis for the IBM PC's Monochrome and Color Display Adapter (MDA/CDA) cards of 1981, and provided the same functionality for the Apple II. Motorola added the MC6847 video display generator later the same year, which made its way into a number of first generation personal computers, including the Tandy TRS-80.
In the home market, the Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor. [6] It can be seen in Figure 3.

Figure 3: Atari 2600 released in September 1977

The Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list": the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). [7] 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. [8] ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU. [9]

2. 1980's
In the early 1980's, "GPUs" were integrated frame buffers. They were boards of TTL logic chips that relied on the CPU, and could only draw wire-frame shapes to raster displays [10]. One of the very first 2D/3D video cards for the PC was the IBM Professional Graphics Controller (PGA). The PGA used an on-board Intel 8088 microprocessor to take over processing all video related tasks (such as drawing and coloring filled polygons), freeing the CPU from video processing. Though it was released in 1984, 10 years before hardware 2D/3D acceleration was standardized, its high cost and incompatibility with many programs and non-IBM systems made it unable to achieve mass-market success. The PGA's separate on-board processor marked an important step in GPU evolution to further the paradigm of using a separate processor for graphics computations [12].
The Williams Electronics arcade games Robotron: 2084, Joust, Sinistar, and Bubbles, all released in 1982, contain custom blitter chips for operating on 16-color bitmaps. [11]
In 1985, the Commodore Amiga featured a custom graphics chip, supporting line draw, area fill and a blitter unit which accelerated manipulation of bitmaps. In 1986, Texas Instruments released the TMS34010, the first microprocessor with on-chip graphics capabilities. It could run general-purpose code, but it had a very graphics-oriented instruction set. In 1990-1991, this chip would become the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards.
By 1987, more features were being added to early GPUs, such as shaded solids, vertex lighting, rasterization of filled polygons, a pixel depth buffer, and color blending. There was still much reliance on sharing computation with the CPU [10].
In the late 1980's, Silicon Graphics Inc. (SGI) emerged as a high performance computer graphics hardware and software company. With the introduction of OpenGL in 1992, SGI created and released the graphics industry's most widely used and supported, platform independent, 2D/3D application programming interface (API). OpenGL support has also become an integral part of the design of modern graphics hardware. SGI also pioneered the concept of the graphics pipeline early on [12].

3. 1990s
Launched in November 1996, 3Dfx's Voodoo graphics consisted of a 3D-only card that required a VGA cable pass-through from a separate 2D card to the Voodoo, which then connected to the display. It focused on raw power in fundamental 3D operations. 3dfx planned to build a high end 3D gaming board capable of delivering smooth gameplay at 640x480 resolution with bilinearly filtered textures. 3dfx cut the right corners of the pipeline, reducing gate count without much impact on image quality. Voodoo was easy to program and hard to slow down. Voodoo was used in arcade machines and, through Quantum's multichip boards, had professional promise as well. In March of 1996, 15 titles with Voodoo support debuted at E3 with wholly new levels of visual quality.
The difference in video quality can be seen in a demo of the popular game 'Quake', as shown in the following figure. The left side shows ordinary low resolution graphics, while the right side shows the improved resolution of the 3dfx on OpenGL.

Figure 4: Contrast between VGA and 3Dfx Graphics

3dfx's technology became the forerunner of many image quality enhancements seen today, like soft shadows and reflections, motion blur, as well as depth of field blurring.

4. The First GPU (1999)
The first company to develop a commercial GPU was NVIDIA Inc., in 1999. The GeForce 256 GPU was capable of billions of calculations per second, could process a minimum of 10 million polygons per second, and had over 22 million transistors, compared to the 9 million found on the Pentium III. [13] It was a single-chip processor with integrated transform, lighting effects, triangle setup/clipping and rendering engines, drawing and BitBLT support. Its workstation version, called the Quadro and designed for CAD applications, could process over 200 billion operations a second and deliver up to 17 million triangles per second. NVIDIA's rival company ATI Technologies came up with the name 'VPU' or visual processing unit when they released the Radeon 9700 in 2002.
Since their inception, GPUs have gradually become more powerful, programmable, and general purpose, with programmable geometry, vertex and pixel processors, the Unified Shader Model, an expanding instruction set, CUDA and OpenCL.
Nvidia Kepler: A graphical processing unit that holds the distinction of being the first GPU designed for the cloud. Graphics cards powered by Nvidia Kepler processors are tuned to efficiently serve virtualized desktops, providing auto-scaling to the necessary performance level. [14]
OpenCL is an open standard defined by the Khronos Group which allows for the development of code for both GPUs and CPUs with an emphasis on portability. OpenCL solutions are supported by Intel, AMD, Nvidia, and ARM, and according to a recent report by Evans Data, OpenCL is the GPGPU development platform most widely used by developers in both the US and Asia Pacific.
Fairly early on in the GPU market, there was a severe narrowing of competition. Early leading companies, when GPUs were a new concept, were Silicon Graphics Inc., 3dfx, NVIDIA, ATI and Matrox. Now only AMD and NVIDIA are GPU manufacturing giants.

V. Features
GPU features include
• 2-D or 3-D graphics
• Digital output to flat panel display monitors
• Texture mapping

• Application support for high-intensity graphics software such as AutoCAD
• Rendering polygons
• Support for YUV color space
• Hardware overlays
• MPEG decoding
• GPU accelerated video decoding: More recent graphics cards decode high-definition video on the card, offloading the central processing unit.

The video decoding processes that can be accelerated by today's modern GPU hardware are:
• Motion compensation (mocomp)
• Inverse discrete cosine transform (iDCT)
• Inverse telecine 3:2 and 2:2 pull-down correction
• Inverse modified discrete cosine transform (iMDCT)
• In-loop deblocking filter
• Intra-frame prediction
• Inverse quantization (IQ)
• Variable-length decoding (VLD), more commonly known as slice-level acceleration
• Spatial-temporal deinterlacing and automatic interlace/progressive source detection
• Bitstream processing (Context-adaptive variable-length coding/Context-adaptive binary arithmetic coding) and perfect pixel positioning.

1. Memory Features
The only two types of memory that actually reside on the GPU chip are register and shared memory. Local, Global, Constant, and Texture memory all reside off chip. Local, Constant, and Texture are all cached. While it would seem that the fastest memory is the best, the other two characteristics of the memory that dictate how that type of memory should be utilized are the scope and lifetime of the memory:
• Data stored in register memory is visible only to the thread that wrote it and lasts only for the lifetime of that thread.
• Local memory has the same scope rules as register memory, but performs slower.
• Data stored in shared memory is visible to all threads within that block and lasts for the duration of the block. This is invaluable because this type of memory allows for threads to communicate and share data between one another.

• Data stored in global memory is visible to all threads within the application (including the host), and lasts for the duration of the host allocation.
• Constant and texture memory are beneficial for only very specific types of applications. Constant memory is used for data that will not change over the course of a kernel execution and is read only. Using constant rather than global memory can reduce the required memory bandwidth; however, this performance gain can only be realized when a warp of threads reads the same location. Similar to constant memory, texture memory is another variety of read-only memory on the device. When all reads in a warp are physically adjacent, using texture memory can reduce memory traffic and increase performance compared to global memory.

GPU clock or Engine clock is the graphics processor unit's clock speed, measured in megahertz (MHz).

VI. GPU Architecture
A GPU is a heterogeneous chip multi-processor (highly tuned for graphics). New applications demand parallel processing and new computing devices are power constrained. GPUs are therefore designed for high parallelism and lower power consumption.

1. Graphics pipeline
In 3D computer graphics, the graphics pipeline or rendering pipeline refers to the sequence of steps used to create a 2D raster representation of a 3D scene.
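To make "a sequence of steps that turns a 3D scene into a 2D raster" more concrete, the following is a minimal software sketch of such a pipeline in Python. It is an illustration only, not how GPU hardware is implemented: the 8x8 frame buffer, the function names (project, edge, draw_triangle, render) and the flat per-triangle colour are all assumptions made for this example. It covers vertex projection, triangle setup, rasterization, depth interpolation and a Z-buffer test.

```python
# Illustrative software model of pipeline stages; names and sizes are assumptions.
WIDTH, HEIGHT = 8, 8          # tiny frame buffer, chosen only for the example
BACKGROUND = "."


def project(vertex, focal=1.0):
    """Vertex processing: map a camera-space (x, y, z) vertex to screen space."""
    x, y, z = vertex           # assumes z > 0 (vertex in front of the camera)
    ndc_x, ndc_y = focal * x / z, focal * y / z      # perspective divide
    screen_x = (ndc_x + 1.0) * 0.5 * WIDTH
    screen_y = (1.0 - ndc_y) * 0.5 * HEIGHT          # screen y grows downwards
    return screen_x, screen_y, z


def edge(a, b, p):
    """Signed area term: tells which side of the edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def draw_triangle(triangle, color, frame, depth):
    a, b, c = (project(v) for v in triangle)          # triangle setup
    area = edge(a, b, c)
    if area == 0:
        return                                        # degenerate triangle
    for py in range(HEIGHT):                          # rasterization
        for px in range(WIDTH):
            p = (px + 0.5, py + 0.5)                  # sample at the pixel centre
            w0 = edge(b, c, p) / area                 # barycentric weights
            w1 = edge(c, a, p) / area
            w2 = edge(a, b, p) / area
            if min(w0, w1, w2) < 0:
                continue                              # pixel centre is outside
            # Parameter interpolation (simple screen-space blend of depth).
            z = w0 * a[2] + w1 * b[2] + w2 * c[2]
            if z < depth[py][px]:                     # occlusion test (Z-buffer)
                depth[py][px] = z
                frame[py][px] = color                 # stand-in for a pixel shader


def render(triangles):
    """Run every (triangle, color) pair through the stages above."""
    frame = [[BACKGROUND] * WIDTH for _ in range(HEIGHT)]
    depth = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    for triangle, color in triangles:
        draw_triangle(triangle, color, frame, depth)
    return frame


# Two overlapping triangles: "B" is closer to the camera than "A".
far = (((-4, -4, 2), (4, -4, 2), (0, 4, 2)), "A")
near = (((-2, -2, 1), (2, -2, 1), (0, 2, 1)), "B")
for row in render([near, far]):
    print("".join(row))
```

Because of the depth test, the nearer triangle wins wherever the two overlap regardless of the order in which they are submitted. On a real GPU each of these loops is spread across many hardware units rather than run one pixel at a time.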

Figure 4.5: GPU graphics pipeline 1

The various stages in the typical pipeline of a modern GPU (also seen in figure 4.5):

Bus interface/Front End: Interface to the system to send and receive data and commands.
Vertex Processing: Converts each vertex into a 2D screen position, and lighting may be applied to determine its color. A programmable vertex shader enables the application to perform custom transformations for effects such as warping or deformations of a shape.
Clipping: This removes the parts of the image that are not visible in the 2D screen view such as the backsides of objects or areas that the application or window system covers.
Primitive Assembly, Triangle Setup: Vertices are collected and converted into triangles. Information is generated that will allow later stages to accurately generate the attributes of every pixel associated with the triangle.
Rasterization: The triangles are filled with pixels known as "fragments," which may or may not wind up in the frame buffer if there is no change to that pixel or if it winds up being hidden.
Occlusion Culling: Removes pixels that are hidden (occluded) by other objects in the scene.
Parameter Interpolation: The values for each pixel that were rasterized are computed, based on color, fog, texture, etc.
Pixel Shader: This stage adds textures and final colors to the fragments. Also called a "fragment shader," a programmable pixel shader enables the application to combine a pixel's attributes, such as color, depth and position on screen, with textures in a user-defined way to generate custom shading effects.
Pixel Engines: Mathematically combine the final fragment color, its coverage and degree of transparency with the existing data stored at the associated 2D location in the frame buffer to produce the final color for the pixel to be stored at that location. Output is a depth (Z) value for the pixel.
Frame Buffer Controller: The frame buffer controller interfaces to the physical memory used to hold the actual pixel values displayed on screen. The frame buffer memory is also often used to store graphics commands, textures as well as other attributes associated with each pixel.

2. Evolution of the GPU Architecture
Until recently, the process of generating computer graphics was referred to as the graphics pipeline. But that just wasn't cutting it for sophisticated effects like water and smoke. Over time, the process has been taken over by more flexible shaders. The graphic rendering mechanisms shifted from fixed graphic pipelines to programmable graphic pipelines, to unified shader models, and now use universal shaders able to perform any of these tasks.

1. FIXED GRAPHICS PIPELINE (Fixed-Function rendering pipelines (FFPs))

Fixed-function meant that the developer could not configure the functions the FFPs performed. Parameters like the colours of objects, etc. could be changed, but the functions themselves remained. The game logic, textures and triangles were sent to the GPU, which would take care of all the heavy processing. It had new features like multiple blending modes, per-vertex Gouraud shading, fog effects, stencil buffers (for shadow volumes), etc. The processing is visualized step by step in the following figure. Each step is explained in the previous topic (Graphics pipeline).

Figure 4.6: Fixed-Function rendering pipelines (FFP)

PROS:
• The hardware was wired and narrowly specialized to perform standard operations on data.
• This made it much faster than the processor performing the same tasks.

CONS:

• FFP was limited by the number of functions it could perform.
• There was no variation or flexibility and thus no realistic graphics could be visualized. For example, transparent objects like water or smoke tended to look solid, or flicker in and out. To counter this opacity, the opaque surface was animated to flicker.
• It was impossible to go back in the stages of the pipeline to make changes as required.
• If the graphics pipeline's hardware wasn't matched perfectly to the processing needs of the task, some of it sat idle. And since the images that need to be displayed are very different, the match was never perfect.

2. PROGRAMMABLE SHADERS (Separated Shader Architecture)
In order to provide more sophisticated graphics to users, manufacturers started making the fixed function hardware at each stage of the pipeline more flexible. Some of these stages became known as shaders, and they eventually became flexible enough to overcome most of the difficulties caused by a linear pipeline. The flow of process of programmable shaders can be seen in the following diagram.

Figure 4.7: Programmable Shader 1

PROBLEMS:
• While one problem of the FFP was fixed, the other problem remained (i.e. part of the pipeline doing nothing and sitting idle).

The shaders were of three types. Vertex shaders would construct the 3D model and light the vectors making it up.

Geometry shaders would make the lines into surfaces. Pixel shaders would apply the textures and other effects. But one shader could only do one type of task while the other two shaders were idle.

3. UNIFIED SHADERS
The specialized logic like vertex shaders, pixel shaders and hardwired algorithms was replaced with many copies of one unified CPU design. Shaders are now made so that they are no longer confined to a certain task. There are no more vertex, geometry, and pixel shaders: just shaders. A unified shader can do any of the three kinds of work, so it can do whatever needs doing instead of waiting for work it can do to come in.
Figure 4.8 shows a non-unified architecture versus a unified shader architecture. The advantage of the unified approach is that one can have several shader cores and use them for any type of shader (I–IV in this example). IB and OB are input and output buffers.

Figure 4.8: Non-unified vs. Unified Shader Architecture

Unified Shader Architecture allows more flexible use of the graphics rendering hardware [15]. This gives better load balancing. For example, in a situation with a heavy geometry workload the system could allocate most computing units to run vertex and geometry shaders. In cases with less vertex workload and heavy pixel load, more computing units could be allocated to run pixel shaders. [16]
The unified shading architecture was introduced with the Nvidia GeForce 8 series, ATI Radeon HD 2000, S3 Chrome 400, Intel GMA X3000 series, Xbox 360's GPU, Qualcomm Adreno 200 series and PowerVR SGX GPUs, and is used in all subsequent series.
Most graphics hardware currently uses DirectX to communicate with the applications being run. It is an Application Programming Interface, or API, that programmers use to get their software to use hardware effectively. Microsoft tweaked it over time. In an uncommon case of hardware and software changing at the same time to benefit from the changes in the other, ATI and Nvidia both started making GPUs with unified shaders, and DirectX 10 implemented a unified shader instruction set. That means that software for different kinds of shaders could be written in a more similar manner, making the programmer's job easier. OpenGL 3.3 (which offers a unified shader model) can still be implemented on hardware that does not have unified shading architecture.

3. CPU VERSUS GPU
A simple way to understand the difference between a CPU and GPU is to compare how they process tasks. A CPU consists of a few cores optimized for sequential serial processing while a GPU has a massively parallel architecture consisting of thousands of smaller, more efficient cores designed for handling multiple tasks simultaneously. Figure 5 shows the difference between their cores.

Figure 5(a): GPU core vs. CPU core

Figure 5(b): CPU core vs. GPU core

The number of cores that GPUs have depends on the manufacturer. nVidia graphics solutions tend to pack more power into fewer chips, while AMD solutions pack in more cores to increase processing power. Typical high-end graphics cards have 68 cores if it's nVidia, and ~1500 cores if it's AMD.
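The contrast between sequential and massively parallel processing in this section can be sketched in a few lines of Python. This is only an illustration of the programming model, under stated assumptions: the per-pixel brighten function, the worker count and the use of a thread pool as a stand-in for GPU cores are all choices made for the example. Python threads do not give a real GPU-style speedup here. What the sketch shows is that each output element depends only on its own input, so the work can be handed to many workers at once.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel, amount=40):
    """Per-element 'kernel': the result depends only on this one pixel."""
    return min(255, pixel + amount)


def cpu_style(pixels):
    """Sequential processing: one core walks the data element by element."""
    result = []
    for pixel in pixels:
        result.append(brighten(pixel))
    return result


def gpu_style(pixels, workers=8):
    """Data-parallel processing: the same kernel is applied to every element,
    with the elements shared out among many workers (stand-ins for GPU cores)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten, pixels))


image = [0, 100, 200, 250, 255] * 4
print(cpu_style(image) == gpu_style(image))
```

Both functions produce the same output; the difference is that the second one never relies on the order in which pixels are processed, which is the property that lets graphics workloads scale across many cores.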

VII. Types
GPUs come in different shapes and forms, ranging from dedicated cards that plug into a desktop's PCI Express slot to integrated graphics chips, which are built directly into the motherboard, the backbone component of the system.

1. Dedicated graphics cards
The GPUs of the most powerful class typically interface with the motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP) and can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot is not available.

A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.

Technologies such as SLI by Nvidia and CrossFire by AMD allow multiple GPUs to draw images simultaneously for a single screen, increasing the processing power available for graphics.

2. Integrated graphics solutions
Integrated graphics solutions, shared graphics solutions, or integrated graphics processors (IGPs) utilize a portion of a computer's system RAM rather than dedicated graphics memory. IGPs can be integrated onto the motherboard as part of the chipset, or within the same die as the CPU (like AMD APU or Intel HD Graphics). On certain motherboards, AMD's IGPs can use dedicated sideport memory. This is a separate fixed block of high-performance memory that is dedicated for use by the GPU. In early 2007, computers with integrated graphics accounted for about 90% of all PC shipments. These solutions are less costly to implement than dedicated graphics solutions, but tend to be less capable.

Historically, integrated solutions were often considered unfit to play 3D games or run graphically intensive programs, but could run less intensive programs such as Adobe Flash. Examples of such IGPs would be offerings from SiS and VIA circa 2004 [17]. However, modern integrated graphics processors such as AMD Accelerated Processing Unit and Intel HD Graphics are more than capable of handling 2D graphics or low-stress 3D graphics.

As a GPU is extremely memory intensive, an integrated solution may find itself competing with the CPU for the already relatively slow system RAM, as it has minimal or no dedicated video memory. IGPs can have up to 29.856 GB/s of memory bandwidth from system RAM, whereas graphics cards can enjoy up to 264 GB/s of bandwidth between their RAM and the GPU core.
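The two peak figures quoted above can be reproduced with the usual peak-bandwidth formula: bytes per transfer, times transfers per second, times the number of channels. The text does not say which memory configurations the figures come from, so the ones used here (dual-channel 64-bit memory at 1866 million transfers per second, and a 384-bit graphics memory bus at 5.5 billion transfers per second) are assumptions that happen to match the numbers. The function name `peak_bandwidth_gb_s` is likewise only illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second, channels=1):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second * channels / 1e9


# Assumed configurations (not stated in the text) that reproduce its figures.
system_ram = peak_bandwidth_gb_s(64, 1866e6, channels=2)   # shared system RAM
graphics_ram = peak_bandwidth_gb_s(384, 5.5e9)             # dedicated card

if __name__ == "__main__":
    print(f"Integrated (system RAM): {system_ram:.3f} GB/s")
    print(f"Dedicated card:          {graphics_ram:.1f} GB/s")
    print(f"Ratio:                   {graphics_ram / system_ram:.2f}x")
```

With these assumptions the dedicated card has roughly 8.8 times the peak bandwidth, and the integrated GPU must additionally share its smaller figure with the CPU.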

The path between a GPU and its memory is referred to as the memory bus, and its bandwidth can be performance limiting. Older integrated graphics chipsets lacked hardware transform and lighting, but newer ones include it [18].

3. Hybrid solutions
This newer class of GPUs competes with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and Nvidia's TurboCache. Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. These share memory with the system and have a small dedicated memory cache to make up for the high latency of the system RAM. Technologies within PCI Express can make this possible. While these solutions are sometimes advertised as having as much as 768 MB of RAM, this figure refers to how much memory can be shared with the system.

4. External GPU (eGPU)
An external GPU is a graphics processor located outside of the housing of the computer. External graphics processors are often used with laptop computers. Laptops might have a substantial amount of RAM and a sufficiently powerful central processing unit (CPU), but often lack a powerful graphics processor (and instead have a less powerful but more energy-efficient on-board graphics chip). On-board graphics chips are often not powerful enough for playing the latest games, or for other graphically intensive tasks.

5. Stream Processing and General Purpose GPUs (GPGPU)
It is becoming increasingly common to use a general purpose graphics processing unit (GPGPU) as a modified form of stream processor. This concept turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power, as opposed to being hard wired solely to do graphical operations. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete GPU designers, ATI and Nvidia, are beginning to pursue this approach with an array of applications. GPGPU can be used for many types of parallel tasks, including ray tracing. These tasks are generally high-throughput computations that exhibit data parallelism and so can exploit the wide vector width SIMD architecture of the GPU.
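The data-parallel style that suits a GPGPU can be sketched in plain Python. The idea is to write a small "kernel" that computes one output element, then run that same kernel for every index. The names `saxpy_kernel` and `launch` are hypothetical helpers invented for this sketch, and the loop below runs the threads one after another, whereas a real GPU would run many of them at the same time.

```python
def saxpy_kernel(i, a, x, y, out):
    """Work of ONE thread: a single element of out = a * x + y."""
    if i < len(x):              # extra threads beyond the data do nothing
        out[i] = a * x[i] + y[i]


def launch(kernel, num_threads, *args):
    """Sequential stand-in for a GPU launch (hypothetical helper).

    Every 'thread' runs the same kernel on a different index. No thread
    depends on another, so a GPU is free to run them in parallel.
    """
    for thread_index in range(num_threads):
        kernel(thread_index, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Launch more threads than elements, as often happens when thread counts
# are rounded up to a convenient block size.
launch(saxpy_kernel, 8, 2.0, x, y, out)

if __name__ == "__main__":
    print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Because each element is computed independently from the same instructions, this pattern maps naturally onto SIMD hardware.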

VIII. GPU Accelerated Computing
GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, analytics, engineering, consumer, and enterprise applications. Pioneered in 2007 by NVIDIA, GPU accelerators now power energy-efficient datacenters in government labs, universities, enterprises, and small-and-medium businesses around the world. GPUs are accelerating applications in platforms ranging from cars, to mobile phones and tablets, to drones and robots.

GPU-accelerated computing offers unprecedented application performance by offloading compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run significantly faster. This basic process can be seen in figure 6:

Figure 6: GPU acceleration
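How much faster an application runs depends on how large the offloaded portion is. A common estimate for this is Amdahl's law, which is not mentioned in the text above and is used here only as an illustration. The function name `overall_speedup` and the example numbers are hypothetical, and the sketch ignores the cost of copying data between CPU and GPU memory.

```python
def overall_speedup(offloaded_fraction, gpu_speedup):
    """Amdahl's law estimate of whole-application speedup.

    offloaded_fraction: share of the original run time moved to the GPU (0..1)
    gpu_speedup: how many times faster that share runs on the GPU
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    cpu_part = 1.0 - offloaded_fraction          # still runs on the CPU
    gpu_part = offloaded_fraction / gpu_speedup  # accelerated portion
    return 1.0 / (cpu_part + gpu_part)


if __name__ == "__main__":
    # Hypothetical example: the GPU runs the offloaded code 20x faster.
    for fraction in (0.5, 0.9, 0.99):
        print(f"{fraction:.0%} offloaded -> "
              f"{overall_speedup(fraction, 20):.2f}x overall")
```

The part that stays on the CPU limits the total gain: under this model, even an arbitrarily fast GPU cannot give more than a 2x speedup when only half of the run time is offloaded.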

IX. Advantages
A multi-GPU system provides more than just performance gains. It also gives you the freedom to run your applications with full features and effects enabled. Figure 7 shows how application performance increases with increased GPU usage.

Figure 7: Multi-GPU performance

• 2-D to 3-D graphics revolution.
• The introduction of programmable shading in 2001 led to several visual effects not previously possible, such as the simulation of refractive chromatic dispersion for a "soap bubble" effect in figure 8:

Figure 8: Programmable shading: Soap bubble effect

Modern GPUs can use programmable shading to achieve near-cinematic realism, as figure 9 shows, featuring actress Adrianne Curry on an NVIDIA GeForce 8800 GTX.

Figure 9: Demonstration of realistic graphics with NVIDIA GeForce 8800 GTX

• Realistic life-like graphics, improving with each new GPU architecture.
• The GPU decreases the load on the CPU. It allows offloading of large, work-intensive computations, generally relevant to computer graphics processing, to another processor. This frees the main CPU to focus on other non-offloadable transactions. It also consumes less power.
• Furthermore, GPU-based high performance computers are starting to play a significant role in large-scale modelling. Three of the 10 most powerful supercomputers in the world take advantage of GPU acceleration.

Conclusion
GPUs became more popular as the demand for graphics applications increased. Eventually, they became not just an enhancement but a necessity for optimum performance of a PC. Specialized logic chips now allow fast graphics and video implementations. Generally, a dedicated GPU is connected to the CPU and is separate from the motherboard.

System random access memory (RAM) is reached through the accelerated graphics port (AGP) or the Peripheral Component Interconnect Express (PCI Express) bus. Some GPUs are integrated into the northbridge on the motherboard and use the main memory as a digital storage area, but these GPUs are slower. GPUs are a central component of today's devices; without them it would be impossible to perform graphically intensive tasks like video encoding, decoding, graphic editing, gaming, etc.

References
[1] techopedia.
[2] "GPU sales strong as AMD gains market share," techreport.com.
[3] C. McClanahan, "History and Evolution of GPU Architecture," Georgia Tech.
[4] G. Singer, "The History of the modern graphics processor".
[5] "Arcade/SpaceInvaders," Computer Archeology.
[6] A. Springmann, "Atari 2600 Teardown: What's Inside Your Old Console?," The Washington Post.
[7] "What are the 6502, ANTIC, CTIA/GTIA, POKEY, and FREDDIE chips?," Atari8.com.
[8] K. E. Wiegers, "Atari Display List Interrupts," COMPUTE! (47): 161, April 1984.
[9] K. E. Wiegers, "Atari Fine Scrolling," COMPUTE! (67): 110, December 1985.
[10] I. Buck, "The Evolution of GPUs for General Purpose Computing," GTC 2010.
[11] S. Riddle, "Blitter Information".
[12] T. Crow, "Evolution of the Graphical Processing Unit," 2004.
[13] V. Beal, "GPU - Graphics Processing Unit," webopedia.
[14] "Computer Systems Architecture," Lecture 23, Graphics Processing Unit.
[15] "GeForce 8800 GTX: 3D Architecture Overview," ExtremeTech.
[16] J. F. Amprimoz, "Graphics Processor Evolution: Pipeline to Unified Shader Architecture," CPU, Graphics and Memory, 2009.
[17] T. Tscheblockov, "Xbit Labs: Roundup of 7 Contemporary Integrated Graphics Chipsets for Socket 478 and Socket A Platforms".
[18] B. Sanford, "Integrated Graphics Solutions for Graphics-Intensive Applications".
