
Embedded

Power Of GPUs Today


Though CPUs once had four times as many transistors as GPUs, GPUs today have moved far ahead in processing capability, bringing the supercomputing experience to the desktop. They can handle enormous data-processing loads, from standard definition to high definition and from 2D to 3D.

Shweta Dhadiwal Baid

Graphics processing units (GPUs) started getting into desktop computers for graphics enhancement a decade ago. Today, with their immense parallel computing capability, GPUs have made their way even to supercomputers. Nebulae, a supercomputer built on a cluster of 4640 blade servers, ranks as the second most powerful computer in the world in the June 2010 list of the fastest supercomputers by TOP500, a project that ranks the most powerful computers in the world. It uses NVIDIA Tesla series general-purpose GPUs, exploiting their massively parallel capabilities to provide a peak capability of 3 petaflops, the highest ever on the TOP500.

Video and multimedia are today centre stage for every user. Narendra Bhandari, director, Intel Software and Services Group, says, "The average mainstream user is expecting a relatively rich visual experience in almost every platform. From a low-end core to a high-end system, the expectation of the user in terms of video quality or just user experience is always on a higher side." GPUs have enabled the best user experience from high definition (HD) to three dimensions (3D), from the living room to personal computing.

"The key driving factors for GPUs are the three Cs: consumer, computing and communication," says Nishant Goyal, head of sales, South Asia for NVIDIA. The evolution of consumers' lifestyle, with increasing usage of technology products such as set-top boxes, LCD TVs and digital cameras, has forced the industry to provide easy and efficient solutions.

Fusion of CPU and GPU has been successful in delivering supercomputing capability. Such systems are being used in applications like scientific image processing, spatial exploration, oil exploration, 3D reconstruction and even stock options price determination.

GPUs as a green supercomputer


In 2006, while delivering a keynote at the International Conference on Computer-Aided Design (ICCAD), AMD's ex-chief technology officer Phil Hester stated that it was important to examine the capabilities of GPUs for round two of the "killer micros." GPUs have moved from simple wire-frame rendering in the 1980s to 32-bit parallel-processing engines. Hester predicted that the combination of CPU and GPU functions in heterogeneous cores would boost supercomputers' performance, which is now a reality. GPUs today are highly parallel general-purpose single-instruction multiple-data (SIMD) processor arrays. These not only boost the computing
www.efymag.com | January 2011 | Electronics For You

and physical effects, you crave at least one high-end discrete GPU. But if you want to create a home media centre that is quiet and power-efficient, an integrated GPU may be the best for you.

There are solutions where you can include the best of both worlds (discrete and integrated). Goyal adds, "A system having both integrated and discrete GPUs switches intelligently between the two depending on what application you are running. So you can enjoy an unprecedented combination of power efficiency and performance."

From a designer's perspective, "Discrete GPUs are good for highly accelerated software, where a certain processing block can be completely offloaded to the GPU. In the case of software needing continuous data transfer between a host and the GPU, an integrated GPU would perform better than a discrete GPU, as there is a lot of data movement across GPU and CPU. So it really depends on the software: how it is designed and whether it will perform better on integrated GPUs than on discrete GPUs," shares Kapil Agrawal, designer and CEO, Mediamagic Technologies. Bhandari adds, "You will see almost five times improvement in data transfer when the GPU is closer to the CPU, as in the case of an integrated platform."

Whatever the needs of end users, it's clear that the right GPU, whether it's discrete or integrated, is the key to a great computing experience. As almost every application becomes graphically oriented, it's the GPU in your system which will ensure that you get the best bang for your buck in a device.
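The trade-off Agrawal describes, between raw GPU speed and the cost of moving data, can be illustrated with a toy cost model. The sketch below is only an illustration: the function names and every number in it are hypothetical, except the five-fold transfer advantage, which follows Bhandari's estimate. It assumes total time is simply transfer time plus compute time, and that the integrated GPU is four times slower at computing than the discrete one.

```python
# Toy model: time to offload a task = time to move the data + time to compute.
# Every constant here is a hypothetical value chosen for illustration only.

GPU_COMPUTE_S = 0.2              # assumed compute time on a discrete GPU
INTEGRATED_SLOWDOWN = 4          # assumed: integrated GPU computes 4x slower
DISCRETE_LINK_MB_S = 1000.0      # assumed host-to-discrete-GPU transfer rate
INTEGRATED_LINK_MB_S = 5 * DISCRETE_LINK_MB_S  # "almost five times", per Bhandari


def offload_time(compute_s: float, data_mb: float, link_mb_per_s: float) -> float:
    """Return seconds needed to move data_mb over the link and then compute."""
    return data_mb / link_mb_per_s + compute_s


def compare(data_mb: float) -> tuple[float, float]:
    """Return (discrete_seconds, integrated_seconds) for one offloaded task."""
    discrete = offload_time(GPU_COMPUTE_S, data_mb, DISCRETE_LINK_MB_S)
    integrated = offload_time(
        GPU_COMPUTE_S * INTEGRATED_SLOWDOWN, data_mb, INTEGRATED_LINK_MB_S
    )
    return discrete, integrated


if __name__ == "__main__":
    for data_mb in (100.0, 4000.0):
        discrete, integrated = compare(data_mb)
        winner = "discrete" if discrete < integrated else "integrated"
        print(f"{data_mb:7.0f} MB: discrete {discrete:.2f} s, "
              f"integrated {integrated:.2f} s -> {winner}")
```

Under these assumed figures the discrete GPU wins when little data has to move, and the integrated GPU wins once transfers dominate, which is the pattern the designers quoted above describe. Real systems overlap transfers with computation, so actual crossover points will differ.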

Nebulae supercomputer combines GPUs with CPUs

capabilities of CPUs but also greatly reduce power consumption. Eight of the world's greenest supercomputers combine specialised accelerators like GPUs with CPUs to boost performance and power efficiency, according to the Green500 list, which is released twice a year. Tianhe-1A, the latest supercomputer, revealed in October 2010, pairs a large number of GPUs with multi-core CPUs to significantly boost its performance and reduce power consumption and size. In one of the releases, Guangming Liu, chief of the National Supercomputer Center in Tianjin, said, "The performance and efficiency of Tianhe-1A was simply not possible without GPUs." Other supercomputers combining GPUs with CPUs include the Dawning Nebulae and the Mole-8.5.

at tackling large amounts of similar data because the problem can be split into hundreds or thousands of pieces and calculated simultaneously. As sequential or serial processors, CPUs are not designed for this type of computation, but they are more adept at serial-based tasks such as running operating systems and organising data. Goyal shares, "We believe that the concept of hybrid computing or co-processing, where GPUs and CPUs work together, will apply the most relevant processor to the specific task in hand, creating synergies."
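The idea of splitting a problem into many independent pieces can be sketched in a few lines. The example below is an illustration only, not GPU code: it uses Python's standard thread pool to stand in for the many processing elements of a GPU, and the names `brighten`, `split` and `parallel_brighten` are hypothetical. Python threads will not deliver a real speed-up here; the point is the decomposition, where the same small operation is applied to every element and the chunks never depend on one another.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel: int, amount: int = 40) -> int:
    """The per-element operation: the same rule is applied to every pixel."""
    return min(255, pixel + amount)


def split(data: list[int], pieces: int) -> list[list[int]]:
    """Cut data into contiguous chunks of roughly equal size."""
    size = -(-len(data) // pieces)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk: list[int]) -> list[int]:
    """Handle one chunk; it needs nothing from any other chunk."""
    return [brighten(p) for p in chunk]


def parallel_brighten(pixels: list[int], pieces: int = 4) -> list[int]:
    """Process all chunks concurrently, then stitch results back in order."""
    if not pixels:
        return []
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # map() preserves input order, so the output lines up with the input.
        results = list(pool.map(process_chunk, split(pixels, pieces)))
    return [p for chunk in results for p in chunk]


if __name__ == "__main__":
    print(parallel_brighten([0, 100, 200, 250]))
```

A serial task, such as one where each step depends on the previous result, cannot be cut up this way, which is why the article assigns such work to the CPU.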

Discrete vs integrated GPUs


As with any tool, it's very important to have the right GPU for the right task. Graphics processors primarily exist in two form factors: discrete and integrated. Bhandari explains, "An integrated graphics processor means that when you buy a processor core system, the graphics component is part of the processor and you don't need to buy an additional piece of silicon or additional core to run the 3D graphics or high-end video. But, when you buy an additional core dedicated to graphics enhancement, it's called a discrete solution." Explaining further, Goyal says that if you are a gamer who demands the latest, greatest bleeding-edge entertainment experience with advanced features like stereoscopic 3D Vision, HD resolutions

GPGPUs revolutionise co-processing


Co-processing with digital signal processor (DSP) blocks and floating-point units (FPUs) has enhanced the performance of CPUs. However, experts believe that a GPU co-processor will mark a revolutionary step in the world of co-processing. GPUs have evolved into general-purpose graphics processing units (GPGPUs) mainly for four reasons: massive parallelism, power, ubiquity, and usability due to new tools and ease of development. As parallel processors, GPUs excel

Graphics on CPU silicon


"Earlier, integrated GPUs meant GPU and CPU as separate pieces of silicon on the same motherboard. With technological advancements, integrated GPUs have started including graphics on the same die as the processor," shares Bhandari. "Bringing graphics and processor core onto the same die has a lot of advantages," explains Bhandari. There

Graphics on Linux
Kapil Agrawal, designer and CEO, Mediamagic Technologies, shares, "Earlier, it was very difficult to utilise all the resources of a GPU inside Linux, as there were neither proper APIs to offload the hardware acceleration needed by video encoding/decoding, nor enough tools from GPU companies, as they always saw non-Linux as the biggest market. But, with many users and companies moving towards Linux, tools are now available to do software development on GPUs in Linux. Also, GPU companies have released APIs that can now directly talk with the GPU for hardware acceleration, like VDPAU from NVIDIA, VA-API from Intel and XvBA from ATI."

is a significant improvement in data transfer speed. Most of the products in this category have a significant cache size on the processor. Now a graphics processor can take advantage of this high-speed cache available right next to it. The input/output traffic flow improves quite a bit, as the traffic moves from the motherboard to the die itself. Bhandari explains, "It's analogous to the situation where you are running a factory. You are doing the computing there, taking the assembled goods, putting them in a truck, sending them to a separate graphics factory, getting the processing done and bringing the goods back. Now you are doing it inside the factory, right next to your assembly line. So the transfer time for the data back from the memory is greatly reduced."

Power management is another important advantage. For high-performance tasks like music ripping or video compression-decompression, the hardware works hard for a short duration of time and maintains its low-power state for the rest of the time.

for the rest of the computer: CPU and memory controller. The Fusion technology promises performance on a par with some of the discrete graphics chips. However, it cannot be compared with, or serve as an alternative to, high-end discrete graphics chips for serious gamers.
One major advantage of the Fusion approach in APUs is its minimal access times and maximum bandwidth. The introduction of APUs does not mean a replacement of the CPU in the design. Vamsi Krishna, manager-product marketing APAC, AMD, explains, "CPUs will still be the driving brain. Without the classic CPUs you cannot have the millions of applications running on your product. In the performance desktop space, discrete graphics chips will still lead the graphics computing power." He says, "The bulk of general-purpose computing will move towards APUs, giving you the advantage of integration, price and real estate."

become central to almost every aspect of computing and this is expected to be a long-term trend. Goyal informs, "Stereoscopic 3D is a huge trend right now. Within less than a year, 3D has gathered incredible global momentum. The whole industry ecosystem has very much embraced the technology. 3D-capable panels from LG, Samsung, Viewsonic, Alienware and Acer, and 3D Vision enabled desktop and notebook PCs from major OEMs are evidence of this. The advent of 3D Blu-ray and YouTube confirms that 3D is much more than a gimmick; it's here to stay."

Another noteworthy trend is seen in professional graphics. There has been an inflection point with the emergence of the computational visualisation era. For high-precision, data-sensitive applications, professional graphics solutions with error-correcting code memory and fast double-precision capabilities ensure the accuracy and fidelity of your results. These not only serve as a graphics processor but also drive an entire visual supercomputing platform, incorporating hardware and software that enable advanced capabilities such as stereoscopic 3D, scalable visualisation and 3D high-definition broadcasting.

Multi-display solutions. Buying a large-screen panel may be very expensive, but it's now easy to stretch the video across three or six smaller screens without compromising on the quality of video.
Krishna adds, "The Eyefinity feature enables you to view the video split across smaller screens joined together to form a large screen, so that you can enjoy high-end gaming. The way you have USB port hubs, you now get DisplayPort hubs. With one hub, you can take a single DisplayPort connector and expand it to up to four monitors. This will give an altogether different gameplay experience. Both gamers and productivity users can take advantage of this, especially if they have to work on multiple monitors."

Video stabilisation. Krishna shares

Fusion GPUs accelerate processing


While Intel tries to put the CPU and GPU on the same socket for a low-level integration, the accelerated processing unit (APU) from AMD provides a higher level of integration. An APU is basically an x86 CPU combined with a memory controller and a graphics processor. One major difference between discrete graphics chips and APUs is that in discrete graphics chips the entire silicon and all of its transistors are devoted to pixel-crunching tasks, whereas in APUs the technology bandwidth is divided

Intensive computing applications


Today, graphics cards not only provide the best graphics but also support applications like video transcoding, video acceleration, facial recognition and facial tagging, where the data is massive and needs parallel processing capabilities. Krishna says, "GPUs are becoming more and more generic processors than what was intended before. While the primary objective is graphics, people have started extracting the secondary objective from it."

3D home entertainment to personal computing. Graphics processors have

that the most critical application enabled by the programmability of GPUs is image and video stabilisation. He adds, "Digital still and video cameras are common gadgets and not all of the users are professionals. The images are blurred and shaky, especially when using zoom for faraway objects. So a lot of software algorithms supported by GPUs are loaded on these gadgets for image and video stabilisation. Techniques like processing each frame and removing the unwanted ones, and matching brightness and colour, are used to give an enjoyable experience to the user."

Facial tagging. Digitisation has added not only lots of features to photography but also a lot more photographs to our albums. Over the years, when you have 50,000-60,000 pictures, it is very difficult to find all the pictures of a particular person if you had not tagged them initially. "Thanks to highly intelligent and smart facial tagging software algorithms, you can tag all the 50,000 pictures with just one click," says Krishna. These are intensive computing applications which need a capable graphics chip as well as good software.

Video transcoding. Video transcoding allows you to watch an HD video on your HD TV set as well as on your Android-based smartphone with a screen ten times smaller. Basically, the algorithm allows you to convert a 4GB file into a 100MB file when you view the same video on different types of screens. Krishna informs, "A quad-core CPU will do the same task in 2-2.5 hours, but with massively parallel GPUs, the conversion is done within 15-20 minutes seamlessly."

Video upscaling. This is another algorithm that is critical in the plasma and LCD TVs available today. The DVD collection in your video library is probably suitable for standard definition (SD), but you cannot replace every SD DVD with an HD video to view on your new plasma TV. A lot of algorithmic processing goes on in your HD TV set to support your old SD videos.
It basically uses interpolation and adds a lot of data to the existing data so that you get a near-high-definition experience.

Gesture recognition. Today, there are a number of sensors added to your phone, laptop and desktop PC in order to perform various tasks like gesture recognition, motion recognition, eye tracking and multi-gesture recognition. Bhandari shares, "These kinds of applications use three 3D cameras and all the data is captured and processed in real time. Even playing a fast-paced car game requires a millisecond response time. Such applications need a good balance of both computing and graphics."

Every enhancement of a video basically involves parallel processing capabilities. And GPUs today offer intense hardware capability fit for these applications. It's now the software and algorithms that need to exploit the hardware capability to provide the best visual experience.

massively-parallel applications start choking the CPU capabilities, the software will take advantage of GPUs. This is where Open Computing Language (OpenCL) comes into the picture. "OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs and other processors," explains Krishna.

For optimal utilisation of the GPU, the software design plays a crucial role. The same GPU chip can perform very differently depending on the software. Justifying this, Agrawal says, "A high-end NVIDIA GPU can perform worse than an Intel CPU like i3 or i5 if the software designer doesn't know how to use the parallel processing provided by CUDA in the NVIDIA GPU, or is unable to break the software into parallel processing units. It also matters what kind of software you are designing. Software which has no inherent parallel blocks can perform better on a GPU needing non-parallel scheduling, like Intel's."
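Agrawal's caution about software with no inherent parallel blocks can be related to Amdahl's law. The article itself does not mention this formula; it is brought in here as a standard textbook bound on the speed-up available when only part of a program can run in parallel. The sketch below is a minimal illustration, and the function name `amdahl_speedup` and the sample figures are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speed-up when only parallel_fraction of the work
    can be spread across `workers` processing elements (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workloads run on an assumed 512 parallel processing elements.
    for fraction in (0.0, 0.5, 0.95, 1.0):
        bound = amdahl_speedup(fraction, 512)
        print(f"{fraction:.0%} parallel -> at most {bound:.1f}x faster")
```

With no parallel portion the bound is exactly 1, meaning hundreds of processing elements give no gain at all. The model also ignores data transfer and scheduling overheads, so a real, poorly parallelised program can end up slower on the GPU than on the CPU, as Agrawal notes.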

GPUs for rich experience


"India is a country which prizes top-class entertainment very highly. Expectedly, increasingly rich, technologically advanced entertainment is becoming a major driver for growth in the graphics industry," says Goyal. Whether it is movies and TVs moving from standard to high definition, games shifting from 2D to stereoscopic 3D, or the explosion of demanding Flash-based Web applications like Facebook and games such as FarmVille, all these trends require the GPU. Without the parallelism of GPUs, it is impossible to develop power-efficient supercomputers, experts believe.
The author is a senior technology journalist at EFY

Software: Key to unlock hardware potential


Software is an important element that helps determine the performance and efficiency of a given product. With the advent of DirectX and OpenGL (Open Graphics Library), GPUs can process short programs for each pixel, adding programmability to the chip. Millions of applications that are already on the market will continue to drive the classic microprocessor/microcontroller industry. Where
