(ITEC55A Project) Evolution of GPU
IMUS CAMPUS
Cavite Civic Center, Palico IV, City of Imus, Cavite
(046) 471-6607/436-6584
www.cvsu-imus.edu.ph
EVOLUTION OF GPU
SUBMITTED BY:
SHAIRA JANE S. DE MESA
VINCENT LEZTER B. FOMARAN
BSIT - 2F
SUBMITTED TO:
MR. MELVIN JAN A. GUARIN
CHAPTER I
I. INTRODUCTION
GPU stands for "Graphics Processing Unit." A GPU is a processor designed to handle
graphics operations. This includes both 2D and 3D calculations, though GPUs primarily excel
at rendering 3D graphics. Early PCs did not include GPUs, which meant the CPU had to handle
all standard calculations and graphics operations. As software demands increased and graphics
became more important (especially in video games), a need arose for a separate processor to
render graphics. On August 31, 1999, NVIDIA introduced the first commercially available
GPU for a desktop computer, called the GeForce 256. It could process 10 million polygons per
second, allowing it to offload a significant amount of graphics processing from the CPU. The
success of the first graphics processing unit caused both hardware and software developers
alike to quickly adopt GPU support. Motherboards were manufactured with faster PCI slots
and AGP slots, designed exclusively for graphics cards, became a common option as well.
Software APIs like OpenGL and Direct3D were created to help developers make use of GPUs
in their programs. Today, dedicated graphics processing is standard not just in desktop PCs but
also in laptops, smartphones, and video game consoles. The primary purpose of a GPU is to
render 3D graphics, which are composed of polygons. Since most polygonal transformations
involve decimal numbers, GPUs are designed to perform floating-point operations (as opposed
to integer calculations). This specialized design enables GPUs to render graphics more
efficiently than even the fastest CPUs. Offloading graphics processing to high-powered GPUs
is what makes modern gaming possible. While GPUs excel at rendering graphics, the raw
power of a GPU can also be used for other purposes. Many operating systems and software
technologies like OpenCL and CUDA allow developers to utilize the GPU to assist the CPU
in non-graphics computations. This can improve the overall performance of a computer or other
electronic device. As we rely on computers more every day, we need increasingly powerful
GPUs to handle these computations. A GPU is worth using because it aids the CPU in its
workload and makes your technological devices more powerful.
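The division of labor described above can be sketched in a few lines. This is a plain-Python model of the data-parallel style that OpenCL and CUDA expose, not real OpenCL or CUDA code: the same small function runs independently on every element, which is exactly the kind of work a GPU spreads across thousands of threads.

```python
def gpu_style_map(fn, data):
    """Apply fn to every element independently. Each element's result
    does not depend on the others, which is why a GPU can run such work
    in parallel; this loop only models the idea sequentially on the CPU."""
    return [fn(x) for x in data]

# Example "kernel": one floating-point operation per element.
scaled = gpu_style_map(lambda x: x * 0.5, [2.0, 4.0, 6.0])
# scaled == [1.0, 2.0, 3.0]
```

In a real OpenCL or CUDA program, the function body would be compiled as a kernel and each element would be handled by its own GPU thread.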
GENERAL OBJECTIVE
The main objective of this study is to show the evolution of the GPU as well as its
history.
SPECIFIC OBJECTIVES
1. To provide information about the history of the GPU and how it evolved.
SIGNIFICANCE OF THE STUDY
The significance of this study is to provide information that helps readers understand the
GPU and its evolution. This study will also act as a reference for students and researchers
who will create a study about it.
CHAPTER II
I. EVOLUTION OF ATI GRAPHICS CARD
ATI Wonder (1986)
ATI produced several models in its Wonder family between 1986 and the early 1990s.
All of them were extremely simplistic, designed to handle text and rudimentary 2D images.
ATI continued to improve its display technology, culminating in the Mach series of 2D
graphics accelerators. The first implementation was the Mach 8, which introduced more
advanced 2D features.
ATI later incorporated the functions of both its Wonder and Mach product lines into a
single chip: the Mach 32.
The Mach 32 was succeeded by the Mach 64, which accelerated 2D graphics like the
rest of the family. Later, ATI added 3D graphics processing capabilities. This marked ATI's
first entry into the 3D gaming market, and the end of the Mach line-up.
ATI 3D Rage (1995)
ATI sold its first 2D/3D graphics accelerators under two brand names. We already
mentioned the Mach 64. Now we need to discuss its successor: 3D Rage. The first 3D Rage-
based cards were identical in every way to the 3D-capable Mach 64s. They even used a Mach
64 2D graphics core.
The 3D Rage Pro incorporated several improvements over the 3D Rage II. For instance,
it was designed to work with Intel's Accelerated Graphics Port. ATI also added support for
several new features like fog and transparent images, specular lighting, and DVD playback. It
also upgraded the triangle setup engine and made numerous tweaks to the core to boost
performance. The 3D Rage Pro operated at 75 MHz, or 15 MHz higher than the 3D Rage II.
Maximum memory jumped to 16MB of SGRAM. But while performance did increase
compared to ATI's own 3D Rage II, the 3D Rage Pro failed to distinguish itself against Nvidia's
competing products.
ATI's next project was much more ambitious. The company incorporated support for
32-bit color and a second pixel pipeline, which allowed the Rage 128 to output two pixels per
clock instead of just one. It fed the architecture with a 128-bit memory interface, too. As a
means to further improve performance, the Rage 128 leveraged what ATI called its Twin Cache
Architecture consisting of an 8KB pixel cache and an 8KB buffer for already-textured pixels.
ATI Rage Fury MAXX (1999)
Unable to beat the competition with its Rage 128 Pro, ATI took a page out of 3dfx's
book and created a single graphics card with two Rage 128 Pro processors. That card came to
be known as the Rage Fury MAXX, and it employed Alternate Frame Rendering technology.
Using AFR, each graphics chip rendered all of the odd or all of the even frames, which were
then presented in alternating sequence.
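The scheduling idea behind AFR can be sketched as follows. This is a simplified model of the frame assignment only, not ATI's actual driver logic:

```python
def assign_frames(frame_count, gpu_count=2):
    """Alternate Frame Rendering, simplified: frame i goes to GPU i % gpu_count,
    so each chip renders every other frame."""
    schedule = {gpu: [] for gpu in range(gpu_count)}
    for frame in range(frame_count):
        schedule[frame % gpu_count].append(frame)
    return schedule

# Two Rage 128 Pro chips splitting six frames:
schedule = assign_frames(6)
# schedule[0] == [0, 2, 4]  (even frames), schedule[1] == [1, 3, 5]  (odd frames)
```

Because the two chips work on different frames at the same time, peak frame throughput can nearly double, at the cost of extra latency before each finished frame is displayed.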
In 2000, ATI switched to the Radeon branding still in use today. The first member of
that family was simply called the Radeon DDR, and it was based on the R100 GPU. The R100
was an evolution of the Rage 128 Pro, but it featured a hardware Transform and Lighting (T&L)
engine. It also had two pixel pipelines and three TMUs. ATI added a feature called HyperZ
to reduce memory bandwidth usage.
In 2001, ATI transitioned to 150 nm manufacturing. The first GPU to benefit from the
new node was RV200, used in the All-In-Wonder Radeon 7500 (along with the Radeon 7500
and 7500 LE). Architecturally, RV200 was identical to R100, but it allowed ATI to push core
clock rates higher.
ATI All-In-Wonder Radeon 8500 (2001)
The All-in-Wonder Radeon 8500 used ATI's R200 GPU with four pixel pipelines and
two TMUs per pipe, along with a pair of vertex shaders. Through its implementation, ATI
supported Microsoft's Pixel Shader 1.4 spec. The company also rolled out HyperZ II with
R200, an updated version of its bandwidth-saving technology.
For the Radeon 9700 Pro, ATI used a completely new architecture. Its R300 GPU
employed eight pixel pipelines with one texture unit each, along with four vertex shaders,
dramatically increasing geometry processing and textured fillrate. The big 110 million-transistor
chip was manufactured using a 150 nm process, just like R200, but it enjoyed higher clock rates.
In an attempt to combat the Radeon 9700 Pro, Nvidia launched its GeForce FX 5800
Ultra. This narrowed the performance gap but was not enough to overtake the Radeon 9700
Pro. To cement its position, ATI introduced a subtle update called the Radeon 9800 Pro that
utilized an R350 GPU at higher clock rates. ATI later followed up with a model sporting 256
MB of memory.
ATI Radeon X800 XT (2004)
The rivalry between ATI and Nvidia continued as Nvidia launched its GeForce 6800
GPU and reclaimed its position as the technology and performance leader in the graphics card
market. ATI fired back with its X800 XT. The card's R420 GPU had 16 pixel pipelines that
were organized into groups of four. Compatibility was limited to Shader Model 2.0b at a time
when Nvidia's NV40 had SM 3.0 support. But the GPU also had 16 TMUs, 16 ROPs, and six
vertex shaders. R420 connected to 256 MB of GDDR3 over a 256-bit bus and used a new
memory compression technique called 3Dc, which helped make more efficient use of available
bandwidth.
Not long after its X800 XT launch, ATI introduced the Radeon X700 XT powered by
its RV410 GPU. The RV410 was essentially half of an R420, with eight pixel pipelines, eight
TMUs, eight ROPs, and a 128-bit memory bus. RV410 had the same number of vertex shaders
as R420, though, and was manufactured using a 110 nm process. Clocked similarly to the X800
XT, the X700 XT was competitive against other mid-range graphics cards like Nvidia's
GeForce 6600.
Late in 2004, ATI launched a new flagship called the Radeon X850 XT PE. This card
used an R480 core built with 130 nm transistors. It was essentially just a die shrink of the R420
but operated at somewhat higher clock rates. This resulted in a modest performance boost, and
the X850 XT PE was more competitive against Nvidia's GeForce 6800 Ultra.
ATI Radeon X1800 XT (2005)
The original X1000-series flagship was the Radeon X1800 XT. It came armed with four
quads (16 pixel pipelines), eight vertex shaders, 16 TMUs, and 16 ROPs. The core used a
unique memory interface that consisted of two 128-bit buses operating in a ring. Data moved
on and off the ring-bus at four different points. This effectively increased memory latency, but
reduced memory congestion due to the unique way the dispatch processor handled workloads.
The following year, ATI launched the R580 GPU, which powered its Radeon X1900
XTX. The key difference between the Radeon X1800 XT and Radeon X1900 XTX was that
the Radeon X1900 XTX had three times as many pixel pipelines (48) and quads (12). The rest
of the core remained largely unchanged.
ATI later shifted the Radeon X1900 XTX design to an 80 nm process. This resulted in
the RV580+ core inside of its Radeon X1950 XTX. The core was otherwise unchanged but
managed to achieve higher clock rates. Further, ATI paired it with GDDR4 memory.
ATI introduced an almost entirely new architecture called TeraScale to power its
Radeon HD 2000-series products. This was ATI's first unified shader architecture, and it was
also the first design introduced after ATI's merger with AMD. TeraScale was designed to be
fully compatible with Pixel Shader 4.0 and Microsoft's DirectX 10.0 API. It first appeared
inside of the R600 core, which powered the Radeon HD 2900 XT flagship.
Later in 2007, ATI introduced its Radeon HD 3870. This successor to the Radeon HD
2900 XT used essentially the exact same design but transitioned to 55 nm manufacturing. AMD
also optimized the memory interface and upgraded it to PCIe 2.0. The most notable addition to
the 3870's RV670 GPU was UVD, ATI's hardware-accelerated video decode engine.
As ATI's Radeon HD 2900 XT and Radeon HD 3870 were unable to compete with
Nvidia's latest, the company desperately needed a new high-end GPU. This came in 2008 in
the form of the RV770, inside of its Radeon HD 4870 utilizing the same architecture as its
predecessor on a 55 nm process. The core had 800 Stream processors, 40 TMUs, and 16 ROPs
connected to either 512 MB or 1 GB of memory on a 256-bit bus. Since GDDR5 was relatively
new at the time, it only operated at 900 MHz. Still, this gave the Radeon HD 4870 an abundance
of memory bandwidth.
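That bandwidth claim can be checked from the figures above. The sketch below assumes GDDR5's four data transfers per memory clock, which is a property of GDDR5 generally rather than something stated in the text:

```python
def gddr5_bandwidth_gbs(mem_clock_mhz, bus_width_bits):
    """Peak bandwidth = memory clock * 4 transfers per clock (GDDR5)
    * bus width in bytes, expressed in GB/s."""
    transfers_per_second = mem_clock_mhz * 1e6 * 4  # GDDR5 is quad data rate
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9

bw = gddr5_bandwidth_gbs(900, 256)  # the Radeon HD 4870's figures from above
# bw == 115.2 (GB/s)
```

The 115.2 GB/s result matches the commonly cited peak bandwidth for the Radeon HD 4870, which is why even a modest 900 MHz GDDR5 clock gave the card plenty of headroom.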
ATI went through the RV770 die and tweaked it to facilitate higher clock rates,
resulting in the Radeon HD 4890's RV790 GPU. This card was clocked 100 MHz faster than
the Radeon HD 4870 but was otherwise identical. Performance increased, if only slightly.
ATI Radeon HD 5870 (2009)
The Radeon HD 4890 didn't sit on top of ATI's product line-up for long, as the company
launched its Radeon HD 5870 later the same year. It used a new TeraScale II architecture
designed to support DirectX 11. A key improvement moving from TeraScale to TeraScale II
was that the individual Stream processors were capable of handling a wider array of
instructions.
The 6000-series flagship, AMD's Radeon HD 6970, shipped with 1536 Stream
processors, 96 TMUs, and 32 ROPs. The 6970's Cayman GPU was manufactured using 40 nm
transistors. It also used the same 256-bit bus but had access to 2 GB of GDDR5. The
architectural improvements allowed the 6970 to outperform AMD's previous effort.
The Radeon HD 7970 outperformed all other single-GPU graphics cards by a wide
margin. In some games, it was able to beat dual-GPU cards like the Radeon HD 6990 and
GeForce GTX 590 as well. It consumed more power than its predecessor, though.
Following the release of the Radeon HD 7970, Nvidia fired back with its GeForce GTX
680, which was slightly faster. AMD responded by pushing up the clock rate on Tahiti to 1000
MHz, creating the Radeon HD 7970 GHz Edition. It also introduced AMD's Boost feature, and
in certain situations, the card jumped to 1050 MHz. AMD also pushed up its memory clock to
6 GT/s. Those specifications allowed AMD's Radeon HD 7970 GHz Edition to match and even
outperform Nvidia's GeForce GTX 680.
AMD's Radeon HD 8000 series was made up entirely of rebadged cards. Most of them
came from the 7000 series and were based on the GCN architecture, but a few drew from older
TeraScale-based designs.
AMD's Radeon HD 7970 GHz Edition was one of the company's longest-standing
flagships. Eventually, it was replaced by the Radeon R9 290X and its Hawaii GPU, which was
manufactured using the same 28 nm lithography and based on an updated version of GCN. The
size of the L2 cache increased from 768 KB to 1 MB, and AMD improved bandwidth between
on-die resources. AMD also introduced its TrueAudio technology, leveraging DSP cores to
accelerate in-game audio processing.
Stuck at 28 nm manufacturing and faced with a big, hot GPU, AMD went with liquid
cooling to make the Fury X smaller and quieter than would otherwise be possible. The Radeon
R9 Fury X outperformed Nvidia's GeForce GTX 980 but traded blows with its GeForce GTX
980 Ti. As a result, determining that generation's performance king often came down to game
selection.
AMD's Radeon RX 480 is somewhat unique in that it was not designed to be the
company's fastest graphics card. Instead, it was built to be an efficient mid-range board. You
can expect higher-end solutions based on the same Polaris architecture in the months to come.
In April 2017, AMD introduced the RX 500 series, or rather re-introduced the RX
400 series under a new name. These GPUs are nearly identical to their RX 400-series
predecessors. The RX 500 series GPUs, however, feature significant bumps in clock speed and
support higher voltage limits. AMD didn't create a reference design for the RX 580, but models
produced by OEMs are typically clocked 100-200 MHz above the RX 480.
AMD RX Vega (2017)
Vega succeeded Polaris at the high end. While the pixel shaders used in Vega and Polaris
are relatively similar to each other, Vega has the clear advantage thanks to a significant
increase in the number of shaders, plus faster memory. AMD implemented HBM2 memory
with Vega and announced consumer GPUs with 8 GB of HBM2 with a peak bandwidth of
484 GB/s over a 2048-bit bus.
AMD RX 5700 XT
The AMD Radeon RX 5700 XT takes its original target, the Nvidia GeForce RTX
2070, to task, delivering great gaming performance along with a plethora of new features.
Even compared to the newer Nvidia GeForce RTX 2060 Super, the AMD Radeon RX 5700
XT still holds its own.
II. EVOLUTION OF NVIDIA GRAPHICS CARD
Nvidia NV1
Nvidia was founded back in 1993, but the company did not ship its first product until
1995: the NV1. The GPU was very innovative for its time and could handle both 2D and 3D
video. The Nvidia NV1 was also used in the Sega Saturn game console. The NV1 used a PCI
interface with 133 MB/s of bandwidth. The graphics card used EDO memory clocked at up
to 75 MHz, and it supported a resolution of 1600x1200 with 16-bit color.
Nvidia NV3
The Nvidia NV3 came out in 1997 and turned out to be more successful than the
previous Nvidia GPU. It switched from rendering quadrilaterals to polygons, and hence it was
easier to provide support for the Riva 128 in games.
Nvidia NV4
The next year, 1998, the NV4 came out, and it was very popular indeed. This graphics
card changed the game for the company in the graphics card world. When the NV4 came out,
3dfx's Voodoo2 was the performance king, but the NV4 was able to compete with it.
Nvidia NV5
In 1999, Nvidia launched the NV5 and this was another attempt to become the
performance king. The card was able to deliver around 17% better performance as compared
to the previous one, even though both cards were based on the same architecture. The NV5
came with 32 MB of RAM, double that of its predecessor. The transition to the 250 nm
process allowed the graphics card to be clocked at 175 MHz.
GeForce 256
Cards that came before the GeForce 256 were called graphics accelerators or video
cards, but the GeForce 256 was called a GPU. Nvidia introduced some new features in this
graphics card, including Transform and Lighting (T&L) processing. This allowed the graphics
card to perform calculations that were traditionally handed over to the CPU. The GeForce 256
was five times better at this than the top CPU of the time, the Pentium III.
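The kind of per-vertex math that T&L hardware performs can be illustrated with a deliberately simplified sketch. This is plain Python with hypothetical names, using a translation and Lambert diffuse lighting in place of the full matrix pipeline:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform_and_light(vertex, normal, translation, light_dir):
    """Per-vertex 'Transform and Lighting', heavily simplified:
    move the vertex, then compute diffuse brightness = max(0, N . L)."""
    moved = tuple(v + t for v, t in zip(vertex, translation))
    brightness = max(0.0, dot(normal, light_dir))
    return moved, brightness

# A vertex whose surface faces straight up, lit from directly above:
pos, shade = transform_and_light(
    vertex=(1.0, 0.0, 0.0),
    normal=(0.0, 1.0, 0.0),
    translation=(0.0, 2.0, 0.0),
    light_dir=(0.0, 1.0, 0.0),
)
# pos == (1.0, 2.0, 0.0); shade == 1.0 (fully lit)
```

A scene repeats this floating-point work for every vertex of every model each frame, which is why moving it from the CPU onto dedicated T&L hardware was such a significant step.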
GeForce2
Nvidia GeForce2 graphics cards were based on the same architecture as the previous
graphics cards, but Nvidia was able to double the TMUs by moving to the 180 nm process.
The NV11, NV15, and NV16 cores were used to power the GeForce2 graphics cards, and they
had slight differences. The NV11 core featured two pixel pipelines, while the NV15 and NV16
cores had four, and the NV16 operated at higher clock rates than the NV15.
NV20 / GeForce3
This was the first graphics card from Nvidia that was DX8 compatible. The core was
based on the 150 nm process and featured 60 million transistors and a 250 MHz clock speed. This
was also the first graphics card from Nvidia to feature Lightspeed Memory Architecture.
GeForce4
These graphics cards came out in 2002. There was an array of graphics cards coming
out at this point. At the entry level, we had the NV17, which was an NV11 GeForce2 core that
had been shrunk down using the 150 nm process, making it cheaper to produce.
NV30 / FX 5000
In 2002, Microsoft released DX9, which would be a very popular API for the next
couple of years. Both ATI and Nvidia rushed to get supporting hardware to market. ATI
was able to beat Nvidia, but later in 2002 Nvidia released the FX 5000 series. While Nvidia's
graphics cards came to market a while later, they did come with some additional features that
set them apart.
NV40: Nvidia GeForce 6800
A year later, Nvidia released the GeForce 6800, which came with 222 million
transistors, 16 superscalar pixel pipelines, six vertex shaders, Pixel Shader 3.0 support, and 32-
bit floating-point precision. The 6800 featured 512 MB of GDDR3 memory over a 256-bit bus.
The 6000 series was very successful, as it provided double the performance in some games
compared to the FX 5950. While it was powerful, it was also efficient.
Nvidia then began introducing mid-range products once again, and that is where the
6600 came into play. The NV43 came with half the resources of the NV40 and only used
a 128-bit bus, but it was shrunk down using the 110 nm process.
The 7800 Nvidia GPU replaced the 6800. While it was based on the same 110nm
process it did feature 24 TMUs, 8 vertex shaders, and 16 ROPs. The 256MB of GDDR3 could
be clocked to 1.2 GHz over a 256-bit bus. The core itself ran at 430 MHz.
GeForce 8000 Series
Nvidia introduced the Tesla architecture with the 8000 series. The architecture was used
in multiple Nvidia GPUs including GeForce 8000, GeForce 9000, GeForce 100, GeForce 200,
and GeForce 300 series of Nvidia GPUs. The flagship of the 8000 series was the 8800, which
was based on the G80 GPU. The chip, built on a 90 nm process, featured 681 million
transistors.
The 9000 series Nvidia GPUs also used the Tesla architecture but the chips were shrunk
down using the 65nm process. This allowed the Nvidia GPUs to reach 600 to 675 MHz clock
speeds while reducing the power consumption of the graphics cards. The reduced heat output
and lower power consumption allowed Nvidia to release a dual-GPU graphics card, and it was
the GeForce 9800 GX2 that served as the flagship at the time.
An improved revision of Tesla was introduced with the 200 series Nvidia GPUs in
2008. The improvements included improved scheduler and instruction set, a wider memory
interface, and an altered core ratio. The GT200 used 10 TPCs with 24 streaming processors
and 8 TMUs each.
GeForce 400
The 400 series Nvidia GPUs came out in 2010. These were based on the Fermi
architecture. The GF100 was the top-of-the-line chip, featuring four GPCs. Each GPC
contained four Streaming Multiprocessors (SMs), each with 32 CUDA cores, four TMUs, and
its own PolyMorph Engine. All that adds up to a total of 512 CUDA cores, 64 TMUs, 48 ROPs,
and 16 PolyMorph Engines.
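Those totals can be checked arithmetically. The per-SM figures below (32 CUDA cores and four TMUs per SM, four SMs per GPC) are drawn from GF100's published specifications rather than from the text, so treat them as assumptions of this sketch:

```python
# GF100's totals follow directly from its hierarchy:
gpcs = 4
sms_per_gpc = 4
cuda_cores_per_sm = 32
tmus_per_sm = 4

cuda_cores = gpcs * sms_per_gpc * cuda_cores_per_sm  # 4 * 4 * 32
tmus = gpcs * sms_per_gpc * tmus_per_sm              # 4 * 4 * 4
polymorph_engines = gpcs * sms_per_gpc               # one per SM
# cuda_cores == 512, tmus == 64, polymorph_engines == 16
```

This hierarchical layout is what let Nvidia scale Fermi down for cheaper cards by simply reducing the number of active GPCs and SMs.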
The 500 series still used the Fermi architecture, but Nvidia was able to improve it by
reworking the design at the transistor level. This allowed the Nvidia GPUs to have lower power
consumption and higher performance. All things considered, the 580 was faster than the 480.
600 Series
The 680 was the graphics card that replaced the 580 and it was based on the Kepler
architecture. This was when Nvidia GPUs moved to the 28 nm process. The shift
allowed the graphics cards to feature twice as many TMUs and three times as many CUDA
cores. This might not have increased performance threefold, but the gain was still substantial.
GTX 900 Series
The Maxwell architecture was introduced in 2014, and power consumption was a big
focus for the new architecture. The GM204 was the chip that powered the top-of-the-line GTX
980. The chip featured 2048 CUDA cores, 128 TMUs, 64 ROPs, and 16 PolyMorph engines.
While the new chip was only around 6% faster than the 780 Ti, the major selling point was its
power consumption, which was roughly 33% lower.
GTX 10 Series
In 2016, Nvidia GPUs moved on to the 16 nm process and Nvidia released the new
Pascal architecture, on which modern graphics cards are still based. If you have a 10-series
GPU, then you are using Pascal. The GeForce GTX 1080 features 7.2 billion transistors, with
2560 CUDA cores, 160 TMUs, 64 ROPs, and 20 PolyMorph engines. This is significantly
more than its predecessor.
RTX 20 Series
Spec for spec, the RTX 2080 is a step up from its predecessors in every way, packing
more CUDA cores, faster GDDR6 video memory, and even a 90 MHz factory overclock on
the Founders Edition card.
Additionally, the Nvidia GeForce RTX 2080 introduces a new set of 46 RT Cores and
468 Tensor Cores, which play a hand in rendering real-time ray tracing and artificial
intelligence-powered computations.
III. EVOLUTION OF INTEL GRAPHICS CARD
In 1998, Intel launched its first graphics card: the i740 code-named "Auburn." It was
clocked at 220 MHz and employed a relatively small amount of VRAM, between 2 and 8 MB.
Comparable cards of the time typically included at least 8MB and ranged up to 32MB. It also
supported DirectX 5.0 and OpenGL 1.1. In order to get around the shortage of on-board
memory, Intel planned to take advantage of a feature built into the AGP interface that allowed
the card to utilize system RAM. As such, the i740 uses its onboard memory only as a frame
buffer, with textures stored in system RAM.
After the i740 disaster, Intel developed and briefly sold a second graphics card named
the i752 "Portola", though in very limited quantities. Around the same time, Intel began using
its graphics technology inside of chipsets like the i810 ("Whitney") and i815 ("Solano"). The
GPU was incorporated into the northbridge, becoming the first integrated graphics processors
sold by Intel. Their performance was dependent on two factors: RAM speed, which was often
linked to the FSB, and in turn dependent on the processor, and the CPU itself. At the time, Intel
used 66, 100 or 133 MHz FSB configurations alongside asynchronous SDRAM, giving these
early iGPUs widely varying performance.
Intel Extreme Graphics (2001)
In 2001, Intel began its Extreme Graphics family, which was closely related to the
previous generation, including two-pixel pipelines and limited MPEG-2 hardware acceleration.
Software API support was nearly identical to the i815 chipsets', although OpenGL support was
slightly updated.
Intel reused the two-pixel-pipeline graphics chip one more time in its Extreme Graphics
2 family, released in 2003. The company again introduced two versions. A mobile
implementation surfaced first, appearing in the i852 and i855 chipsets designed for the Pentium
M. These versions of the chip operated anywhere between 133 and 266MHz, depending on the
OEM's design choice. The second version was used inside of the i865 Springdale chipsets
designed for the Pentium 4. These always operated at 266MHz and used faster DDR memory
that could operate at up to 400MHz, giving them higher bandwidth than prior iGPUs.
In 2004, Intel brought its Extreme Graphics line-up to an end, retiring the two-pixel-
pipeline core that was used in all prior Intel GPUs. The Graphics Media Accelerator (or GMA)
would be the name used by Intel to sell its graphics technology for the next several years. The
first of these products, GMA 900, was integrated into the i915 chipset family
(Grantsdale/Alviso). It featured support for DirectX 9.0 and contained four pixel pipelines, but
it lacked hardware vertex shaders.
GMA 950: Pentium 4 & Atom (2005)
The GMA 950 was integrated into Intel's i945 (Lakeport and Calistoga) chipsets and
enjoyed a relatively long life. These chipsets were capable of working with Pentium 4, Core
Duo, Core 2 Duo, and Atom processors. The architecture was nearly identical to the GMA
900's, inheriting many of the same weaknesses, including a lack of vertex shaders. The core
did receive some minor software compatibility improvements, extending DirectX support to
9.0c. This was an important update to the graphics chip, as it enabled Aero support on Windows
Vista. Performance increased slightly thanks to a frequency bump (400MHz) and support for
faster processors and RAM. Mobile versions of the GPU could be clocked as low as 166 MHz
to save power.
In 2006, Intel again changed its graphics nomenclature, starting with the GMA 3000.
This was a considerable step up from the older GMA 950 in terms of performance and
technology. The previous generation was limited to four fixed-function pixel pipelines without
vertex shaders. Meanwhile, the new GMA 3000 included eight multi-purpose EUs that were
capable of performing multiple tasks, including vertex calculations and pixel processing. Intel
pushed its clock rate to 667 MHz as well, making its GMA 3000 considerably faster than the
GMA 950.
GMA X3000 (2006)
After the GMA 3000, Intel made another modification to its naming, creating a fourth
generation of GPUs. The GMA X3000 was nearly identical to the GMA 3000, incorporating
only minor changes. The most significant difference between them was that the GMA
3000 could only use 256 MB of system memory for graphics, while the GMA X3000 could use
up to 384 MB. Intel also extended the GMA X3000's video codec support to include full MPEG-2
hardware acceleration.
After the X3000s, Intel only designed one more chipset series with integrated graphics.
The Intel GMA 4500 family was composed of four models, all of which used the same 10-EU
architecture. Three versions were released for desktop chipsets. The slowest of these was GMA
4500, which operated at 533MHz. The other two were the GMA X4500 and X4500HD, both
clocked at 800MHz. The primary difference between the X4500 and X4500HD was that the
HD model featured full hardware acceleration of VC-1 and AVC, while the X4500 and GMA
4500 didn't.
Larrabee (2009)
In 2009, Intel made another attempt to enter the graphics card business with Larrabee.
Realizing that its immense understanding of x86 technology was the company's primary
strength, Intel wanted to create a GPU based on the ISA. Development for Larrabee began with
the original Pentium CPU, which Intel opted to modify in order to create the scalar unit inside
of the GPU instead of starting from scratch. The old processor design was significantly
modified for the task.
Intel introduced the HD Graphics product line in 2010 to pick up where the GMA
family left off. The HD Graphics engine in the first-gen Core i3, i5, and i7 processors was
similar to GMA 4500, except it had two additional EUs. Clock speed stayed about the same,
ranging between 166MHz in low-power mobile systems and 900MHz on higher-end desktop
SKUs. Although the 32nm CPU die and 45nm GMCH weren't fully integrated into a single
piece of silicon, both components were placed onto the CPU package.
With Sandy Bridge, the Intel HD Graphics took another step up in terms of
performance. Instead of placing two separate dies on one package, Intel instead merged the
hardware into a single die, further reducing latency between the components. Intel also
extended the graphics chip's feature set with Quick Sync technology for transcoding
acceleration and a more efficient video decoder. API support didn't improve much, extending
only to DirectX 10.1 and OpenGL 3.1, but clock speed increased significantly, with some
models reaching 1350 MHz.
Xeon Phi (2012)
After Larrabee was canceled, Intel shifted its design goals for the underlying
technology. While Larrabee could have been quite capable of gaming, the company saw a
future for it in compute-heavy applications and created the Xeon Phi in 2012. One of the first
models, the Xeon Phi 5110P, contained 60 x86 processors with large 512-bit vector units
clocked at 1 GHz. At that speed, they were capable of more than 1 TFLOPS of compute
performance.
With the introduction of Ivy Bridge, Intel redesigned its graphics architecture. Similar
to the Sandy Bridge iGPUs, Ivy Bridge was sold in three different models: HD (GT1 with six
EUs and limited decoding/encoding), HD 2500 (GT1 with six EUs and complete
decoding/encoding functionality) and the HD 4000 (GT2 with 16 EUs and complete
decoding/encoding functionality). Operating at lower clock rates (up to 1150 MHz) than the
Intel HD 3000, but with four additional EUs, the HD 4000 was significantly faster, averaging
a 33.9% performance gain.
Haswell (2013)
Architecturally, the HD Graphics engines released alongside Haswell are similar to those
inside of Ivy Bridge and can be viewed as an extension of the Ivy Bridge graphics core. Intel
relied on sheer force to drive up GPU performance in Haswell. This time around, the
company did away with the small six-EU model, opting to place 10 EUs inside of Haswell's
GT1 implementation. Full video decoding was enabled, while other accelerated features were
reserved for the higher-end models.
Broadwell (2014)
With Broadwell, Intel once again redesigned its iGPU so that it could scale more
effectively. The architecture organized its EUs into sub-slices of eight. This made the process
of adding EUs even easier, as Intel could duplicate the sub-slice multiple times. The GT1
implementation contained two sub-slices, although only 12 EUs are active. The next products
up, HD Graphics 5300, 5500, 5600 and P5700, use the GT2 chip with 24 EUs.
Skylake (2015)
The latest version of Intel's graphics technology is used inside of the Skylake-based
CPUs. These graphics chips are closely related to the Broadwell iGPUs, share the same
architectural layout and keep the number of EUs essentially the same across the board. The
biggest change with this generation is the naming convention: Intel moved to the HD Graphics
500 series. The low-end HD Graphics and HD
Graphics 510 models use the GT1 die with 12 EUs. The HD Graphics 515, 520, 530 and P530
use the GT2 die with 24 EUs.
Kaby Lake (2017)
With the release of Intel’s Kaby Lake processors, Intel pushed out an improved iGPU:
Intel HD Graphics 630. Like its predecessor, the HD 630 has 24 EUs and operates at similar
clock speeds. Intel improved the fixed-function hardware used for decoding and encoding to
support improved 4K playback and streaming. This also improved energy efficiency. Beyond
that, the Intel HD Graphics 630 iGPU is essentially the same as the Intel HD Graphics 530.
Coffee Lake (2017)
Coffee Lake is Intel's codename for the second 14 nm process node refinement
following Broadwell, Skylake, and Kaby Lake. The integrated graphics on Coffee Lake chips
allow support for DP 1.2 to HDMI 2.0 and HDCP 2.2 connectivity. Coffee Lake natively
supports DDR4-2666 memory in dual-channel mode when used with Xeon, Core i5, i7, and i9
CPUs; DDR4-2400 memory in dual-channel mode when used with Celeron, Pentium, and
Core i3 CPUs; and LPDDR3-2133 memory when used with mobile CPUs.
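The memory-support rules above can be arranged as a small lookup table. The sketch below expresses them in Python for illustration only; the tier labels and function name are assumptions, not Intel nomenclature.

```python
# Coffee Lake's officially supported memory, keyed by CPU tier, as
# described above. Dict layout and tier names are illustrative.
COFFEE_LAKE_MEMORY = {
    "xeon_i5_i7_i9": "DDR4-2666 dual channel",
    "celeron_pentium_i3": "DDR4-2400 dual channel",
    "mobile": "LPDDR3-2133",
}

def supported_memory(cpu_tier: str) -> str:
    """Look up the fastest officially supported memory for a CPU tier."""
    return COFFEE_LAKE_MEMORY[cpu_tier]

print(supported_memory("mobile"))  # LPDDR3-2133
```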
CHAPTER III
SUMMARY, CONCLUSION
I. SUMMARY
A GPU is a single-chip processor that creates lighting effects and transforms objects every
time a 3D scene is redrawn. These are mathematically intensive tasks which would otherwise
put quite a strain on the CPU. Lifting this burden from the CPU frees up cycles that can be
used for other jobs. A GPU can also be overclocked, meaning it can be run above its rated
clock speed for extra performance. GPUs evolve in response to what people want from them,
in both games and work: users want faster, more powerful GPUs so that their work becomes
easier. In gaming, the demands placed on GPUs come from graphics and workload, since
realistic graphics require much more power and speed from a GPU. Recently,
hardware-accelerated real-time ray tracing arrived in consumer GPUs; with ray tracing,
objects, places, water, and other surfaces reflect light much as they do in the real world. This
is a step toward bringing games closer to reality. The more realistic we want games to be, the
more the GPU evolves, and greater ease of work likewise drives GPU development.
II. CONCLUSION
There is a question that a lot of smartphone and tablet buyers don't usually think of, but one
worth asking: how good is the GPU inside the device? This report has offered an
introduction to GPUs in mobile devices, how to determine whether yours is any good, and why
they are important in modern devices at all. The GPU is something of an afterthought to most
people when it comes to buying a mobile device. Many associate the GPU with desktop
computers, where high-end graphics cards have been an integral part of gaming PCs and
high-powered workstations for decades now. However, mobile devices like your smartphone
also contain a GPU as part of the chipset, and it is used just as often as a desktop-grade GPU.
Popular phone
GPUs are Qualcomm’s Adreno series, the PowerVR series found in Apple phones, and the
Nvidia Tegra line. Odds are, one of those three is in your tablet or phone right now. The GPU
of your device is so important mainly because it makes games run more efficiently and makes
them look better with higher resolution graphics and improved framerates, or how many frames
per second the game runs at. Higher framerates mean smoother, faster games with less stutter
or freezing due to load on the CPU. The GPUs of modern smartphones are capable of
rendering 3D games and many effects with ease, which allows developers to make
better-looking and more complex games. The GPU also aids the CPU in its workload and makes your device
more power-efficient and faster altogether. As the years pass, we want more out of every
GPU: more realistic graphics, faster computations, and so on. As computers and people
evolve, GPUs evolve as well in order to meet those expectations. In the next few years, GPUs
capable of near-photorealistic gaming may become the norm. People want to push gaming
toward reality, so graphics in gaming will evolve as well.
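To make the framerate point above concrete, the time available to render each frame shrinks quickly as the target framerate rises. This back-of-the-envelope calculation is an illustration, not tied to any particular game or GPU:

```python
def frame_time_ms(fps: float) -> float:
    """Milliseconds available to render one frame at a given framerate."""
    return 1000.0 / fps

# At 30 FPS the GPU has about 33.3 ms per frame; at 60 FPS only ~16.7 ms,
# so doubling smoothness halves the time budget for every effect drawn.
for fps in (30, 60, 120):
    print(f"{fps} FPS -> {frame_time_ms(fps):.1f} ms per frame")
```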
CHAPTER IV
I. GLOSSARY
AMD - Advanced Micro Devices, Inc. is an American multinational semiconductor company
based in Santa Clara, California, that develops computer processors and related technologies.
ATI - ATI Technologies Inc. was a semiconductor technology corporation based in Markham,
Ontario, Canada, that specialized in the development of graphics processing units and chipsets.
Founded in 1985 as Array Technology Inc., the company listed publicly in 1993 and was
acquired by Advanced Micro Devices in 2006.
CUDA - CUDA is a parallel computing platform and programming model developed by
NVIDIA for general computing on graphical processing units (GPUs). With CUDA,
developers are able to dramatically speed up computing applications by harnessing the power
of GPUs.
In GPU-accelerated applications, the sequential part of the workload runs on the CPU
– which is optimized for single-threaded performance – while the compute intensive portion of
the application runs on thousands of GPU cores in parallel. When using CUDA, developers
program in popular languages such as C, C++, Fortran, Python and MATLAB and express
parallelism through extensions in the form of a few basic keywords.
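The CPU/GPU split described above can be sketched briefly. Real CUDA code is C/C++ with kernel launches; here the same data-parallel idea is mimicked in plain Python for illustration, where each index of a SAXPY operation (y = a*x + y) is computed independently, just as one GPU thread would handle one element. All names below are illustrative, not CUDA API calls.

```python
def saxpy_element(i, a, x, y):
    # On a GPU, each per-index computation would run on its own thread in
    # parallel; here we only express that each element is independent.
    return a * x[i] + y[i]

def saxpy(a, x, y):
    """'Launch' the per-element kernel over every index of the vectors."""
    return [saxpy_element(i, a, x, y) for i in range(len(x))]

# Example: y = 2*x + y for small vectors.
result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```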
GPU - A Graphics Processing Unit (GPU) is a single-chip processor primarily used to manage
and boost the performance of video and graphics.
GPGPU - General-purpose computing on graphics processing units (GPGPU) is the use of a
graphics processing unit, which typically handles computation only for computer graphics, to
perform computation in applications traditionally handled by the central processing unit.
HDCP - High-bandwidth Digital Content Protection (HDCP) is a copy-protection scheme that
encrypts the connection between a source and a display device, eliminating the source and
display data interception. HDCP enhances security during electronic data transport of
high-bandwidth media, such as video and audio. Key exchange occurs between the source and
display device prior to the authentication process.
HDCP was launched by Intel Corporation in the mid-1990s and later licensed by Digital
Content Protection, LLC.
HDMI - High-Definition Multimedia Interface (HDMI) is a digital interface standard used for
audio/video (A/V) connectivity. Pioneered early in the 21st century, the first HDMI equipment
went into production in 2003. The HDMI technology is now common in a wide range of
consumer devices, including smartphones, digital video cameras, and Blu-Ray or DVD
devices. It carries an uncompressed digital signal that is adequate for high-definition audio and
video presentations.
iGPU - The iGPU stands for the Integrated Graphics Processing Unit. The iGPU memory
setting controls the amount of memory you give to the integrated graphics on your
motherboard. Typically this
NVIDIA - Nvidia Corporation is an American multinational technology company incorporated
in Delaware and based in Santa Clara, California. It designs
graphics processing units for the gaming and professional markets, as well as
system-on-a-chip units (SoCs) for the mobile computing and automotive market.
PCI - A Peripheral Component Interconnect (PCI) slot is a connecting apparatus for a 32-bit
computer bus. These slots are built into the motherboards of computers and devices in order to
allow for the addition of PCI devices such as modems, network hardware, or sound and video
cards.
OpenCL - OpenCL (Open Computing Language) is a framework for writing programs that
execute across heterogeneous platforms consisting of central processing units (CPUs), graphics
processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays
(FPGAs), and other processors or hardware accelerators.
OpenGL – OpenGL is mainly considered an API (an Application Programming Interface) that
provides us with a large set of functions that we can use to manipulate graphics and images.
However, OpenGL by itself is not an API, but merely a specification, developed and
maintained by the Khronos Group.
II. REFERENCES
https://en.wikipedia.org/wiki/Graphics_processing_unit
https://www.tomshardware.com/picturestory/735-history-of-amd-graphics.html
https://www.tomshardware.com/picturestory/693-intel-graphics-evolution.html
https://www.tomshardware.com/reviews/nvidia-geforce-rtx-2080-super-turing-ray-tracing,6243.html
https://segmentnext.com/2018/06/01/evolution-of-nvidia-gpus/
https://en.wikipedia.org/wiki/Advanced_Micro_Devices
https://en.wikipedia.org/wiki/ATI_Technologies
https://www.techopedia.com/definition/24862/graphics-processing-unit-gpu
https://www.quora.com/What-is-an-iGPU
https://en.wikipedia.org/wiki/Intel
https://en.wikipedia.org/wiki/Nvidia
https://www.techopedia.com/definition/8816/pci-slot
https://www.techopedia.com/definition/3103/high-definition-multimedia-interface-hdmi
https://www.techopedia.com/definition/553/high-bandwidth-digital-content-protection-hdcp
https://www.reddit.com/r/Amd/comments/c1cf6l/a_timeline_of_amds_gpu_architectures/
https://en.wikipedia.org/wiki/OpenCL
https://developer.nvidia.com/cuda-zone
https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units
https://learnopengl.com/Getting-started/OpenGL
https://en.wikipedia.org/wiki/Direct3D
III. PICTURES
(Images omitted; the original captions follow.)
NVIDIA GRAPHICS CARD TIMELINE
NV20 / GeForce3; GeForce4 (2004)
GeForce 8000 Series; GeForce 9000 Series
GTX 900 Series (2014)
INTEL GRAPHICS CARD TIMELINE
The GMA 900 (2004); GMA 950: Pentium 4 & Atom (2005)
The GMA 3000, 3100 & 3150 (2006); GMA X3000 (2006)
Larrabee (2009)
First-Generation Intel HD Graphics (2010)
Ivy Bridge: The Intel HD 4000 (2012)
Haswell (2013)
Broadwell (2014)
Skylake (2015)