AI Chips Basics
AI chips, as the term suggests, refer to a new generation of microprocessors specifically designed to process artificial intelligence tasks faster while using less power. The definition of "AI chips" includes graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and certain types of application-specific integrated circuits (ASICs) specialized for AI calculations. Our definition also includes a GPU, FPGA, or AI-specific ASIC implemented as a core on a system-on-a-chip (SoC). AI algorithms can run on other types of chips, including general-purpose chips like central processing units (CPUs), but we focus on GPUs, FPGAs, and AI-specific ASICs because they are necessary for training and running cutting-edge AI algorithms efficiently and quickly, as described later in the paper.
Like general-purpose CPUs, AI chips gain speed and efficiency by incorporating huge numbers of
smaller and smaller transistors, which run faster and consume less energy than larger transistors.
But unlike CPUs, AI chips also have other, AI-optimized design features. These features
dramatically accelerate the identical, predictable, independent calculations required by AI
algorithms.
They include executing a large number of calculations in parallel rather than sequentially, as in
CPUs; calculating numbers with low precision in a way that successfully implements AI algorithms
but reduces the number of transistors needed for the same calculation; speeding up memory
access by, for example, storing an entire AI algorithm in a single AI chip; and using programming
languages built specifically to efficiently translate AI computer code for execution on an AI chip.
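As a concrete illustration of the low-precision feature mentioned above, the NumPy sketch below quantizes 32-bit floating-point matrices to 8-bit integers and shows that their product is nearly unchanged; the symmetric scaling scheme is a simplifying assumption for illustration, not any particular chip's implementation.

```python
import numpy as np

# Simplified symmetric int8 quantization: scale each tensor by its largest magnitude.
def quantize(x: np.ndarray):
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 128)).astype(np.float32)
b = rng.standard_normal((128, 32)).astype(np.float32)

qa, sa = quantize(a)
qb, sb = quantize(b)

# int8 multiply with int32 accumulation, then rescale back to floating point.
approx = qa.astype(np.int32) @ qb.astype(np.int32) * (sa * sb)
exact = a @ b

# The relative error stays small even though each operand uses only 8 bits.
rel_err = np.abs(approx - exact).max() / np.abs(exact).max()
print(f"max relative error: {rel_err:.3%}")
```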
Different types of AI chips are useful for different tasks. GPUs are most often used for initially
developing and refining AI algorithms; this process is known as “training.” FPGAs are mostly used
to apply trained AI algorithms to real world data inputs; this is often called “inference.” ASICs can
be designed for either training or inference.
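To make the training/inference distinction concrete, here is a minimal NumPy sketch: "training" iteratively adjusts a model's weights using labeled examples, while "inference" simply applies the frozen weights to new inputs. The toy linear model and data are assumptions made for illustration and are not tied to any particular chip.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: inputs x and targets y generated from a hidden linear rule.
x = rng.standard_normal((200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = x @ true_w + 0.01 * rng.standard_normal(200)

# "Training": refine the weights by gradient descent (the step GPUs typically accelerate).
w = np.zeros(3)
for _ in range(500):
    grad = 2 * x.T @ (x @ w - y) / len(y)
    w -= 0.1 * grad

# "Inference": apply the trained, frozen weights to fresh data (often run on FPGAs or ASICs).
new_input = rng.standard_normal((1, 3))
prediction = new_input @ w
print("learned weights:", np.round(w, 2), "prediction:", prediction)
```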
Older AI chips with their larger, slower, and more power-hungry transistors incur huge energy
consumption costs that quickly balloon to unaffordable levels. Because of this, using older AI
chips today means overall costs and slowdowns at least an order of magnitude greater than for
state-of-the-art AI chips. These cost and speed dynamics make it virtually impossible to develop
and deploy cutting-edge AI algorithms without state-of-the-art AI chips.
Even with state-of-the-art AI chips, training an AI algorithm can cost tens of millions of U.S. dollars
and take weeks to complete. In fact, at top AI labs, a large portion of total spending is on AI-
related computing. With general-purpose chips like CPUs or even older AI chips, this training
would take substantially longer to complete and cost orders of magnitude more, making staying
at the research and deployment frontier virtually impossible. Similarly, performing inference
using less advanced or less specialized chips could involve similar cost overruns and take orders
of magnitude longer.
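As a rough way to see why these dynamics matter, total training cost can be approximated as the number of chip-hours a run requires multiplied by the hourly cost of a chip. The Python sketch below uses purely hypothetical placeholder figures; it only illustrates why an order-of-magnitude efficiency gap dominates the bill.

```python
# Back-of-the-envelope training-cost model (all figures are hypothetical placeholders).
def training_cost(chip_hours: float, cost_per_chip_hour: float) -> float:
    """Total cost of a training run = chip-hours required * hourly cost per chip."""
    return chip_hours * cost_per_chip_hour

# Hypothetical: a state-of-the-art AI chip finishes the run in 100,000 chip-hours.
modern = training_cost(chip_hours=100_000, cost_per_chip_hour=2.0)

# A chip roughly 10x less efficient needs ~10x the chip-hours for the same run.
older = training_cost(chip_hours=1_000_000, cost_per_chip_hour=2.0)

print(f"state-of-the-art: ${modern:,.0f}  older chip: ${older:,.0f}")
```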
AI chips are becoming increasingly important, and their application scenarios continue to multiply.
1. There are two reasons for developing dedicated chips: first, the problem to be solved must be important enough to justify spending valuable hardware resources on it; second, the algorithm for the problem should have regular characteristics that a "circuit approach" can exploit efficiently.
2. To achieve AI, it is not enough to have an AI algorithm; you must also have an AI chip, which guarantees the computing power.
3. At present, the implementation schemes for artificial intelligence acceleration chips are GPUs, FPGAs, and ASICs. Different solutions are chosen for different application scenarios.
2. Why are dedicated chips on the rise?
So why have special-purpose chips become an important point of competition in recent years? Let me start with the development of AI. We can now use AI for many things, such as face-recognition unlocking on mobile phones, voice-controlled smart speakers, and even self-driving cars. These seem to be new things that have emerged in recent years, but artificial intelligence has been studied for decades. Why does it feel as though artificial intelligence has exploded only recently?
It is because three elements had never come together before: big data, algorithms, and computing power. Take face recognition as an example. A 1997 study based on convolutional neural networks used only 400 pictures of 40 people. But in 2014, the deep convolutional neural network used by Facebook was trained on 4 million pictures of 4,000 people and achieved 97.35% accuracy, which is close to the human level!
But do not assume that big data alone can make face recognition succeed. Ordinary general-purpose chips simply cannot handle such massive computation. For example, suppose you want to build a self-driving car that automatically avoids obstacles. This requires a chip that calculates an evasion route from real-time traffic conditions. If the car is on the road with an obstacle 20 meters ahead and you use an ordinary computer CPU to decide "should I avoid it or not," the insurance company's damage assessors may well arrive at the scene before the result has been computed.
Therefore, to achieve AI it is not enough to have an AI algorithm; you must also have an AI chip, which guarantees the computing power. The AI chip we are talking about is really an "artificial intelligence acceleration chip": a special-purpose chip whose function is to speed up AI algorithms. For example, face recognition succeeded because the computation was done on GPUs, which thereby became the earliest AI chips.
Why can GPU be used as an AI chip?
Because AI algorithms turn out to have a distinctive character: although the amount of computation is huge, it is also highly regular. For example, in the "convolutional neural network" algorithms commonly used in image recognition, one of the main operations is a large number of multiplications over large matrices. If we design hardware around this characteristic, we can naturally improve the speed of calculation.
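That regularity can be seen directly in code: a convolution layer can be rewritten as one large matrix multiplication (the "im2col" trick), which is exactly the uniform, parallel workload that GPUs handle well. The sketch below is a minimal single-channel NumPy version written under that assumption.

```python
import numpy as np

def conv2d_as_matmul(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Single-channel 2D convolution expressed as one big matrix multiplication."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    # "im2col": unfold every kernel-sized patch of the image into a row.
    patches = np.array([image[i:i + kh, j:j + kw].ravel()
                        for i in range(oh) for j in range(ow)])
    # The whole convolution is now a single (oh*ow, kh*kw) x (kh*kw,) product.
    return (patches @ kernel.ravel()).reshape(oh, ow)

rng = np.random.default_rng(2)
img = rng.standard_normal((6, 6))
k = rng.standard_normal((3, 3))
print(conv2d_as_matmul(img, k))
```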
Coincidentally, GPUs are much better than CPUs at exactly this. Gradually, people found that GPUs could accelerate not only image problems: most parallel computing problems can be solved with GPUs, including Bitcoin mining. Although a GPU is a special-purpose chip compared with a CPU, today it has become a general-purpose parallel computing chip. In this way, GPU technology has gained a strong driving force for development; it is widely used in parallel computing and has also promoted the development of all kinds of AI chips.
3. AI chip frontier.
Now, in addition to the GPU, new AI chips have appeared in order to implement artificial intelligence even better. Let me introduce the two most important ones.
As I just said, although GPUs can accelerate AI algorithms far beyond CPUs, the GPU is now a general-purpose parallel computing chip. Its problem is that it is not optimized for every artificial intelligence problem, and its power consumption and price are relatively high, so we want "more dedicated chips" to improve efficiency. The FPGA emerged to solve this problem.
This is a very clever invention; you can think of it as a "universal chip." After the chip is manufactured, you can modify the connections between the devices inside it as needed, forming chips with a variety of different functions. This process is called "burning" the FPGA. For different artificial intelligence problems, we can turn the FPGA into a corresponding special-purpose chip that better fits the problem.
For example, you can think of an FPGA as a "flash delivery" courier. Although flash delivery is a general service that anyone can use, once you place an order, that courier serves only you. He establishes the best route between you and your destination and delivers things as quickly as possible. After the job is done, he takes the next order and sets up a new route for a new customer. For flash delivery, "placing an order" corresponds to "programming and burning the FPGA." Once burned, the FPGA becomes your dedicated chip, efficiently solving your specific problem. After completing the task, you can change the circuit structure for a new problem and turn the FPGA into another dedicated chip that completes the new task efficiently.
Therefore, the FPGA is a very important implementation form in artificial intelligence. The FPGA has advantages in flexibility, but it also has weaknesses: generally speaking, its price is relatively high, and in terms of performance, speed, power consumption, and chip area it leaves a lot of room for improvement. So people also arrived at an ultimate method, which is also the hottest and most cutting-edge field of chip technology: the custom ASIC.
In other words, a chip is designed specifically for the AI problem to be solved, and it does nothing else. Its advantages and disadvantages are both very clear. The advantage is that it is extremely efficient and its energy consumption is very low. The disadvantage is a complete loss of versatility: once you design and manufacture the chip, it cannot do anything else. If you later want it to solve a different problem, there is no way around it; you have to make another chip.
But the cost of designing and fabricating chips on advanced process nodes is now very high, so custom AI chips are not something every company can do; only large companies such as Google and Alibaba will do this, and only when they have clear application scenarios and algorithms. The most famous custom AI chip today is Google's TPU, the Tensor Processing Unit. According to Google's public data, the TPU improves performance by tens of times and reduces energy consumption by up to hundreds of times compared with the best GPUs.
Finally, I would like to add that there is another important development trend: "general-purpose AI chips." Here "general-purpose" does not mean solving every computing problem the way a CPU does; rather, the hope is that a single AI chip can meet the requirements of low cost and versatility while remaining highly efficient and low-power, and so solve all kinds of AI problems. This will be a very promising direction for the future!
2. Apple
At its latest event, "Time Flies," Apple introduced an all-new iPad Air that houses the powerful A14 Bionic, a 5 nm chipset. This makes the iPad Air the world's first device to run on a 5 nm chip. "We're excited to introduce Apple's most powerful chip ever made, the A14 Bionic," said Greg Joswiak, Apple's senior vice president of Worldwide Marketing. Traditionally, Apple would launch new chipsets with iPhones; instead, the A14 Bionic was announced alongside the new iPad Air. Across its last two events, Apple has been explicit about how serious it is about machine-learning-focused SoCs.
Apple has been developing its own chips for some years and could eventually stop using suppliers such as Intel, which would be a huge shift in emphasis. Having already largely disentangled itself from Qualcomm after a long legal wrangle, Apple looks determined to go its own way in the AI future. The company has used its A11 and A12 "Bionic" chips in its latest iPhones and iPads. These chips include Apple's Neural Engine, a part of the circuitry that is not accessible to third-party apps.
The A12 Bionic chip is said to be 15 percent faster than its predecessor while using 50 percent of the power. The A13 version is in production now, according to Inverse, and is likely to feature in more of the company's mobile devices this year. And considering that Apple has sold more than a billion mobile devices, that is a sizable ready-made market, even without its desktop computer line, which still accounts for only about 5 percent of the overall PC market worldwide.
3. Huawei
Huawei Technologies has officially unleashed its artificial intelligence (AI) chip, the Ascend 910, which it says has a maximum power consumption of just 310 W, lower than its originally planned spec of 350 W. The chip is touted as having "more computing power than any other AI processor," delivering 256 teraflops at half-precision floating point (FP16) and 512 teraops for integer-precision calculations.

Figure 3: Huawei artificial intelligence (AI) chip Ascend 910
The Chinese tech giant also announced the commercial availability of its MindSpore AI computing framework, which it said was designed to ease the development of AI applications and improve the efficiency of such tools. Huawei said the AI framework handles only gradient and model data that have already been processed, so user privacy can be maintained.
The platform also has "built-in protection technology" to keep AI models secure. MindSpore supports various platforms, including device, edge, and cloud, and is touted as using a design concept that lets developers train their models more easily and quickly. "In a typical neural network for natural language processing (NLP), MindSpore has 20% fewer lines of core code than leading frameworks on the market, and it helps developers raise their efficiency by at least 50%," Huawei said.
4. Nvidia
Nvidia unwrapped its Nvidia A100 artificial intelligence chip, and CEO Jensen Huang called it the
ultimate instrument for advancing AI. Huang said it can make supercomputing tasks — which are
vital in the fight against COVID-19 — much more cost-efficient and powerful than today’s more
expensive systems.
The chip has a monstrous 54 billion transistors (the on-off switches that are the building blocks
of all things electronic), and it can execute 5 petaflops of performance, or about 20 times more
than the previous-generation chip Volta. Huang made the announcement during his keynote at
the Nvidia GTC event, which was digital this year.
In the market for GPUs, which we mentioned can process AI tasks much faster than all-purpose
chips, Nvidia looks to have a lead. Similarly, the company appears to have gained an advantage
in the nascent market for AI chips.
The two technologies would seem to be closely related to each other, with Nvidia’s advances in
GPUs helping to accelerate its AI chip development. In fact, GPUs appear to underpin Nvidia’s AI
offerings, and its chipsets could be described as AI accelerators.
The specific AI chip technologies Nvidia supplies to the market include its Tesla chipset, Volta,
and Xavier, among others. These chipsets, all based on GPUs, are packaged into software-plus-
hardware solutions that are aimed at specific markets. Xavier, for example, is the basis for an
autonomous driving solution, while Volta is aimed at data centers.
AI Chip Types
AI chips include three classes: graphics processing units (GPUs), field programmable gate arrays
(FPGAs), and application-specific integrated circuits (ASICs).
You might use an FPGA when you need to optimize a chip for a particular workload, or when you
are likely to need to make changes at the chip level later on. Uses for FPGAs cover a wide range
of areas—from equipment for video and imaging, to circuitry for computer, auto, aerospace, and
military applications, in addition to electronics for specialized processing and more. FPGAs are
particularly useful for prototyping application-specific integrated circuits (ASICs) or processors.
An FPGA can be reprogrammed until the ASIC or processor design is final and bug-free and the
actual manufacturing of the final ASIC begins. Intel itself uses FPGAs to prototype new chips.
The New Frontier for FPGAs: Artificial Intelligence
Today, FPGAs are gaining prominence in another field: deep neural networks (DNNs) that are
used for artificial intelligence (AI). Running DNN inference models takes significant processing
power. Graphics processing units (GPUs) are often used to accelerate inference processing, but
in some cases, high-performance FPGAs might actually outperform GPUs in analyzing large
amounts of data for machine learning.
FPGA Architecture
The general FPGA architecture consists of three types of modules: I/O blocks or pads, the switch matrix/interconnection wires, and configurable logic blocks (CLBs). The basic FPGA architecture is a two-dimensional array of logic blocks with a means for the user to arrange the interconnections between them (see Figure 9: FPGA Architecture). The functions of the FPGA architecture modules are discussed below:
• CLB (Configurable Logic Block): contains digital logic, inputs, and outputs, and implements the user logic.
• Interconnects: provide the routing between logic blocks that is needed to implement the user logic.
• Switch matrix: provides switching between interconnects, depending on the logic.
• I/O pads: allow the outside world to communicate with the different applications.
The basic building block of the FPGA is the look-up-table (LUT) based function generator. The number of inputs to a LUT typically ranges from 3 or 4 up to 6, or even 8 in some devices. Modern FPGAs also offer adaptive LUTs that provide two outputs per LUT by implementing two function generators. The Xilinx Virtex-5 is a popular FPGA whose logic cell contains a look-up table (LUT) connected to a multiplexer and a flip-flop, as discussed above. A present-day FPGA contains hundreds or thousands of configurable logic blocks. To configure the FPGA, Xilinx ISE is used to generate the bitstream file, while tools such as ModelSim are used for development and simulation.
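To make the "LUT-based function generator" idea concrete, the Python sketch below models a k-input LUT as a small truth table indexed by the input bits; this is a conceptual model for illustration, not vendor-specific behavior.

```python
class LUT:
    """Conceptual model of a k-input FPGA look-up table.

    The "configuration" is simply the 2**k truth-table entries;
    reprogramming the FPGA amounts to loading a different table.
    """
    def __init__(self, truth_table):
        self.table = list(truth_table)

    def __call__(self, *inputs):
        # Pack the input bits into an index into the truth table.
        index = 0
        for bit in inputs:
            index = (index << 1) | int(bit)
        return self.table[index]

# Configure one 2-input LUT as XOR, then "re-burn" the same structure as AND.
xor_lut = LUT([0, 1, 1, 0])
and_lut = LUT([0, 0, 0, 1])
print(xor_lut(1, 0), and_lut(1, 0))  # -> 1 0
```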
3. Application-specific integrated circuits (ASICs)
ASIC engineers often make use of latches in their designs. As a general rule of thumb, if you are designing an FPGA and you are tempted to use a latch, don't!

Flip-flops with both "Set" and "Reset" Inputs
Many ASIC libraries offer a wide range of flip-flops, including a selection that offers both set and reset inputs (both synchronous and asynchronous versions are usually available). By comparison, FPGA flip-flops can usually be configured with either a set input or a reset input. Implementing both set and reset inputs requires the use of a LUT, so FPGA design engineers often try to work around this and come up with an alternative implementation.
Global Resets and Initial Conditions
Every register in an FPGA is programmed with a default initial condition (that is, to contain a logic 0 or a logic 1). Furthermore, the FPGA typically has a global reset signal that returns all of the registers (but not the embedded RAMs) to their initial conditions. ASIC designers typically don't implement anything equivalent to this capability.
Advantages of ASICs over FPGAs
ASICs have a number of advantages over FPGAs, depending on the system designer’s goals. ASICs,
for instance, permit fully custom capability for the system designer as the device is manufactured
to custom design specifications. Additionally, for very high-volume designs, an ASIC
implementation will have a significantly lower cost per unit. It is also likely that the ASIC will have
a smaller form factor since it is manufactured to custom design specifications. ASICs will also
benefit from higher potential clock speeds over their FPGA counterparts.
A corresponding FPGA implementation, on the other hand, will typically have a faster time to
market as there is no need for layout of masks and manufacturing steps. FPGAs will also benefit
from simpler design cycles over their ASIC counterparts, due to software development tools that
handle placement, routing, and timing restrictions. FPGAs also benefit from being
reprogrammable, in that a new bit stream can quickly be uploaded, during system development
as well as when deployed in the field. This is one large advantage over the ASIC counterparts.
The Value of State-of-the-Art AI Chips
Leading node AI chips are increasingly necessary for cost-effective, fast training and inference of
AI algorithms. This is because they exhibit efficiency and speed gains relative to state-of-the-art
CPUs and trailing node AI chips. And, as discussed above, efficiency translates into overall cost-effectiveness, since chip costs are the sum of chip production costs (i.e., design, fabrication, assembly, test, and packaging) and operating costs (chiefly energy consumption). Finally, cost and speed bottleneck training and
inference of many compute-intensive AI algorithms, necessitating the most advanced AI chips for
AI developers and users to remain competitive in AI R&D and deployment.
Table 1: Comparing state-of-the-art AI chips to state-of-the-art CPUs
Thank you.
Best regards,
Mirza Mansab Baig
MS scholar at Xiamen University, China