
OVERVIEW OF

MICROPROCESSOR
SYSTEMS

L.ANJANEYULU
Dept of ECE
N.I.T., Warangal

NITW/ECE/LA 1
ENIAC-on-a-Chip
Moore School of Electrical Engineering, University of
Pennsylvania http://www.ee.upenn.edu/~jan/eniacproj.html

NITW/ECE/LA 2
Intel
1950s: Shockley leaves Bell Labs to establish Shockley Labs in
California. Some of the best young electronic engineers and
solid-state physicists come to work with him. These include Robert
Noyce and Gordon Moore.
1969: Intel is a tiny start-up company in Santa Clara, headed by
Noyce and Moore.
1970: Busicom place an order with Intel for custom calculator chips.
Intel have no experience of custom-chip design and set out to design a
general-purpose solution.
1971: Intel have problems translating architectures into working chip
designs and the project runs late.
Faggin joins Intel and solves the problems in weeks.
The result is the Intel 4000 family (later renamed MCS-4,
Microcomputer System 4-bit), comprising the 4001 (2 Kbit ROM), the
4002 (320-bit RAM), the 4003 (10-bit I/O shift register) and the 4004,
a 4-bit CPU.

NITW/ECE/LA 3
Intel 4004
Introduced in 1971, the Intel 4004 "Computer-on-a-Chip" was a
2300-transistor device capable of performing 60,000 operations
per second.

It was the first-ever single-chip microprocessor and had
approximately the same performance as the 18,000 vacuum tube
ENIAC. The 4-bit Intel C4004 ran at a clock speed of 108 kHz.

NITW/ECE/LA 4
The Intel 4004
Federico Faggin designed the Intel 4004 processor.
His initials were printed on the circuit.

NITW/ECE/LA 5
Intel 4004 – First Microcomputer

http://uk.geocities.com/magoos_universe/4004_main.htm
NITW/ECE/LA 6
The Busicom Calculator

The Busicom calculator used five Intel 4001s, two 4002s,
three 4003s and the 4004 CPU.

The original engineering prototype of the Busicom desk-top printing
calculator, the world’s first commercial product to use a
microprocessor.
http://www.computerhistory.org/exhibits/highlights/busicom.shtml

NITW/ECE/LA 7
Intel 8008
1972: Faggin begins work on an 8-bit processor, the
Intel 8008. The prototype has serious problems with
electrical charge leaking out of its memory circuits.
Device physics, circuit design and layout are important
new skills. The 8008 chip layout is completely redesigned
and the chip is released.
There is a sudden surge in microprocessor interest.
Intel's 8008 is well-received, but system designers want
increased speed, easier interfacing, and more I/O and
instructions. The improved version, produced by Faggin,
is the 8080.
Faggin leaves Intel to start his own company Zilog, who
later produce the Z80.
NITW/ECE/LA 8
Federico Faggin : Zilog
Zilog produced the 3.5 MHz Zilog Z80 (a very popular processor
taught in many universities) and, later, the 16-bit Z8000.
The Z8000 was another great design, but Zilog struggled to provide
good support: they were a new and inexperienced company with only a
few hundred employees, while Intel at this time had over 10 thousand.
NITW/ECE/LA 9
The Zilog Z80
The Z80 microprocessor is an 8-bit CPU with a 16-bit address bus
capable of directly accessing 64K of memory space.

It was based on the 8080 and has a large instruction set.

Programming features include an accumulator and six 8-bit registers
that can be paired as three 16-bit registers.
In addition to the general registers, a stack pointer, a program
counter and two index registers (memory pointers) are provided.
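
The 64K figure and the register pairing described on this slide can be checked with a few lines of Python. This is an illustrative sketch only: the helper name `pair_registers` and the sample register values are assumptions made for the example, not part of the slides or of any Z80 toolchain.

```python
# Illustrative sketch only; the helper name and sample values are hypothetical.

ADDRESS_BUS_BITS = 16  # Z80 address bus width, as stated on the slide

# A 16-bit address bus can select 2**16 distinct byte locations.
addressable_bytes = 2 ** ADDRESS_BUS_BITS
print(addressable_bytes)          # 65536
print(addressable_bytes // 1024)  # 64, i.e. "64K"


def pair_registers(high: int, low: int) -> int:
    """Combine two 8-bit register values into one 16-bit value."""
    if not (0 <= high <= 0xFF and 0 <= low <= 0xFF):
        raise ValueError("each register holds 8 bits (0..255)")
    return (high << 8) | low


# Example: H = 0x12 and L = 0x34 viewed as the 16-bit pair HL.
hl = pair_registers(0x12, 0x34)
print(hex(hl))  # 0x1234
```

A paired value such as HL is exactly wide enough to hold any address on the 16-bit bus, which is why register pairs are convenient as memory pointers.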
NITW/ECE/LA 10
Early Microcontrollers
1974: Motorola (originally car radio manufacturers) had
introduced transistors in the 1950s and decided to make a late
but serious effort in the microprocessor market. They
announced their 8-bit 6800 processor. Though bulky, and
fraught with production problems, their 6800 had a good
design.
1975: General Motors approach Motorola about a custom-built
derivative of the 6800. Motorola's long experience with
automobile manufacturers pays off and Ford follow GM's lead.

1976: Intel introduce an 8-bit microcontroller, the MCS-48.
They ship 251,000 in this year.

1980: Intel introduce the 8051, an 8-bit microcontroller with
on-board EPROM memory. They ship 22 million and 91
million in 1983.
NITW/ECE/LA 11
The Intel 8086 (1978)
29,000 transistors
Clock speeds: 5, 8 and 10 MHz
Approx. 10 times the performance of the 8080
16-bit “assembly-language compatible” extension of the
8080 architecture
All registers 16 bits wide.
Additional registers all have dedicated uses.
Extended accumulator architecture.
IBM selects the 8088, an 8086 with an 8-bit external bus, as the
processor for the IBM PC (early 1980).
NITW/ECE/LA 12
Early Computers
1979: Motorola also announce a 16-bit processor, the 68000.
Indisputably the best microprocessor on the market, it would be
used in the Apple Macintosh, launched in 1984.

Intel look seriously at the competition (Motorola and Zilog) and
implement 'Operation CRUSH', a huge campaign with a focused and
trained work force providing customer support, complete solutions
and long-term product support.

CRUSH proves an excellent strategy and the 8086 becomes the de facto
standard. This success helps finance additions to their product
range, one of which is the reduced-bus-width 8088, a 16-bit
microprocessor with an 8-bit external bus.

[Image: The early Apple Macintosh]

NITW/ECE/LA 13
The IBM PC
1981: IBM, having seen Apple's success, recognise
a new personal computer market. They choose Intel
over Motorola and Zilog (and their own proprietary
processors) because of Intel's long-term commitment
to the 8086 line.
IBM selects the Intel 8088 for their PC, introduced in
August.
Intel bring out the 16-bit 80286 for the IBM PC AT
but it has weaknesses, most notably in virtual memory
support. The newest 'killer' application software,
Microsoft Windows, needs a more powerful
processor.

NITW/ECE/LA 14
Contemporary Microprocessors:

16/32-bit Processors
(external 16-bit Bus, internal 32 Bit
Structure)
Motorola MC68010
National Semiconductor NS16032
Additional Functionality on the Chip
Direct Memory Access (DMA) (Intel 80186)
Virtual memory management
(MC68010, Intel 80286)
Optional Coprocessor (Intel 8086/80286,
NS16032)
Extended Address Space
NITW/ECE/LA 15
Microprocessor History

32-bit Processors
CISC Processors
• Motorola MC680x0
• Intel i386 / i486 / Pentium
• National Semiconductor NS32x32
• Concept of a Processor Family
• Binary Compatibility
• Compatible with 16 Bit Processors
RISC Processors
• Advanced Micro Devices Am29000 (~1987)
• Sun Microsystems SPARC
• MIPS technologies MIPS R2000 / MIPS R3000

NITW/ECE/LA 16
Moore’s Law

Dr. Gordon E. Moore co-founded Intel in 1968.

His observation that the number of transistors on a chip
doubled every 2 years became known as “Moore’s Law”.

NITW/ECE/LA 17
Pentium Evolution (1)
8086
much more powerful
16 bit
instruction cache (queue) that prefetches a few instructions
8088 (8-bit external bus) used in first IBM PC
80286 (1982):
Added new instructions to support memory management.
Added memory mapping and a multilevel protection scheme.
Kept a real-address mode to support legacy 8086 code.
16 MB of memory addressable (up from 1 MB).
Protection features include:
Segment limit checking,
Read-only and execute-only segment options,
Up to four privilege levels to protect operating system code (in
several subdivisions, if desired) from application or user programs.
Hardware task switching and local descriptor tables allow the operating
system to protect application or user programs from each other.

NITW/ECE/LA 18
80386 (1985): IA-32 architecture family.
Support for multitasking
Additional registers (segment pointers).
All GP registers now 32 bits.
Address space now 32 bits with several new addressing
modes. Provides a logical address space for each
software process
Added paging support under the existing segmented
architecture. Supports:
Segmented-memory model
“Flat” one-memory model
Almost a general purpose register machine.
Intel386 processor includes 6 parallel stages: bus
interface unit, code prefetch unit, instruction decode
unit, execution unit, segment unit, paging unit

NITW/ECE/LA 19
Intel486 processor
more parallel execution capability than Intel386
instruction decode and execution units in five pipelined stages,
each stage (when needed) operates in parallel with the others on up
to five instructions in different stages of execution.
Each stage can do its work on one instruction in one clock, and so
the Intel486 processor can execute as rapidly as one instruction per
clock cycle.
8-KByte on-chip first level cache to increase the percent of instructions
that could execute at the scalar rate of one per clock:
Memory access instructions included if the operand was in the first-level
cache.
Integrated the x87 floating point unit onto the processor
New pins, bits and instructions to support more complex and powerful
systems
Second-level cache support
Multiprocessor support.
Power management (for notebooks and laptops)
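
The claim that a five-stage pipeline approaches one instruction per clock can be illustrated with a small cycle-count model. This is a simplified sketch, not a model of the real Intel486: it assumes an ideal pipeline with no stalls, cache misses or branch penalties, and the function names are made up for the example.

```python
# Idealised model: assumes no stalls, cache misses or branch penalties.

PIPELINE_STAGES = 5  # stage count taken from the slide


def cycles_unpipelined(n_instructions: int, stages: int = PIPELINE_STAGES) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions: int, stages: int = PIPELINE_STAGES) -> int:
    """The first instruction needs `stages` clocks to complete; after that,
    one instruction completes on every clock."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


for n in (1, 5, 1000):
    total = cycles_pipelined(n)
    # columns: instructions, unpipelined clocks, pipelined clocks, clocks per instruction
    print(n, cycles_unpipelined(n), total, round(total / n, 3))
```

Under these assumptions, 1000 instructions take 1004 clocks when pipelined (about 1.004 clocks per instruction) against 5000 clocks without overlap, which is the sense in which the processor "can execute as rapidly as one instruction per clock cycle".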
NITW/ECE/LA 20
Speeding it up
Pipelining
On board cache
On board L1 & L2 cache
Branch prediction
Data flow analysis
Speculative execution

NITW/ECE/LA 21
Pentium Evolution (2)
80486
sophisticated powerful cache and instruction pipelining
built in maths co-processor
Pentium
Superscalar CPU
Multiple instructions executed in parallel
[Diagram: CPU + L1 Cache]

Pentium Pro
Increased superscalar organization
Aggressive register renaming
branch prediction
data flow analysis
speculative execution
[Diagram: CPU + L1 Cache + L2 Cache]

NITW/ECE/LA 22
Pentium Evolution (3)
Pentium II (1997)
Pentium Pro + MMX (MultiMedia eXtensions)
- Data Bus (64bit), Address Bus (36bit)
- L1 Cache: 32K Byte, L2 Cache: 512K Byte
- Processor Core Speed (233MHz - 450MHz)
- System Bus (100MHz)
graphics, video & audio processing
Celeron = Pentium II - L2 Cache
Celeron A : L2 Cache (128KByte)
Xeon (1998) = Pentium II + Graphic Accelerator + .. (CPU for servers)
Scalability : can be scaled to 2, 4, 8 or more processors,
and used for high-end servers and workstations

NITW/ECE/LA 23
Pentium III (1999)
- Data Bus (64bit), Address Bus (36bit)
- Processor Core Speed (450MHz - 1.1GHz)
- System Bus (133MHz)
- Cache Speed Upgrade (Advanced Transfer Cache)
- 70 new streaming SIMD extensions (SSE) instructions:
50 to improve floating-point performance
12 to improve multimedia processing
8 to improve the efficiency of L1 cache
- Pentium III Xeon Processor
Additional floating point instructions for 3D graphics

NITW/ECE/LA 24
Pentium 4 (2000)
Further floating point and multimedia enhancements
- Data Bus (64bit), Address Bus (36bit)
- Processor Core Speed (2GHz - 3.2GHz)
- System Bus (400MHz - 800MHz)
- 800 MHz : Pentium 4 C
3.20 GHz, 3 GHz, 2.80 GHz, 2.60 GHz, 2.40 GHz
- 533 MHz : Pentium 4 B
3.06 GHz, 2.80 GHz, 2.66 GHz, 2.53 GHz, 2.40 GHz, 2.26 GHz
- 400 MHz : Pentium 4 A
2.60 GHz, 2.50 GHz, 2.40 GHz, 2.20 GHz, 2 GHz
- Hyper-threading technology

Itanium
64 bit
See Intel web pages for detailed information on processors

NITW/ECE/LA 25
Technological Development
Model         Year   # of transistors
4004          1971   2250
8008          1972   2500
8080          1974   5000
8086          1978   29000
80286         1982   120000
80386         1985   275000
80486         1989   1180000
Pentium       1993   3100000
Pentium-II    1997   7500000
Pentium-III   1999   24000000
Pentium 4     2000   42000000
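
As a rough check of the "doubles every 2 years" observation against the transistor counts in this table, the average doubling time between two rows can be computed as (elapsed years) / log2(count ratio). The sketch below uses only the table's own figures; the function name `doubling_time_years` is made up for the example.

```python
import math

# (model, year, transistor count) copied from the table on this slide
TABLE = [
    ("4004", 1971, 2250),
    ("8008", 1972, 2500),
    ("8080", 1974, 5000),
    ("8086", 1978, 29000),
    ("80286", 1982, 120000),
    ("80386", 1985, 275000),
    ("80486", 1989, 1180000),
    ("Pentium", 1993, 3100000),
    ("Pentium-II", 1997, 7500000),
    ("Pentium-III", 1999, 24000000),
    ("Pentium 4", 2000, 42000000),
]


def doubling_time_years(year0: int, count0: int, year1: int, count1: int) -> float:
    """Average time for the count to double between two data points."""
    doublings = math.log2(count1 / count0)
    return (year1 - year0) / doublings


_, y0, n0 = TABLE[0]
_, y1, n1 = TABLE[-1]
overall = doubling_time_years(y0, n0, y1, n1)
print(f"4004 to Pentium 4: doubling roughly every {overall:.1f} years")
```

Over the whole 1971 to 2000 span the table's figures give a doubling time of roughly two years, consistent with the statement of Moore's Law earlier in the deck. Individual steps between neighbouring rows vary considerably more.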
NITW/ECE/LA 26
Technological Development
[Figure: number of transistors (logarithmic scale, 1,000 to 100,000,000) against year of introduction (1971 to 2000) for the 4004, 8008, 8080, 8086, 80286, 80386, 80486, Pentium, Pentium II, Pentium III and Pentium 4]

NITW/ECE/LA 27
Performance
1970s Processors:

                        4004       8008       8080       8086                  8088
Introduced              1971       1972       1974       1978                  1979
Clock speeds            108 kHz    108 kHz    2 MHz      5 MHz, 8 MHz, 10 MHz  5 MHz, 8 MHz
Bus width               4 bits     8 bits     8 bits     16 bits               8 bits
Number of transistors   2300       3500       6000       29,000                29,000
Addressable memory      640 bytes  16 KBytes  64 KBytes  1 MB                  1 MB
Virtual memory          --         --         --         --                    --

NITW/ECE/LA 28
Performance
1980s Processors:

                        80286         386™ DX       386™ SX       486™ DX CPU
Introduced              1982          1985          1988          1989
Clock speeds            6-12.5 MHz    16-33 MHz     16-33 MHz     25-50 MHz
Bus width               16 bits       32 bits       16 bits       32 bits
Number of transistors   134,000       275,000       275,000       1.2 million
Addressable memory      16 MB         4 GB          4 GB          4 GB
Virtual memory          1 GB          64 TB         64 TB         64 TB
NITW/ECE/LA 29
Performance
1990s Processors:

                        486™ SX         Pentium       Pentium Pro   Pentium II
Introduced              1991            1993          1995          1997
Clock speeds            16-33 MHz       60-166 MHz    150-200 MHz   200-300 MHz
Bus width               32 bits         32 bits       64 bits       64 bits
Number of transistors   1.185 million   3.1 million   5.5 million   7.5 million
Addressable memory      4 GB            4 GB          64 GB         64 GB
Virtual memory          64 TB           64 TB         64 TB         64 TB

NITW/ECE/LA 30
Performance
Recent Processors:

                        Pentium III   Pentium 4
Introduced              1999          2000
Clock speeds            450 MHz       1.3-1.8 GHz
Bus width               64 bits       64 bits
Number of transistors   9.5 million   42 million
Addressable memory      64 GB         64 GB
Virtual memory          64 TB         64 TB
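
The "Addressable memory" rows in the performance tables follow directly from the number of physical address bits: n address bits select 2^n bytes. The sketch below reproduces several of the figures. The 36-bit width is the address bus quoted on the Pentium II slide; the 16-, 20-, 24- and 32-bit widths for the 8080, 8086, 80286 and 386 DX are not stated in the slides and are supplied here as background. The helper names are made up for the example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses selectable with `address_bits` bits."""
    return 2 ** address_bits


def human(n_bytes: int) -> str:
    """Format a power-of-two byte count using binary units (1 KB = 1024 bytes)."""
    units = ["bytes", "KB", "MB", "GB", "TB"]
    value = n_bytes
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024


# Address widths: 36 bits is from the slides; the others are background facts.
for name, bits in [("8080", 16), ("8086", 20), ("80286", 24),
                   ("386 DX", 32), ("Pentium II", 36)]:
    print(f"{name:10s} {bits:2d} address bits -> {human(addressable_bytes(bits))}")
```

The results (64 KB, 1 MB, 16 MB, 4 GB and 64 GB) match the corresponding table entries.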

NITW/ECE/LA 31
Contemporary Microprocessor

64/32-bit Processors
SUN Microsystems SuperSPARC
Motorola 88110
IBM, Motorola PowerPC 601 (MPC601)
“Modern” Processors
64-bit Structure
Internal Parallelism
• Instruction pipelining
• Arithmetic Pipelining
Instruction and Data Caches
Advanced Memory and Peripheral
Connections
NITW/ECE/LA 32
Performance Mismatch
Processor speed increased
Memory capacity increased
Memory speed lags behind processor
speed!!

NITW/ECE/LA 33
DRAM and Processor
Characteristics

NITW/ECE/LA 34
Intel Itanium 2 (McKinley)
• 64bit Processor
• 221 million transistors!
How are they used?
• What will we do as
transistor counts
continue to grow?
• Most of chip is used for
memories, inst. decoding,
dynamic scheduling…
• Why is it done this way?
• How much more efficient
could it be if more of area
went to actual processing?

NITW/ECE/LA 35
Even More Recent Example
• Runs 64-bit IA-64 ISA
• Die: 3.74 cm²
• 0.13 µm process
• 410M transistors
• 1.5 GHz core
• 1.3 V logic
• 130 W power consumption!
• 6.4 GB/s bus
• Cost: $2,247-$4,226
• 9 MB L3 cache later this year…
NITW/ECE/LA 36
AMD Opteron (100 Million Transistors)

NITW/ECE/LA 37
NITW/ECE/LA 38
Cyrix III
• Developed by National Semiconductor
• 133 MHz Front Side Bus (although it supports 66 MHz, and 100 MHz FSB).
• 256 KB integrated L2 cache along with a 64 KB integrated L1 cache.
• 3DNow! SIMD instructions in a dual pipelined FPU.
• As with the MII, the Cyrix III supports MMX.
• superscalar design featuring two seven stage pipelines allowing two
processing streams to be processed simultaneously.
• two level translation buffer and a 512 entry branch target buffer.
• out-of-order execution through register renaming and data forwarding and
bypassing to resolve data dependencies between pipelines.
• Speculative execution after a predicted branch is also supported.
• 15% to 20% cheaper than a comparable Celeron.
• Hope to capture 10% of market.
• Subject of a lot of legal action by Intel but VIA is still in business.

NITW/ECE/LA 39
First Implementation of Key Features: Montecito
Key Processor Features
• Intel’s first dual-core processor
• Intel’s first processor with >1 billion transistors
• 24 MB L3 cache
• Multi-threading
• Compatible with existing Itanium 2-based systems
• Targeting H2’2005

[Diagram labels: Core 1, Core 2, L3 Cache (x2), Arbiter, System Bus; 1 MB L2I; 2-way multi-threading; 90 nm; power management/frequency boost (Foxton); dual-core; 1.7 billion transistors; 2x12 MB L3 caches with Pellston]

Multiple cores, multiple threads and L3 cache on ONE die

NITW/ECE/LA 40
Intel’s Latest: The Pentium 4 2.4GHz

478 pin packaging

NITW/ECE/LA 41
Selecting a Microprocessor
Issues
Technical: speed, power, size, cost
Other: development environment, prior expertise, licensing, etc.
Speed: how to evaluate a processor’s speed?
Clock speed – but instructions per cycle may differ
Instructions per second – but work per instr. may differ
Dhrystone: Synthetic benchmark, developed in 1984.
Results are quoted in Dhrystones/sec.
• MIPS: 1 MIPS = 1757 Dhrystones per second (based on
Digital’s VAX 11/780). A.k.a. Dhrystone MIPS. Commonly
used today.
• So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per
second
SPEC: set of more realistic benchmarks, but oriented to desktops
EEMBC – EDN Embedded Microprocessor Benchmark Consortium,
www.eembc.org
• Suites of benchmarks: automotive, consumer electronics,
networking, office automation, telecommunications
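
The Dhrystone MIPS conversion quoted on this slide is a single multiplication or division by the VAX 11/780 reference score. A minimal sketch, with function names invented for the example:

```python
# 1 Dhrystone MIPS = 1757 Dhrystones per second (VAX 11/780 reference, from the slide)
DHRYSTONES_PER_SEC_PER_MIPS = 1757


def dmips_to_dhrystones_per_sec(dmips: float) -> float:
    """Convert a Dhrystone MIPS rating into Dhrystones per second."""
    return dmips * DHRYSTONES_PER_SEC_PER_MIPS


def dhrystones_per_sec_to_dmips(dhrystones_per_sec: float) -> float:
    """Convert a raw Dhrystones-per-second score into Dhrystone MIPS."""
    return dhrystones_per_sec / DHRYSTONES_PER_SEC_PER_MIPS


print(dmips_to_dhrystones_per_sec(750))        # 1317750
print(dhrystones_per_sec_to_dmips(1_317_750))  # 750.0
```

The first line reproduces the slide's worked figure of 1,317,750 Dhrystones per second for a 750 MIPS processor.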
NITW/ECE/LA 42
Which has higher performance?
Time to do the task (Execution Time)
• execution time, response time, latency
Tasks per day, hour, week, sec, ns, ... (Performance)
• throughput, bandwidth
Response time and throughput often are in opposition
Response Time
Time to complete a task
Throughput
Total amount of work done per time
Execution Time (CPU Time)
User CPU time
• Time spent in the program
System CPU time
• Time spent in OS
Elapsed Time
Execution Time + Time of I/O and time sharing
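
The difference between user CPU time, system CPU time and elapsed time can be observed directly. The following is a minimal sketch using Python's standard `os.times()` and `time.perf_counter()`; the workload function `busy_work` and the 0.1 s sleep standing in for I/O waiting are placeholders invented for the example.

```python
import os
import time


def busy_work(n: int) -> int:
    """CPU-bound placeholder workload (hypothetical)."""
    return sum(i * i for i in range(n))


start_wall = time.perf_counter()
start_cpu = os.times()

busy_work(200_000)
time.sleep(0.1)  # stands in for waiting on I/O; consumes almost no CPU time

end_cpu = os.times()
end_wall = time.perf_counter()

user_cpu = end_cpu.user - start_cpu.user        # time spent in the program
system_cpu = end_cpu.system - start_cpu.system  # time spent in the OS on its behalf
elapsed = end_wall - start_wall                 # wall-clock time, including the wait

print(f"user CPU   : {user_cpu:.3f} s")
print(f"system CPU : {system_cpu:.3f} s")
print(f"elapsed    : {elapsed:.3f} s")
```

Because the sleep contributes to elapsed time but not to CPU time, the elapsed figure exceeds the sum of the two CPU figures, which is the gap the slide attributes to I/O and time sharing.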

NITW/ECE/LA 43
Performance evaluation

Criteria of Performance
Execution time seems to measure the power of the
CPU
Elapsed time measures the performance of whole
system including OS and I/O
User is interested in elapsed time
Sales people are interested in the highest performance
figure that can be quoted
Performance analyst is interested in both
execution time and elapsed time

NITW/ECE/LA 44
Coffee Time!

NITW/ECE/LA 45
