Fields of discussion:
Basic features of each model’s architecture,
Instruction handling,
Caches & Memory management,
Pipeline (# of stages, possible penalties).
Advanced Computer Architectures 1
Part I: AMD’s K5
- Basic Features
8 Operand Buses feed the Execution Units (4 units are fed with 2
operands on every cycle).
The Data Cache is divided into four banks. There are 2 access ports, one for each Load/Store Unit.
The Instruction Cache has a 16-byte line size. A 16-byte buffer ensures compatibility with the Pentium bus, which performs 32-byte bursts, so it holds the second cache line.
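As a rough illustration of why a four-bank data cache can serve two Load/Store Units at once, the sketch below checks whether two addresses fall into different banks. The bank-selection rule (4-byte interleaving on the low address bits) and the names `bank_of` and `can_issue_together` are assumptions made for the example; the slides do not say how the K5 maps addresses to banks.

```python
# Illustrative model of a four-bank, dual-ported data cache.
# ASSUMPTION: banks are interleaved every 4 bytes, so address bits [3:2]
# select the bank. The slides do not state the real K5 mapping.
NUM_BANKS = 4
BANK_WIDTH_BYTES = 4  # hypothetical interleave granularity


def bank_of(address: int) -> int:
    """Return the bank index an address maps to under the assumed interleaving."""
    return (address // BANK_WIDTH_BYTES) % NUM_BANKS


def can_issue_together(addr_a: int, addr_b: int) -> bool:
    """True if the two accesses target different banks (no bank conflict)."""
    return bank_of(addr_a) != bank_of(addr_b)


if __name__ == "__main__":
    print(can_issue_together(0x1000, 0x1004))  # neighbouring words, different banks
    print(can_issue_together(0x1000, 0x1010))  # 16 bytes apart, same bank
```

Under this assumed mapping, accesses to neighbouring words proceed in parallel, while accesses a multiple of 16 bytes apart compete for the same bank.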
The caches are virtually addressed and tagged to avoid the need to
translate addresses before a cache access.
A single set of physical tags is shared by instruction and data caches.
This eliminates conflicts with the CPU for cache access and ensures
consistency between the instruction and data caches.
10 pipeline stages -> high clock rates achievable,
Large L1 instruction and data caches -> functions in systems with or without backside L2,
Astonishingly small (184 mm²), despite transistor complexity (22 million).
K7 connects to the chip set via a point-to-point interconnect instead of a shared bus. This requires more pins in MP configurations but allows the bus to run at higher speed.
K7's bus comprises three separate ports: address in, address out, and a 72-bit bidirectional data port.
It uses a five-state MOESI cache coherence protocol (‘owned’ state added)
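The five states can be sketched as a small transition function for one cache line, seen from a single cache. This follows the textbook MOESI description and is not taken from AMD's documentation; the event names (`local_read`, `local_write`, `snoop_read`, `snoop_write`) and the `others_have_copy` flag are hypothetical labels chosen for the example.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def next_state(state: State, event: str, others_have_copy: bool = False) -> State:
    """Textbook MOESI transitions for one line (simplified; not K7-specific)."""
    if event == "local_read":
        if state is State.I:
            # A read miss fills the line as Shared if another cache holds it,
            # otherwise as Exclusive.
            return State.S if others_have_copy else State.E
        return state  # read hits do not change the state
    if event == "local_write":
        # Writing always ends in Modified (other copies get invalidated).
        return State.M
    if event == "snoop_read":
        if state in (State.M, State.O):
            # The dirty line is supplied to the requester and kept as Owned,
            # so no immediate write-back to memory is needed.
            return State.O
        if state is State.E:
            return State.S
        return state  # Shared stays Shared, Invalid stays Invalid
    if event == "snoop_write":
        # Another cache wants to write: our copy becomes Invalid.
        return State.I
    raise ValueError(f"unknown event: {event}")
```

The `M -> O` transition on a snooped read is what distinguishes MOESI from four-state MESI: a modified line can be shared with another cache while one cache remains responsible for eventually writing it back.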
K7 is the most complex of all current x86 processors.
It seems to outperform Intel's models on an instructions-per-clock basis.
It promises AMD performance leadership, allowing it to increase both prices and profit margins.
Part III: AMD's Athlon (1st member of the 7th-generation AMD processor family)
Processor Architecture/Technology Competitive Comparison: AMD Seventh Generation vs. Intel Previous Generation
AMD Athlon Processor Microarchitecture Features
I. The industry's first nine-issue, superpipelined, superscalar x86 processor
microarchitecture designed for high clock frequencies
Multiple x86 instruction decoders
Three out-of-order, superscalar, fully pipelined floating point
execution units, which execute all x87 (floating point), MMX and
3DNow! instructions
Three out-of-order, superscalar, pipelined integer units
Three out-of-order, superscalar, pipelined address calculation units
72-entry instruction control unit
Advanced dynamic branch prediction
II. High-performance cache architecture featuring an integrated 128KB L1
cache and a programmable, high-speed backside L2 cache interface
III. 200MHz AMD Athlon processor system bus (scalable beyond 400 MHz)
enabling leading-edge system bandwidth for data movement-intensive
applications
IV. Enhanced 3DNow! technology with new instructions to enable improved
integer math calculations for speech or video encoding and improved data
movement for Internet plug-ins and other streaming applications
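The system bus figures in item III can be turned into a peak-bandwidth estimate with simple arithmetic. The sketch assumes that 64 of the 72 data-port bits carry payload (the other 8 being used for error checking) and that the quoted 200 MHz is the effective transfer rate; neither point is stated in the slides, and `peak_bandwidth_bytes_per_s` is a hypothetical helper name.

```python
def peak_bandwidth_bytes_per_s(transfer_rate_hz: int, payload_bits: int) -> int:
    """Peak bytes per second = transfers per second * payload bytes per transfer."""
    return transfer_rate_hz * payload_bits // 8


# ASSUMPTION: 64 payload bits out of the 72-bit data port.
PAYLOAD_BITS = 64

if __name__ == "__main__":
    for rate_mhz in (200, 400):
        bw = peak_bandwidth_bytes_per_s(rate_mhz * 1_000_000, PAYLOAD_BITS)
        print(f"{rate_mhz} MHz -> {bw / 1e9:.1f} GB/s")
```

Under these assumptions the 200 MHz bus peaks at 1.6 GB/s, and the 400 MHz point that the slide gives as a scaling target would double that.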