Professional Documents
Culture Documents
IAF0042
Arvo Toomsalu
Computer Architecture
Introduction
Computer Model
CORE
Processor subsystem
Memory subsystem
I/O-subsystem
Classical Architectures
Princeton or von Neumann architecture
System Bus
CPU
MEMORY
Data &
Instructions
Harvard architecture
MEMORY
I Bus
CPU
Instructions
D Bus
MEMORY
Data
MEMORY SYSTEM
Primary storage
Secondary storage
RAM
ROM
Magnetic
CAM
SRAM
DRAM
Molecular
RAM
ROM
PROM
(OTP)
RPROM
M tape
M disk
Optical
CD-ROM
WORM (CD-R)
Magneto-optical CD-RW
DVD
Hologram Optical Disc
Flash ROM
EEPROM
[UVROM]
Memory Hierarchy
etc.
MULTIPROCESSOR SYSTEMS
Flynn-Johnson taxonomy
SISD
SIMD
MISD
SINGLE
DATA
STREAM
MIMD
MULTIPLE
DATA
STREAM
SINGLE
MULTIPLE
INSTRUCTION INSTRUCTION
STREAM
STREAM
SISD Architecture
I- instructions; D data
I
CU
EU
I
MU
SIMD Architecture
I
I
EU
I
CU
MU
D
EU
I
MU
D
EU
MU
MISD Architecture
I
I
I
MU
CU
CU
CU
EU
EU
EU
MIMD Architecture
CU
EU
MU
CU
EU
MU
CU
EU
PR
MU
PR
PR
System interconnect
(bus, crossbar, network)
IO
SM
UMA model
SM
SM - Shared Memory
LM Local Memory
GSM
GSM
GSM
NUMA model
(cluster)
Cluster 1
Cluster n
CSM
PR
CSM
CIN
PR
PR
CIN
CSM
CSM
PR
10
UMA
CPU
Cache
CPU
Uniform
memory
latency
Cache
CPU
Cache
CPU
Cache
Interconnection Network
MEM
MEM
MEM
MEM
NUMA
Long memory latency
MEM
MEM
MEM
MEM
Short
local
memory
latency
CPU
Cache
CPU
Cache
CPU
Cache
CPU
Interconnection Network
11
Cache
Example
12
Summary
Processor Organization
Serial
Parallel
SISD
Uniprocessor
SIMD
Vector
processor
MISD
MIMD
Array
processor
Overlapped
Multi ALU
operations
Tightly
coupled
Loosely
coupled
Shared
memory
Distributed
memory
Clusters
Symmetric
multiprocessor
(SMP)
Nonuniform
memory access
(NUMA)
Literature
Arthur W. Burks, Herman H. Goldstine, John von Neumann. Preliminary Discussion of the
Logical Design of an Electronic Computing Instrument.
Arvutivrgus: http://www.cs.unc.edu/~adyilie/comp265/vonNeumann.html
13
Network Processors
Network processor is a programmable CPU chip that is optimized for networking and
communications functions.
Two common approaches (a, b) to parallelism in network processors:
a. Input packets are distributed among multiple processing units to divide the load.
b. Input packets flow through a pipeline of processing elements.
14
Graphics Processor
A graphics processor (video card, graphic accelerator card, display adapter) is a special
purpose microprocessor specifically designed to generate signals to drive a video
monitor.
In graphics applications, complex shapes and structures are formed through the
sampling, interconnection and rendering of more simple objects (primitives).
Graphics primitives may include lines, characters, areas (triangles and ellipses), and
shapes (polygons, spheres, cylinders and the like).
15
16
High performance:
Multiple small independent memory partitions for improved latency
Early culling and clipping, cull non-visible primitives at high rate;
Rasterization supports aliased and anti-aliasing and triangles, etc;
Z-Cull, allows high-speed removal of hidden surfaces;
Occlusion Query, keeps a record of the number of fragments passing or failing the
depth test and reports it to the CPU.
17
Multiprocessor architecture
Single-chip 3D graphics solution, which consists of:
Geometry Processor
Primitive Processor
Pixel Processor
18
Multimedia Processors
Multimedia is media and content that uses a combination of different content forms.
Multimedia is integration of multiple forms of media: text, graphics, audio, video,
communication etc.
19
20
21
The modern advanced dedicated multimedia processors use SIMD and VLIW
architectural schemes and their variations to achieve very high parallelism.
22
23
24
Example
Typical Application
o
o
o
o
o
o
o
Web tablets
Netbooks
Connected TVs
Portable infotainment
Digital media hubs
Point of service terminals
Video conferencing systems
Main Features
o
o
o
o
o
o
o
o
Performance
o
o
o
o
25
26
3. Associative controlling
27