
LECTURE 1: REVIEW OF MICROCOMPUTER ARCHITECTURE AND ORGANIZATION

Dr. ASSAF M. H.

EE326 - EMBEDDED SYSTEMS

USP JULY 2018


TOPICS

• Computer Architecture

• Technology Improvements

• Computer Generations

• Elements of Modern Computers

• Performance Indicators

COMPUTER ARCHITECTURE

[Figure: areas of computer architecture. Input/Output and Storage Devices (disks and tape); Memory Hierarchy (RAM/DRAM, cache); Processor Design (Instruction Set Architecture)]

Fundamentals of Computer Organization and Architecture, Wiley Inter-Science 2005


CONTEXT FOR DESIGNING NEW ARCHITECTURES
• Application Area
– Special Purpose / General Purpose
– Scientific / Commercial
• Level of Software Compatibility Required
– Object Code / Binary Code
– Assembly / High Level Programming Language
• Operating System Requirements
– Memory Management
– Interrupts
• Technology Improvements
– Increased Processor Performance
– Larger Memory
– Faster buses/I/O devices
– Software/Compiler Innovations

TECHNOLOGY IMPROVEMENTS

              CAPACITY         SPEED
LOGIC         2x in 3 years    2x in 3 years
DRAM          4x in 3 years    1.4x in 10 years
I/O DEVICES   2x in 3 years    1.4x in 10 years

Speed increases of memory and I/O have not
kept pace with processor speed increases

Processor speed: 66 MHz to 3.2 GHz in 10 years

1 megahertz (MHz) = 1 million hertz = 10^6 Hz

1 gigahertz (GHz) = 1000 MHz = 1 billion hertz = 10^9 Hz

PROCESSOR PERSPECTIVE

          Pentium III     Cray YMP
Type      Desktop         Supercomputer
Year      2000            1988
Clock     1130 MHz        167 MHz
MIPS      > 1000 MIPS     < 50 MIPS
Cost      US$ 2,000       US$ 1,000,000
Cache     256 KB          0.25 KB
Memory    512 MB          256 MB



WHERE DOES THIS IMPROVED PERFORMANCE COME FROM?
• Technology
– Higher transistor density per chip
– Faster logic blocks
• Machine Organization
– Pipelining techniques
• Instruction Set Architecture
– Reduced instruction sets (i.e., RISC computers)
• Compiler Technology
– Greater levels of optimization

Computer Generations

• Over the past 5 decades, electronic computers have gone through 5 generations of development.

• Each successive generation is marked by sharp changes in hardware and software technologies.

• With some exceptions, most of the features introduced in earlier generations are carried through to later generations.



Generation 0 (1642 – 1954)
• Mechanical
– Pascaline - Addition and subtraction (1642)
– Analytical Engine - Calculated general formulas under the control of a program stored on punch cards (1834)
– Model K and Complex Number Calculator - Utilized electromechanical relays. Bell Labs (1937, 1941)
– Harvard Mark I (IBM Automatic Sequence Controlled Calculator) - Electromechanical (relay) computer. The Mark I's program was read from paper tape. Input was from punched cards, paper tape, or switches, and output was to a typewriter or punched cards (1944)



First Generation (1945 – 1954)
• Technology and Architecture
– Vacuum tubes and relay memories
– CPU driven by a program counter (PC)
• Software and Applications
– Machine and assembly language
– Single user at a time
• Representative systems: ENIAC, Princeton IAS, IBM 701
- ENIAC - Electronic Numerical Integrator and Computer - Programmed via switches and
jumper cables; used twenty 10-digit decimal registers - University of Pennsylvania
(1946)
- EDVAC - Electronic Discrete Variable Automatic Computer - similar to the EDSAC -
University of Pennsylvania (1951)
- UNIVAC I (UNIVersal Automatic Computer I)- First commercial computer. This computer
predicted the outcome of the 1952 presidential election of Eisenhower over Stevenson
with a sample of 1% of the voting population. Remington-Rand Corporation (1951)
- IBM 701 - IBM's first scientific computer (1953)



Second Generation (1955 – 1964)
• Technology and Architecture
– Discrete transistors
– Core memories
– Floating-point arithmetic
– I/O processors
– Register Transfer Language (RTL) developed
• Software and Applications
– High-level languages (HLL): FORTRAN, COBOL,
ALGOL introduced with compilers
– Still mostly single user at a time
• Representative systems: CDC 1604, UNIVAC LARC, IBM
7090
- TX-0 - The world's first experimental transistorized computer, with 64K 18-bit words of
core memory - MIT Lincoln Lab (1956)
- IBM 7090 - The transistorized scientific computer having 32K of 36-bit words of magnetic
core storage (1959)
Third Generation (1965 – 1974)
• Technology and Architecture
– SSI/MSI integrated circuits
– Microprogramming
– Pipelining
– Cache memories
• Software and Applications
– Multiprogramming
– Time-sharing operating systems
– Multi-user applications
• Representative systems: IBM 360/370, CDC 6600, TI
ASC, DEC PDP-8
- IBM System/360 - machine supported multiprogramming - IBM floating point hardware
(1965)
- DEC PDP-11 - The PDP series was an extremely popular computer due to low cost and
good performance (1970)



Fourth Generation (1975 – 1990)
• Technology and Architecture
– LSI/VLSI circuits
– Semiconductor memory
– Multiprocessors/Vector supercomputers/Multi-computers
– Shared/Distributed memory
• Software and Applications
– Multiprocessing operating systems/Languages/Compilers
– Software tools created for parallel processing/distributed
computing
• Representative systems: VAX 9000, Cray X-MP, IBM 3090,
BBN TC2000
- Altair 8800 - The first kit-based personal computer, based on the Intel 8080A chip (1975)
- Commodore PET (Personal Electronic Transactor) - Based on the MOS Technology 6502
processor, with 4-8K of RAM (1977)
- IBM PC - Based on the Intel 8088 processor, the IBM PC revolutionized the personal computer
market. The IBM PC shipped with the PC-DOS operating system (1981)
- Apple Macintosh - The GUI operating environment (1984)
Fifth Generation (1991 – present)
• Technology and Architecture
– ULSI processors, memory, and switches
– High-density packaging
– Scalable architecture
– Fault tolerant architecture
– Optical technologies
• Software and Applications
– Massively parallel processing (MPP)
– Heterogeneous processing
• Representative systems: Fujitsu VPP500, Cray MPP,
TMC CM-5, Intel Paragon
- Palm 5000 - PDAs (1996)
- Embedded systems (computers)



Computing in the 21st Century
• Memory chip capacity continues to quadruple
about every 3 years
• Single-chip multiprocessor systems
• High-speed communication networks (optical, etc.)
• Large disks approaching ~ 100 GB
• System-on-a-chip (20+ Million Transistors)
• Low Power
• Multimedia applications: Video, speech,
handwriting, virtual reality, …
• Embedded systems: microcontrollers, DSPs,
graphics processors, … (90% of computer manufacturing
revenue)
These improvements will create the need for new and
innovative computer systems

Computer Processors

              Personal Computer      Server            Embedded
Model         Pentium 4              UltraSPARC III    Intel 8051
Date          Nov 2000 - Feb 2005    Sep 2003          1980
Speed         1.4 GHz - 3.8 GHz      1.2 GHz           12 MHz - 100 MHz
Wordsize      32-bit                 64-bit            8-bit
Transistors   42 million             29 million        60 thousand
Memory        64 GB                  16 GB             4K ROM, 128 bytes RAM

HARDWARE TECHNOLOGY

              1980       1990        2000
Memory        64 KB      4 MB        256 MB
Clock Rate    1-2 MHz    20-40 MHz   700-1200 MHz
Hard disks    40 MB      1 GB        40 GB



Elements of Modern Computers

• Hardware, software, and programming elements of modern computer systems are introduced below:

– Computing problems
– Algorithms and data structures
– Hardware resources
– Operating systems
– System software/compiler support



Computing Problems

• Numerical computing
– numerical problems in science and technology
– complex mathematical formulations
• Transaction processing
– alphanumerical problems in business and
government
– large database management
– information retrieval operations
• Logical Reasoning
– artificial intelligence (AI)
– Neural Network (NN)
– Fuzzy Logic (FL)
Algorithms and Data Structures

• Special algorithms and data structures are needed to specify the computations and communications involved in computing problems

• These often require interdisciplinary interactions among theoreticians, experimentalists, and programmers



Hardware Resources
• A modern computer system demonstrates its power through coordinated efforts by hardware resources, an operating system, and application software

• The architecture of a system is partly shaped by the hardware resources:
– Processors form the hardware core of a computer system
– Memory
– Peripheral devices

• In addition, software interface programs (device drivers, editors, …) are needed to facilitate the portability of user programs on different machine architectures
Operating System

• Operating systems manage the allocation/deallocation of resources during the execution of user programs

• An OS plays a significant role in mapping hardware resources to algorithms and data structures



Software/Compiler Support

• Software/compiler support is needed for the development of efficient programs in high-level languages (HLLs)

• Compilers, assemblers, and loaders are traditional tools for developing programs in high-level languages

• These tools determine the efficiency of hardware utilization and the system’s programmability



Performance Indicators

• Turnaround time depends on:
– disk and memory accesses
– input and output operations
– compilation time
– operating system overhead
– CPU time

• Since I/O and system overhead frequently overlap processing by other programs, it is fair to consider only the CPU time used by a program; the user CPU time is the most important factor



Clock Rate and CPI

• The CPU is driven by a clock with a constant cycle time τ (usually measured in nanoseconds).
• The inverse of the cycle time is the clock rate (f = 1/τ, measured in megahertz).
• The size of a program is determined by its instruction count, Ic, the number of machine instructions to be executed by the program.
• The cycle count (CC) is the number of CPU clock cycles needed to execute a job.

• CPU time = CC × τ = CC/f

• Different machine instructions require different numbers of clock cycles to execute. CPI (cycles per instruction) is thus an important parameter.

Average CPI

$$\mathrm{CPI} = \frac{\text{CPU clock cycles for the program}}{I_c}$$

• To determine the average number of cycles per instruction for a particular processor, the frequency of occurrence of each instruction type is needed
• Any estimate is valid only for a specific set of programs (which defines the instruction mix)
• The term CPI is used with respect to a particular instruction set and a given program mix
• For n instruction categories (arithmetic, logic, load, store, branch, etc.), where CPI_i is the average number of cycles per instruction in category i and I_i is the number of executed instructions in category i:

$$\mathrm{CPI} = \frac{\sum_{i=1}^{n} \mathrm{CPI}_i \times I_i}{I_c}$$
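
The weighted-average CPI formula above can be sketched in a few lines of Python. This is only an illustration: the helper name `average_cpi` and the instruction mix are assumptions invented for the example and do not come from the lecture.

```python
def average_cpi(mix):
    """Return the average CPI for an instruction mix.

    mix is a list of (instruction_count, cycles_per_instruction) pairs,
    one pair per instruction category (I_i, CPI_i).
    """
    total_instructions = sum(count for count, _ in mix)           # Ic
    total_cycles = sum(count * cpi for count, cpi in mix)         # sum of CPI_i x I_i
    return total_cycles / total_instructions


# Hypothetical mix, invented for illustration only.
example_mix = [
    (50_000, 1),  # arithmetic/logic
    (30_000, 2),  # load/store
    (20_000, 3),  # branch
]

print(average_cpi(example_mix))
```

For this made-up mix the total is 170,000 cycles over 100,000 instructions, so the average CPI works out to about 1.7.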

Performance Factors

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

• The time required to execute a program containing Ic instructions is just CPU time = Ic × CPI × τ

• Each instruction must be fetched from memory and decoded; then its operands are fetched from memory, the instruction is executed, and the results are stored

• The time required to access memory is called the memory cycle time, which is usually k times the processor cycle time τ. The value of k depends on the memory technology and the processor-memory interconnection scheme
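
As a rough sketch, the relation CPU time = Ic × CPI × τ can be written as a small Python function. The function name `cpu_time` and its parameter names are assumptions made for this example; the clock rate is passed in hertz and converted to a cycle time with τ = 1/f.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds: Ic x CPI x tau, with tau = 1 / f."""
    cycle_time = 1.0 / clock_rate_hz  # tau, in seconds
    return instruction_count * cpi * cycle_time


# 200,000 instructions at an average CPI of 5 on a 500 MHz clock.
t = cpu_time(200_000, 5, 500e6)
print(f"{t * 1e3:.2f} ms")
```

With these inputs the program needs 1,000,000 cycles of 2 ns each, which is about 2 ms.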

Comparing CPU Time
– A 500 MHz Pentium III processor takes 2 ms to run a program with 200,000 instructions.
– A 300 MHz UltraSparc processor takes 1.8 ms to run the same program with 230,000 instructions.

– What is the CPI for each processor for this program?

CPI = Cycles / Instruction Count
    = CPU time × Clock Rate / Instruction Count
CPI_Pentium = (2 × 10^-3) × (500 × 10^6) / (2 × 10^5) = 5.00
CPI_SPARC = (1.8 × 10^-3) × (300 × 10^6) / (2.3 × 10^5) = 2.35

– Which processor is faster and by how much?

The UltraSparc is 2/1.8 = 1.11 times as fast, or 11% faster.
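
The arithmetic in this comparison can be checked with a short Python sketch. The helper name `cpi_from_time` is an assumption for illustration; the input numbers are the ones given on the slide.

```python
def cpi_from_time(cpu_time_s, clock_rate_hz, instruction_count):
    """CPI = CPU time x clock rate / instruction count."""
    return cpu_time_s * clock_rate_hz / instruction_count


cpi_pentium = cpi_from_time(2e-3, 500e6, 200_000)
cpi_sparc = cpi_from_time(1.8e-3, 300e6, 230_000)

# Relative speed is the ratio of execution times (slower / faster).
speed_ratio = 2e-3 / 1.8e-3

print(f"CPI Pentium III: {cpi_pentium:.2f}")   # 5.00
print(f"CPI UltraSparc:  {cpi_sparc:.2f}")     # 2.35
print(f"UltraSparc is {speed_ratio:.2f} times as fast")  # 1.11
```

Note that the UltraSparc executes more instructions at a lower clock rate yet still finishes first, because its CPI is less than half that of the Pentium III for this program.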

MIPS Rate
• Processor speed is often measured in terms of millions of instructions per second, frequently called the MIPS rate of the processor

$$\text{MIPS rate} = \frac{I_c}{T \times 10^6} = \frac{f}{\mathrm{CPI} \times 10^6} = \frac{f \times I_c}{C \times 10^6}$$

T: Execution time; C: total number of clock cycles needed to execute the program

• The MIPS rate is directly proportional to the clock rate and inversely proportional to the CPI

• All four system attributes (instruction set, compiler, processor, and memory technologies) affect the MIPS rate, which also varies from program to program
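
A minimal Python sketch of two equivalent forms of the MIPS rate, one from the clock rate and CPI and one from the instruction count and execution time. The function names are assumptions for this example; the inputs reuse the 500 MHz Pentium III figures from the CPU time comparison.

```python
def mips_from_clock(clock_rate_hz, cpi):
    """MIPS rate = f / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


def mips_from_time(instruction_count, exec_time_s):
    """MIPS rate = Ic / (T x 10^6)."""
    return instruction_count / (exec_time_s * 1e6)


# 500 MHz clock, CPI of 5, 200,000 instructions in 2 ms.
print(mips_from_clock(500e6, 5))
print(mips_from_time(200_000, 2e-3))
```

Both forms give roughly 100 MIPS for this program, which is expected since T = Ic × CPI / f.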
MFLOP Rate

• The rate of floating-point operation execution per unit time (millions of floating-point operations per second, MFLOPS) is another measure of a machine’s performance

$$\text{MFLOPS} = \frac{\text{Number of floating-point operations in a program}}{T \times 10^6}$$

T: Execution time



PERFORMANCE MEAN

• Arithmetic mean (AM) and geometric mean (GM) are used to summarize performance over a set of benchmark programs

$$\mathrm{AM} = \frac{1}{n} \sum_{i=1}^{n} T_i \qquad \mathrm{GM} = \sqrt[n]{\prod_{i=1}^{n} T_i}$$

T_i: Execution time of the i-th benchmark program
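
The two means can be sketched in Python as follows. The benchmark times are invented for illustration, and the function names are assumptions; `math.prod` requires Python 3.8 or later.

```python
import math


def arithmetic_mean(times):
    """AM = (1/n) * sum of T_i."""
    return sum(times) / len(times)


def geometric_mean(times):
    """GM = n-th root of the product of T_i."""
    return math.prod(times) ** (1.0 / len(times))


# Hypothetical execution times (seconds) for three benchmark programs.
times = [1.0, 4.0, 16.0]

print(arithmetic_mean(times))
print(geometric_mean(times))
```

For these made-up times the arithmetic mean is 7 s while the geometric mean is about 4 s, which shows how the arithmetic mean is pulled toward the longest-running benchmark.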



SPEED UP

• Speed up is a measure of how a machine performs after some enhancement relative to its original performance

$$\text{Speed Up} = \frac{\text{Execution time before enhancement}}{\text{Execution time after enhancement}}$$



Example - MIPS Ratings & Performance Measurement

Machine        Clock     Performance   CPU Time
VAX 11/780     5 MHz     1 MIPS        12x seconds
IBM RS/6000    25 MHz    18 MIPS       x seconds

• The instruction count on the RS/6000 is 1.5 times that of the code on the VAX.

• Average CPI on the VAX is assumed to be 5.

• Average CPI on the RS/6000 is assumed to be 1.39.

• The VAX has a typical CISC architecture.

• The RS/6000 has a typical RISC architecture.
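
The table entries can be reproduced from the stated assumptions with a short Python sketch. The VAX instruction count used here is a hypothetical placeholder (only the 1.5 ratio is given in the example), and the function names are assumptions for illustration.

```python
def mips(clock_rate_hz, cpi):
    """MIPS rate = f / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = Ic x CPI / f."""
    return instruction_count * cpi / clock_rate_hz


vax_ic = 1_000_000        # hypothetical; only the ratio between machines matters
rs6000_ic = 1.5 * vax_ic  # RS/6000 executes 1.5 times as many instructions

vax_mips = mips(5e6, 5)
rs6000_mips = mips(25e6, 1.39)
time_ratio = cpu_time(vax_ic, 5, 5e6) / cpu_time(rs6000_ic, 1.39, 25e6)

print(f"VAX 11/780:  {vax_mips:.0f} MIPS")
print(f"IBM RS/6000: {rs6000_mips:.0f} MIPS")
print(f"VAX CPU time / RS/6000 CPU time: {time_ratio:.0f}")
```

Under these assumptions the RS/6000 has an 18 times higher MIPS rating but runs the program only about 12 times faster, because it executes 1.5 times as many instructions. This is why MIPS ratings alone can overstate the performance gap between RISC and CISC machines.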
