
COMPUTER EVOLUTION

CHAPTER # 2 Computer Organization & Architecture


History of Computers
SHEHERYAR MALIK

 Computers are commonly divided into three main generations
 First Generation - Vacuum tubes (1946 – 1957)
 Second Generation - Transistors (1958 – 1964)
 Third Generation – Integrated Circuits (1965 – now)
 Later advances in integration produced a wide variety of computer types, which are sometimes counted as further generations

Generations of Computers

             First Gen.               Second Gen.          Third Gen.
Technology   Vacuum tubes             Transistors          Integrated circuits (multiple transistors)
Size         Filled whole buildings   Filled half a room   Smaller

First Generation – Vacuum Tubes

The ENIAC (Electronic Numerical Integrator and Computer) was unveiled in 1946:
the first all-electronic, general-purpose digital computer

Second Generation – Transistors

Third Generation – Integrated Circuits

Third Generation – Integrated Circuits

 Small scale integration (1965 – 1968)
 up to 100 devices on a chip
 Medium scale integration (1968 – 1971)
 100 - 3,000 devices on a chip
 Large scale integration (1971 – 1977)
 3,000 - 100,000 devices on a chip
 sometimes referred to as the fourth generation
 Very large scale integration (1978 – 1991)
 100,000 - 100,000,000 devices on a chip
 sometimes referred to as the fifth generation
 Ultra large scale integration (1991 – now)
 over 100,000,000 devices on a chip

First Generation – Vacuum Tubes

 ENIAC
 Electronic Numerical Integrator And Computer
 World's first general-purpose electronic digital computer
 Developed at the University of Pennsylvania
 Built in response to the US Army's wartime needs
 Trajectory tables for weapons
 Timeline
 Started 1943
 Finished 1946
 too late for the war effort
 Used until 1955

ENIAC - details

 It was a decimal machine (not binary)
 arithmetic was performed in the decimal system
 20 accumulators of 10 digits
 Memory consisted of 20 accumulators, each capable of holding a 10-digit decimal number
 Each digit was represented by a ring of 10 vacuum tubes; at any time only one tube was in the ON state, representing one of the 10 digits
 Programmed manually
 by setting switches and plugging and unplugging cables
 18,000 vacuum tubes
 30 tons
 About 1,500 square feet of floor space
 140 kW power consumption
 5,000 additions per second
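
The decimal storage scheme described on this slide can be sketched in a few lines of Python. This is an illustrative model only: the class names `DecimalRing` and `Accumulator` are hypothetical, and details of the real machine such as sign handling and pulse timing are ignored.

```python
class DecimalRing:
    """One decimal digit held as a ring of 10 'tubes', exactly one of them ON."""

    def __init__(self, digit=0):
        self.tubes = [False] * 10
        self.tubes[digit] = True

    def value(self):
        # The position of the single ON tube is the digit.
        return self.tubes.index(True)

    def pulse(self):
        """Advance the ring one position; return True on a carry (9 -> 0)."""
        d = self.value()
        self.tubes[d] = False
        nxt = (d + 1) % 10
        self.tubes[nxt] = True
        return nxt == 0


class Accumulator:
    """Ten rings, least significant digit first: a 10-digit decimal number."""

    def __init__(self):
        self.rings = [DecimalRing() for _ in range(10)]

    def _pulse(self, pos):
        # Pulse one ring and ripple the carry into higher digits.
        while pos < 10 and self.rings[pos].pulse():
            pos += 1

    def add(self, n):
        # Add a non-negative integer by pulsing each digit position.
        for pos, ch in enumerate(reversed(str(n))):
            for _ in range(int(ch)):
                self._pulse(pos)

    def value(self):
        return sum(r.value() * 10 ** i for i, r in enumerate(self.rings))


acc = Accumulator()
acc.add(275)
acc.add(9999)
print(acc.value())
```

Counting by pulses rather than by binary logic is the point of the sketch: each digit needs ten storage elements, which helps explain the machine's size.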

Von Neumann/Turing Machine

 Entering and altering programs on the ENIAC was extremely tedious
 John von Neumann proposed the stored-program concept
 he and his colleagues designed a computer (IAS) that is the prototype of all subsequent general-purpose computers
 IAS Computer
 Developed at the Institute for Advanced Study, Princeton
 Based on the stored-program concept
 Completed in 1952
 It consists of:
 A main memory, which stores both data and instructions
 An arithmetic and logic unit (ALU) capable of operating on binary data
 A control unit, which interprets the instructions in memory and causes them to be executed
 Input and output (I/O) equipment operated by the control unit

Structure of von Neumann Machine IAS

IAS Details

 The memory of the IAS consists of 1000 storage locations called words
 Each word is 40 bits
 Both data and instructions are stored there
 The IAS computer had a total of 21 instructions
 Numbers are represented in binary
 Each number is represented by a sign bit and a 39-bit value
 A word may also contain two 20-bit instructions
 Each instruction consists of an 8-bit operation code (opcode) specifying the operation to be performed and a 12-bit address designating one of the words in memory
 Set of registers (storage in CPU)
 Memory Buffer Register
 Memory Address Register
 Instruction Register
 Instruction Buffer Register
 Program Counter
 Accumulator
 Multiplier Quotient
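
The instruction word layout above (two 20-bit instructions per 40-bit word, each with an 8-bit opcode and a 12-bit address) can be illustrated with a little bit arithmetic. The function names are hypothetical, and placing the left instruction in the high-order bits is an assumption of this sketch.

```python
OPCODE_BITS = 8
ADDRESS_BITS = 12
INSTR_BITS = OPCODE_BITS + ADDRESS_BITS   # 20 bits per instruction
WORD_BITS = 2 * INSTR_BITS                # 40 bits per word


def pack_instruction(opcode, address):
    """Combine an 8-bit opcode and a 12-bit address into one 20-bit instruction."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address must fit in 12 bits")
    return (opcode << ADDRESS_BITS) | address


def pack_word(left, right):
    """Place two 20-bit instructions in one 40-bit word (left half is high-order)."""
    return (left << INSTR_BITS) | right


def unpack_instruction(instruction):
    """Split a 20-bit instruction into (opcode, address)."""
    return instruction >> ADDRESS_BITS, instruction & ((1 << ADDRESS_BITS) - 1)


def unpack_word(word):
    """Return ((opcode, address), (opcode, address)) for the left and right halves."""
    left = word >> INSTR_BITS
    right = word & ((1 << INSTR_BITS) - 1)
    return unpack_instruction(left), unpack_instruction(right)


w = pack_word(pack_instruction(0x01, 500), pack_instruction(0x05, 501))
print(f"{w:0{WORD_BITS}b}")
print(unpack_word(w))
```

A 12-bit address can name 4,096 locations, which comfortably covers the 1000 words of memory listed above.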

IAS – CPU Storage (Registers)

 Memory buffer register (MBR)
 Contains a word to be stored in memory, or receives a word read from memory
 Memory address register (MAR)
 Specifies the address in memory of the word to be written from or read into the MBR
 Instruction register (IR)
 Contains the 8-bit opcode of the instruction being executed
 Instruction buffer register (IBR)
 Temporarily holds the right-hand instruction from a word in memory
 Program counter (PC)
 Contains the address of the next instruction pair to be fetched from memory
 Accumulator (AC) & multiplier quotient (MQ)
 Employed to hold temporarily operands and results of ALU operations
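
How these registers cooperate can be seen in a much simplified fetch and execute loop. This is a sketch under stated assumptions, not a faithful IAS emulator: only three instructions are modelled, the opcode values follow the commonly reproduced IAS instruction table, `HALT` is a hypothetical opcode added so the loop can stop, and data words are plain Python integers rather than sign-plus-39-bit values.

```python
# Opcodes (assumed from the commonly cited IAS table; HALT is hypothetical).
LOAD, ADD, STOR, HALT = 0x01, 0x05, 0x21, 0x00


def instr(opcode, addr):
    """20-bit instruction: 8-bit opcode followed by a 12-bit address."""
    return (opcode << 12) | addr


def word(left, right):
    """40-bit word holding a left and a right instruction."""
    return (left << 20) | right


def run(memory, max_steps=100):
    pc, ac, ibr = 0, 0, None
    for _ in range(max_steps):
        if ibr is None:
            mar = pc                   # MAR <- PC
            mbr = memory[mar]          # MBR <- M[MAR]
            ibr = mbr & 0xFFFFF        # IBR <- right-hand instruction
            current = mbr >> 20        # left-hand instruction goes first
        else:
            current, ibr = ibr, None   # use the buffered right-hand instruction
            pc += 1                    # both halves used: move to the next word
        ir, mar = current >> 12, current & 0xFFF   # IR <- opcode, MAR <- address
        if ir == HALT:
            break
        if ir == LOAD:
            ac = memory[mar]           # AC <- M[MAR]
        elif ir == ADD:
            ac = ac + memory[mar]      # AC <- AC + M[MAR] (overflow ignored)
        elif ir == STOR:
            memory[mar] = ac           # M[MAR] <- AC
        else:
            raise ValueError(f"unsupported opcode {ir:#04x}")
    return ac


memory = [0] * 1000
memory[0] = word(instr(LOAD, 10), instr(ADD, 11))
memory[1] = word(instr(STOR, 12), instr(HALT, 0))
memory[10], memory[11] = 7, 35
run(memory)
print(memory[12])
```

The IBR is what saves a memory access here: one fetch brings in two instructions, and the second is executed straight from the buffer.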

IAS – Instruction Group

 Data transfer
 Move data between memory and ALU registers or between two ALU
registers
 Unconditional branch
 Normally control unit executes instructions in sequence from memory
 This sequence can be changed by branch instruction
 Conditional branch
 The branch can be made dependent on a condition, thus allowing a decision point
 Arithmetic
 Operations performed by ALU
 Address modify
 Permits addresses to be computed in ALU and then inserted into
instructions stored in memory

Structure of IAS

Partial Flow Chart of IAS Operation

First Generation Commercial Computers

 UNIVAC I
 Universal Automatic Computer
 Used for both scientific and commercial applications
 It could handle matrix algebra, statistical problems, premium billing and logistics problems
 Built by the Eckert-Mauchly Computer Corporation, which was formed in 1947
 Commissioned by the US Bureau of the Census for the 1950 calculations
 UNIVAC II
 Faster and with more memory than UNIVAC I
 It illustrated a lasting trend: each new machine offers greater speed and capacity than its predecessor

IBM

 IBM, then the major maker of punched-card processing equipment, introduced the 700 series based on vacuum tubes
 1953 - the 701
 IBM’s first stored program computer
 Scientific calculations
 1955 - the 702
 Business applications
 Led to the 700/7000 series

Second Generation - Transistors

 Transistors replaced vacuum tubes
 A transistor is
 Smaller
 Cheaper
 Dissipates less heat
 A solid-state device
 Made from silicon (sand)
 Invented in 1947 at Bell Labs
 by William Shockley et al.
[Figure: Raytheon CK722 transistor (1954)]

Transistor Based Computers

 NCR & RCA produced small transistor machines
 DEC, founded in 1957
 Produced the PDP-1
 IBM introduced the transistor-based 7000 series in 1960
 IBM 7094
 Data channels are used
 a data channel is an independent I/O module with its own processor and its own instruction set
 A multiplexer is introduced
 it is the central termination point for the data channels, the CPU and memory
 Instruction backup register
 used to buffer the next instruction

Third Generation – Integrated Circuits

 The entire manufacturing process, from transistor to circuit board, was expensive and cumbersome
 Early second generation computers contained about 10,000 transistors
 1958 brought the revolution of microelectronics
 the invention of the integrated circuit (IC)
 ICs became the building blocks of digital computers
 Fundamental components of a digital computer are:
 Gates – Data Processing
 a gate implements a simple Boolean or logical function
 Memory cells – Data Storage
 a memory cell is a device that can store one bit of data
 Early ICs are referred to as small scale integration
 Then comes medium scale integration
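
The two fundamental components named above, gates and memory cells, can be modelled in software. The sketch below is illustrative only: it builds every gate from NAND, combines gates into a half adder, and models a one-bit memory cell as a gated latch. All function and class names are hypothetical.

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    return and_(or_(a, b), nand(a, b))


def half_adder(a, b):
    """Data processing from gates: returns (sum, carry) for two input bits."""
    return xor(a, b), and_(a, b)


class MemoryCell:
    """Data storage: keeps one bit until the write line is asserted again."""

    def __init__(self):
        self.bit = 0

    def access(self, write, data_in=0):
        # New state = data_in when write is 1, otherwise the stored bit.
        self.bit = or_(and_(write, data_in), and_(not_(write), self.bit))
        return self.bit


print(half_adder(1, 1))
cell = MemoryCell()
cell.access(write=1, data_in=1)
print(cell.access(write=0))
```

Because everything reduces to one gate type replicated many times, the design maps naturally onto an integrated circuit.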

Moore’s Law

 Gordon Moore – Co-founder of Intel
 Observed the increasing density of components on a chip
 Originally: the number of transistors on a chip doubles every year
 Since the 1970s development has slowed a little
 The number of transistors doubles about every 18 months
 Consequences
 The cost of a chip has remained virtually unchanged during this period of rapid growth in density
 This means that the cost of gates and memory circuits has fallen at a dramatic rate
 Shorter distances between logic and memory elements have increased processing speed
 The computer has become smaller
 Interconnections on an IC are much more reliable than solder connections
 Reduced power and cooling requirements
 Fewer interconnections increases reliability
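
The difference between doubling every year and doubling every 18 months is easy to underestimate, so a short calculation may help. The function name and the starting count of 1,000 transistors are hypothetical values chosen for illustration.

```python
def projected_transistors(initial, years, doubling_period_years=1.5):
    """Transistor count after `years` if the count doubles every `doubling_period_years`."""
    return initial * 2 ** (years / doubling_period_years)


start = 1_000  # hypothetical starting transistor count
for years in (3, 6, 12):
    yearly = projected_transistors(start, years, doubling_period_years=1.0)
    slower = projected_transistors(start, years, doubling_period_years=1.5)
    print(f"after {years:2d} years: {yearly:>12,.0f} (1-year doubling)  "
          f"{slower:>10,.0f} (18-month doubling)")
```

If chip cost stays roughly constant, as the slide states, the cost per transistor falls by the same factor that the count rises.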

Growth in CPU Transistor Count

IBM 360 series

 In 1964 IBM announced System/360
 It was incompatible with older IBM machines (7000 series)
 System/360 was the industry’s first planned family of computers
 The characteristics of a family of computers are as follows
 Similar or identical instruction set
 Similar or identical operating system
 Increasing speed
 Increasing number of I/O ports
 Increasing memory size
 Increasing cost
 Multiplexed switch structure

DEC PDP-8

 In 1965 DEC introduced the PDP-8
 First minicomputer (named after the miniskirt!)
 Small size and low cost
 Initially cost around $16,000
 whereas an IBM 360 cost hundreds of thousands of dollars
 The PDP-8 used a bus structure that is now universal for minicomputers and microcomputers
 This bus, called the Omnibus, consists of 96 separate signal paths
 Its architecture is highly flexible, allowing modules to be plugged into the bus to create various configurations
 Embedded applications & OEM
 Did not need an air-conditioned room

DEC - PDP-8 Bus Structure

Semiconductor Memory

 In the 1950s and 1960s most computer memory was constructed from tiny rings of ferromagnetic material (cores)
 It was fast, but very expensive and bulky
 In 1970 Fairchild produced the first relatively capacious semiconductor memory
 It took only 70 billionths of a second (70 ns) to read a bit
 The chip was about the size of a single core
 i.e. 1 bit of magnetic core storage
 Yet it held 256 bits
 Capacity approximately doubles each year
 Since 1970 semiconductor memory has been through 13 generations
 Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time

Intel Microprocessor

 Intel 4004 (1971)
 First microprocessor
 First chip to contain all CPU components on a single chip
 4 bit
 It can add two 4-bit numbers and can multiply only by repeated addition
 Intel 8008 (1972)
 8 bit
 Both designed for specific applications
 Intel 8080 (1974)
 8 bit
 Intel’s first general purpose microprocessor
 This progression continues today, with Intel producing 64-bit microprocessors

Designing for Performance

 Computer power is now virtually free
 For less than $1,000 one can buy a computer containing more than 1,000,000,000 transistors
 Today’s microprocessor-based systems support applications such as
 Image processing
 Speech recognition
 Videoconferencing
 Multimedia authoring
 Voice and video annotation of files

Speeding it up

 Pipelining
 On board cache
 On board L1 & L2 cache and possibly L3 cache
 Branch prediction
 The processor looks ahead in the instruction code fetched from memory
and predicts which branches, or groups of instructions, are likely to be
processed next
 Data flow analysis
 The processor analyzes which instructions are dependent on each other’s
results, or data, to create an optimized schedule of instructions
 Speculative execution
 Using branch prediction and data flow analysis, some processors
speculatively execute instructions ahead of their actual appearance in
the program execution, holding the results in temporary locations
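
Branch prediction as described above can be made concrete with one simple and widely taught scheme, the two-bit saturating counter. The slide does not say which scheme any particular processor uses, so treat this as an assumed example; the class and function names are hypothetical.

```python
class TwoBitPredictor:
    """States 0-1 predict 'not taken', states 2-3 predict 'taken'."""

    def __init__(self):
        self.state = 2  # start in 'weakly taken' (an arbitrary choice)

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes the predictor guessed correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop-closing branch: taken 9 times, then falls through, repeated 3 times.
loop_branch = ([True] * 9 + [False]) * 3
print(accuracy(loop_branch))
```

The counter needs two wrong guesses in a row before it changes its prediction, so a single loop exit does not disturb the prediction for the next run of the loop.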

Performance Mismatch

 Processor speed has increased
 Memory capacity has increased
 Memory speed lags behind processor speed
 Memory capacity grows very quickly, but memory speed improves much more slowly

Processor and Memory Performance Gap

Solutions

 Increase number of bits retrieved at one time


 Make DRAM “wider” rather than “deeper”
 Change DRAM interface
 Cache and buffers on DRAM
 Reduce frequency of memory access
 More complex cache and cache on chip and off chip
 Increase interconnection bandwidth between
processors and memory
 High speed buses
 Hierarchy of buses

I/O Devices

 Peripherals with intensive I/O demands
 Large data throughput demands
 Processors can handle this throughput
 The problem is moving the data between processor and peripheral
 Solutions:
 Caching
 Buffering
 Higher-speed interconnection buses
 More elaborate bus structures
 Multiple-processor configurations

Typical I/O Device Data Rates

Key is Balance

 Processor components
 Main memory
 I/O devices
 Interconnection structures

Improvements in Chip Organization and Architecture

 Increase hardware speed of processor


 Fundamentally due to shrinking logic gate size
 More gates, packed more tightly, increasing clock rate
 Propagation time for signals reduced
 Increase size and speed of caches
 Dedicating part of processor chip
 Cache access times drop significantly
 Change processor organization and architecture
 Increase effective speed of execution
 Parallelism

Problems with Clock Speed and Logic Density

 Power
 Power density increases with density of logic and clock speed
 Dissipating heat
 RC (Resistor-Capacitor) delay
 Speed at which electrons flow limited by resistance and capacitance of
metal wires connecting them
 Delay increases as RC product increases
 Wire interconnects thinner, increasing resistance
 Wires closer together, increasing capacitance
 Memory latency
 Memory speeds lag processor speeds
 Solution
 More emphasis on organizational and architectural approaches

Microprocessor Trend

[Figure: Growth in processor performance since the late 1970s; vertical axis: performance relative to the VAX-11/780]


Intel Microprocessor Trend

Processor Trend

Increased Cache Capacity

 Typically two or three levels of cache between processor and main memory
 Chip density increased
 More cache memory on chip
 Faster cache access
 Pentium chip devoted about 10% of chip area to cache
 Pentium 4 devotes about 50%

More Complex Execution Logic

 Enable parallel execution of instructions


 Pipeline works like assembly line
 Different stages of execution of different instructions at
same time along pipeline
 Superscalar allows multiple pipelines within single
processor
 Instructions that do not depend on one another can be
executed in parallel
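
The assembly-line picture above translates into a simple cycle count. The sketch assumes an idealized machine: every stage takes one clock cycle, there are no stalls, and in the superscalar case all instructions are independent. The function names are hypothetical.

```python
import math


def sequential_cycles(n_instructions, n_stages):
    """No pipelining: each instruction passes through every stage before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """One pipeline: the first instruction fills it, then one completes per cycle."""
    return n_stages + (n_instructions - 1)


def superscalar_cycles(n_instructions, n_stages, n_pipelines):
    """Several identical pipelines fed with independent instructions (ideal case)."""
    groups = math.ceil(n_instructions / n_pipelines)
    return n_stages + (groups - 1)


n, k = 100, 5
print(sequential_cycles(n, k))        # 500
print(pipelined_cycles(n, k))         # 104
print(superscalar_cycles(n, k, 2))    # 54
```

For long instruction streams the pipelined speedup approaches the number of stages; real programs fall short of this because of dependencies and branches.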

Diminishing Returns

 Internal organization of processors is already very complex
 Can get a great deal of parallelism
 Further significant increases likely to be relatively modest
 Benefits from cache are reaching a limit
 Increasing clock rate runs into the power dissipation problem
 Some fundamental physical limits are being reached

New Approach – Multiple Cores

 Multiple processors on single chip


 Large shared cache
 Within a processor, increase in performance proportional to
square root of increase in complexity
 If software can use multiple processors, doubling number of
processors almost doubles performance
 So, use two simpler processors on the chip rather than one
more complex processor
 With two processors, larger caches are justified
 Power consumption of memory logic less than processing logic
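
The argument for two simpler cores over one more complex core can be put in numbers. The square-root relationship comes from the slide; modelling software that only partly uses both cores with Amdahl's law is an assumption added here, as are the function names and the 95% parallel fraction.

```python
import math


def single_core_speedup(complexity_factor):
    """Speedup from making one core `complexity_factor` times more complex (square-root rule)."""
    return math.sqrt(complexity_factor)


def multicore_speedup(n_cores, parallel_fraction=1.0):
    """Speedup from `n_cores` cores when only `parallel_fraction` of the work can be spread out.

    Amdahl's law, used here as an assumed model of imperfectly parallel software.
    """
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)


# Spend a doubled transistor budget either on one bigger core or on two cores.
print(round(single_core_speedup(2), 2))        # 1.41
print(round(multicore_speedup(2, 0.95), 2))    # 1.9
```

With these assumptions the second core wins clearly, but the advantage shrinks as the parallel fraction drops, which is why the slide adds "if software can use multiple processors".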

Intel Evolution – Earlier Models

 4004
 It was a 4-bit microprocessor
 It was world’s First Microprocessor
 It addressed 4,096 4-bit wide memory locations
 It instruction set contained only 45 instructions
 Its speed was 50 KIPs
 This was slow when compared to the 100,000 instructions
per second by the 30-ton ENIAC computer in 1946. The
main difference was that the 4004 weighted much less
than one ounce
 8080
 first general purpose microprocessor
 8 bit data path
 Used in first personal computer – Altair

Intel Evolution – x86

 8086
 5MHz – 29,000 transistors
 much more powerful
 16 bit
 instruction cache, prefetch few instructions
 8088 (8 bit external bus) used in first IBM PC
 80286
 16 Mbyte memory addressable
 up from 1Mb
 80386
 32 bit
 Support for multitasking
 80486
 sophisticated powerful cache and instruction pipelining
 built in maths co-processor

Intel Evolution - Pentium

 Pentium
 Superscalar
 Multiple instructions executed in parallel
 Pentium Pro
 Increased superscalar organization
 Aggressive register renaming
 branch prediction
 data flow analysis
 speculative execution
 Pentium II
 MMX technology
 graphics, video & audio processing
 Pentium III
 Additional floating point instructions for 3D graphics
 Pentium 4
 Note Arabic rather than Roman numerals
 Further floating point and multimedia enhancements

Intel Evolution - Core

 Core
 First x86 with dual core
 Core 2
 64 bit architecture
 Core 2 Quad
 3GHz – 820 million transistors
 Four processors on chip
 Core i3, i5, i7
 Two to four processors on chip
 Seven generations
 Nehalem
 Sandy Bridge
 Ivy Bridge
 Haswell
 Broadwell
 Skylake
 Kaby Lake

Intel Evolution

 x86 architecture dominant outside embedded systems
 Organization and technology changed dramatically
 Instruction set architecture evolved with backwards compatibility
 ~1 instruction per month added
 500 instructions available
 See Intel web pages for detailed information on processors

ARM Evolution

 Designed by Acorn Computers, Cambridge, England, in the 1980s; now developed by ARM Ltd
 Licensed to manufacturers
 High speed, small die, low power consumption
 PDAs, hand held games, phones
 E.g. iPod, iPhone
 Acorn produced ARM1 & ARM2 in 1985 and ARM3 in 1989
 Acorn, VLSI and Apple Computer founded ARM Ltd in 1990
 As of 2013, the most widely used 32-bit instruction set architecture in terms of quantity produced
 In 2011 alone, producers of chips based on ARM architectures reported shipments of 7.9 billion ARM-based processors, representing
 95% of smartphones
 90% of hard disk drives
 40% of digital televisions and set-top boxes
 15% of microcontrollers
 20% of mobile computers

ARM Evolution

Family | Notable Features | Cache | Typical MIPS @ MHz
ARM1 | 32-bit RISC | None | -
ARM2 | Multiply and swap instructions; integrated memory management unit, graphics and I/O processor | None | 7 MIPS @ 12 MHz
ARM3 | First use of processor cache | 4 KB unified | 12 MIPS @ 25 MHz
ARM6 | First to support 32-bit addresses; floating-point unit | 4 KB unified | 28 MIPS @ 33 MHz
ARM7 | Integrated SoC | 8 KB unified | 60 MIPS @ 60 MHz
ARM8 | 5-stage pipeline; static branch prediction | 8 KB unified | 84 MIPS @ 72 MHz
ARM9 | - | 16 KB/16 KB | 300 MIPS @ 300 MHz
ARM9E | Enhanced DSP instructions | 16 KB/16 KB | 220 MIPS @ 200 MHz
ARM10E | 6-stage pipeline | 32 KB/32 KB | -
ARM11 | 9-stage pipeline | Variable | 740 MIPS @ 665 MHz
Cortex | 13-stage superscalar pipeline | Variable | 2000 MIPS @ 1 GHz
XScale | Applications processor; 7-stage pipeline | 32 KB/32 KB L1, 512 KB L2 | 1000 MIPS @ 1.25 GHz

ARM Systems Categories

 Embedded
 ARM Cortex Embedded Processors (Cortex-M)
 Embedded real time
 ARM Cortex Real-time Embedded Processors (Cortex-R)
 Application platform
 ARM Cortex Application Processors (Cortex-A)
 Linux, Palm OS, Symbian OS, Windows mobile, Android
 Secure applications
 ARM Specialist Processors (SecurCore)

ARM® Cortex®-A Portfolio

as of Q4 2016 (© ARM 2016)

 High performance
 Cortex-A15 (ARMv7-A): high performance with infrastructure feature set
 Cortex-A17 (ARMv7-A): high performance with lower power and smaller area relative to Cortex-A15
 Cortex-A57 (ARMv8-A): proven high performance, 64/32-bit
 Cortex-A72 (ARMv8-A): 2016 premium; mobile, infrastructure & auto, 64/32-bit
 Cortex-A73 (ARMv8-A): 2017 premium performance; mobile, consumer, 64/32-bit
 High efficiency
 Cortex-A8 (ARMv7-A): first ARMv7-A processor
 Cortex-A9 (ARMv7-A): well-established, mid-range processor used in many markets
 Cortex-A53 (ARMv8-A): balanced performance and efficiency, 64/32-bit
 Ultra high efficiency
 Cortex-A5 (ARMv7-A): smallest and lowest power ARMv7-A CPU, optimized for single-core
 Cortex-A7 (ARMv7-A): most efficient ARMv7-A CPU, higher performance than Cortex-A5
 Cortex-A32 (ARMv8-A): smallest and lowest power ARMv8-A, 32-bit
 Cortex-A35 (ARMv8-A): highest efficiency, 64/32-bit

ARM® Cortex®-R Portfolio

as of Q4 2016 (© ARM 2016)

 Storage & modem
 Cortex-R7 (ARMv7-R): high performance; 4G modem and storage
 Cortex-R8 (ARMv7-R): highest performance; 5G modem and storage
 Functional safety
 Cortex-R4 (ARMv7-R): real-time performance
 Cortex-R5 (ARMv7-R): real-time performance with functional safety
 Cortex-R52 (ARMv8-R): most advanced processor for functional safety

ARM® Cortex®-M and SecurCore® Portfolio

as of Q4 2016 (© ARM 2016)

 Performance efficiency
 Cortex-M3: performance efficiency
 Cortex-M4: mainstream control and DSP
 Cortex-M7: maximum performance, control and DSP
 Cortex-M33 (ARMv8-M): flexibility, control and DSP with TrustZone
 Lowest power & area
 Cortex-M0: lowest cost, low power (available via DesignStart)
 Cortex-M0+: highest energy efficiency
 Cortex-M23 (ARMv8-M): TrustZone in smallest area, lowest power
 SecurCore
 SC000: optimized area, anti-tampering
 SC300: performance, anti-tampering

Embedded Systems ARM

 ARM evolved from RISC design


 Used mainly in embedded systems
 Used within product
 Not general purpose computer
 Dedicated function
 E.g. Anti-lock brakes in car

Embedded Systems Requirements

 Different sizes
 Different constraints, optimization, reuse
 Different requirements
 Safety, reliability, real-time, flexibility, legislation
 Lifespan
 Environmental conditions
 Static vs. dynamic loads
 Slow to fast speeds
 Computation vs. I/O intensive
 Discrete event vs. continuous dynamics

Possible Organization of an Embedded System

Benchmarks

 Programs designed to test performance


 Written in high level language
 Portable
 Represents style of task
 Systems, numerical, commercial
 Easily measured
 Widely distributed
 E.g. Standard Performance Evaluation Corporation (SPEC)
 CPU2006 for computation bound
 17 floating point programs in C, C++, Fortran
 12 integer programs in C, C++
 3 million lines of code
 Speed and rate metrics
 Single task and throughput
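
The speed and rate metrics can be illustrated with a small calculation. This is a hedged sketch of the general approach (a per-benchmark ratio against a reference machine, combined with a geometric mean), not SPEC's official tooling; the benchmark names and timings below are invented for illustration.

```python
import math


def speed_ratio(reference_seconds, measured_seconds):
    """Single-task metric: how many times faster than the reference machine."""
    return reference_seconds / measured_seconds


def rate_ratio(copies, reference_seconds, elapsed_seconds):
    """Throughput metric: several copies of the benchmark run at the same time."""
    return copies * reference_seconds / elapsed_seconds


def geometric_mean(values):
    """Overall score: the n-th root of the product of n ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical results: benchmark name -> (reference time, measured time) in seconds.
results = {
    "bench_a": (1000.0, 250.0),
    "bench_b": (900.0, 100.0),
}
ratios = [speed_ratio(ref, run) for ref, run in results.values()]
print(ratios)                             # [4.0, 9.0]
print(round(geometric_mean(ratios), 6))   # 6.0
```

A geometric mean is used rather than an arithmetic mean so that the ranking of two machines does not depend on which machine is chosen as the reference.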