
Unit-1

Fundamentals – Computer Architecture and Technology Trends
By
Dr. Sarvesh Vishwakarma
Professor – CSE
TCS 704 - Advanced Computer Architecture

Copyright © 2012, Elsevier Inc. All rights reserved. 1


Why do we need computer architecture?



Sequential Machine



Sequential Machine
 Problem with Von Neumann Model
 Speed of Information Exchange between
Memory and Processor
 Execution Rate of Information

 Solution
 Use of Cache memory
 Use of Pipelining Concepts: Overlapping the
execution of instructions



Sequential Machine
 Restrictions
 Cache memory speed is limited by technology

[Figure: memory hierarchy from CPU through cache and memory to data storage; speed increases toward the CPU, size increases toward data storage]



Sequential Machine
 Pipelining is useful only in some cases
 Its benefit is limited by pipelining hazards

 Data Hazards
 Branch Hazards

 Resource Hazards



Sequential Machine
 Don’t go for Sequential Machine
architecture
 Use Parallel Architectures

 Flynn’s Taxonomy
 SISD single instruction single data stream
 SIMD single instruction multiple data stream
 MISD multiple instruction single data stream
 MIMD Multiple instruction multiple data stream



Classes of Computers
Flynn’s Taxonomy
 Single instruction stream, single data stream (SISD)

 Single instruction stream, multiple data streams (SIMD)


 Vector architectures
 Multimedia extensions
 Graphics processor units

 Multiple instruction streams, single data stream (MISD)


 No commercial implementation

 Multiple instruction streams, multiple data streams (MIMD)
 Tightly-coupled MIMD
 Loosely-coupled MIMD



Introduction
Single Processor Performance
Move to multi-processor

RISC



Classes of Computer
1. Mainframe
2. Minicomputer
3. Supercomputer
4. Desktop computer
5. Server
6. Embedded computer



Classes of Computer
 Mainframe
 Year 1960
 Costly
 Large size
 Multi-user
 Application: Data Processing & Scientific
Computing (Bank, Government, Corporate)
 Response 100 sec. for million user



Classes of Computer
 Minicomputer
 Year 1970
 Costlier
 Small size
 Multi-user
 Application: Data Processing & Scientific
Computing (Scientific Laboratory)
 Time-sharing
 Supercomputer
 Year 1970



Classes of Computer
 Desktop Computer
 Year 1980
 Less costly
 Small size
 Feature: Microprocessor
 Two Classes:
 Personal computer, also called microcomputer: an alternative to the time-sharing minicomputer, flexible enough to meet a wide range of end-user needs
 Workstation: a single-user machine that contains special hardware



Classes of Computer
 Server
 Year 1980
 Costlier
 Size
 Dedicated to provide large scale services
 Reliable
 Long-term file storage and access
 Large memory
 More computing power



Classes of Computer
 Server
 Characteristics
 Availability
 Operate 7 days a week 24 hours a day

 Scalability

 Servers often grow in response to the growing demand for their services
 Long-term file storage and access

 Large memory

 More computing power

 Efficient Throughput



Classes of Computer
 Server
 Cost of Downtime
Application                   Cost of downtime     Annual losses (millions of $) with downtime of
                              per hour             1%              0.5%            0.1%
                              (thousands of $)     (87.6 hrs/yr)   (43.8 hrs/yr)   (8.8 hrs/yr)
Brokerage operations          6450                 565             283             56.5
Credit card authorization     2600                 228             114             22.8
Package shipping services     150                  13              6.6             1.3
Home shopping channel         113                  9.9             4.9             1.0
Catalog sales center          90                   7.9             3.9             0.8
Airline reservation center    89                   7.9             3.9             0.8
Cellular service activation   41                   3.6             1.8             0.4
Online network fees           25                   2.2             1.1             0.2
ATM service fees              14                   1.2             0.6             0.1



Classes of Computer
 Personal Digital Assistant
 Year 1990
 First hand held computing devices
 High performance digital consumer electronics
video game, set-top box
 Embedded Computer
 Year 2000
 Handle particular task
 Reduce the size and product cost
 Cellphone, digital watches, mp3 player, factory
controller
Classes of Computer
 Comparison of three computing classes
and their system characteristics
Feature                           Desktop                Server                  Embedded
Price of system                   $1000-$10,000          $10,000-$10,000,000     $10-$100,000
Price of microprocessor module    $100-$1000             $200-$2000              $0.20-$200
Microprocessors sold per year     150,000,000            4,000,000               300,000,000
(estimates for 2000)
Critical system design issues     Price-performance,     Throughput,             Price, power consumption,
                                  graphics performance   availability,           application-specific
                                                         scalability             performance



Classes of Computer
 Summary of some of the most important functional requirements an
architect faces



Instruction Set Architecture
 Class of ISA
 80x86
 MIPS
 Memory Addressing
 MIPS requires aligned addressing
 80x86 does not require aligned addressing
 Addressing Modes
 MIPS
 Register addressing mode
 Immediate addressing mode
 Displacement addressing mode



Instruction Set Architecture
 Types and Size of Operands
 80x86/MIPS: support operand size
 8 bit (ASCII); 16 bit (Unicode character);
 32 bit (integer); 64 bit (long integer) ;
 80 bit extended double precision (80x86 only)
 IEEE 754 floating point:
 32 bit single precision
 64 bit double precision
 Operations
 Data transfer
 Arithmetic logical
 Control
 Floating point



Instruction Set Architecture
 Control Flow Instructions
 80x86/MIPS: support
 Conditional branch
 Unconditional Jump
 Procedure call
 Returns
 80x86:
 JE; JNE
 MIPS
 BEQ; BNE
 Encoding an ISA
 Fixed length (MIPS: 32 bits)
 Variable length (80x86: 1 to 18 bytes)



Integrated Circuits: Fueling Innovation

 Chips begin with silicon, found in sand
 Silicon does not conduct electricity well and is thus called a semiconductor
 A special chemical process can transform tiny areas of silicon into:
 Excellent conductors of electricity (like copper)
 Excellent insulators from electricity (like glass)
 Areas that can conduct or insulate (a switch)
 A transistor is simply an on/off switch controlled by electricity
 An integrated circuit combines dozens to hundreds of transistors on a chip
Trends in Cost
 Cost of an Integrated Circuit

Objective: derive a formula for the cost of an integrated circuit.

$$\text{IC\_Cost} = \frac{\text{Die\_Cost} + \text{Testing\_Cost} + \text{Packing\_Cost} + \text{Final\_Test\_Cost}}{\text{Final\_Test\_Yield}}$$



Microelectronics Process

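[Figure: manufacturing flow. The ingot is cut into slices (wafers), which go through 20-30 processing steps; the wafers are tested and diced, and the dies are packaged, tested again, and shipped]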

 Silicon ingots:
 6-12 inches in diameter and about 12-24 inches long
 Impurities in the wafer can lead to defective devices and reduce the yield

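Average cutting edge for non-square pieces

[Figure: square die of side s at the wafer boundary with an irregular cutting edge; r_d is the average cutting length, taken as the diagonal of the die]

$$\left(\frac{r_d}{2}\right)^2 = \left(\frac{s}{2}\right)^2 + \left(\frac{s}{2}\right)^2 = \frac{s^2}{4} + \frac{s^2}{4} = \frac{2s^2}{4}$$

$$r_d^2 = 2s^2 = 2 \times \text{Die\_Area} \quad\Rightarrow\quad r_d = \sqrt{2 \times \text{Die\_Area}}$$

$$\text{Total no. of non-square pieces} = \frac{\pi \times \text{Wafer\_Diameter}}{\text{average\_cutting\_length}} = \frac{\pi \times \text{Wafer\_Diameter}}{r_d} = \frac{\pi \times \text{Wafer\_Diameter}}{\sqrt{2 \times \text{Die\_Area}}}$$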
Integrated Circuits Costs

Trends in Cost
 Factors that influence the Cost of
Computer
1. Time

2. Volume

3. Commodification



What Affects Cost?
1. Learning curve:
 The more experience in manufacturing a component, the better
the yield
 In general, a chip, board or system with twice the yield will have
half the cost.
 The learning curve is different for different components,
complicating design decisions
2. Volume
 Larger volume increases rate of learning curve
 Doubling the volume typically reduces cost by 10%
3. Commodities
 Are essentially identical products sold by multiple vendors in
large volumes
 Fuel competition, which drives efficiency higher and thus cost down

Intel Motherboard Components

Computer
Components

Trends in Cost
 Cost driven down by learning curve
 Yield

 DRAM: price closely tracks cost

 Microprocessors: price depends on volume
 10% less for each doubling of volume



Trends in Cost
Integrated Circuit Cost
 Integrated circuit

 Bose-Einstein formula:

 Defects per unit area = 0.016-0.057 defects per square cm (2010)


 N = process-complexity factor = 11.5-15.5 (40 nm, 2010)
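
The yield parameters above can be tried out numerically. The following is a minimal sketch, assuming the Bose-Einstein yield form Die yield = Wafer yield × 1 / (1 + Defects per unit area × Die area)^N and a cost per good die of Wafer cost / (Dies per wafer × Die yield). The function names, the 100% wafer yield, and the sample inputs (0.031 defects per cm², N = 13.5) are illustrative assumptions chosen inside the ranges quoted above; they are not values from the slide.

```python
def die_yield(defects_per_cm2, die_area_cm2, n, wafer_yield=1.0):
    """Bose-Einstein yield model: wafer_yield / (1 + D * A) ** N."""
    return wafer_yield / (1.0 + defects_per_cm2 * die_area_cm2) ** n


def cost_per_good_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Spread the wafer cost over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


if __name__ == "__main__":
    # Hypothetical inputs, picked from within the 2010 ranges on the slide.
    defects, n = 0.031, 13.5
    for area in (1.0, 2.25):
        y = die_yield(defects, area, n)
        print(f"die area {area:.2f} cm^2: yield {y:.2f}")
```

Because the die area sits inside a term raised to the power N, yield drops quickly as dies get larger, which is why the cost per good die grows faster than the die area.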



Dependability
• Service accomplishment: service is delivered as specified
• Service interruption: delivered service differs from the service level agreement
• Module reliability: measure of continuous service accomplishment from a reference initial instant
• MTTF: mean time to failure
  The reciprocal of MTTF is the rate of failures, usually reported as failures per 1 billion hours of operation (FIT)
  Rate of failure = 1 / MTTF
• MTBF: mean time between failures; a reliability measure for repairable systems, but commonly used for both repairable and non-repairable systems
  If used for a repairable system: MTBF = MTTR + MTTF
• FIT: number of expected failures per one billion hours of operation for a device

MTTF Dependability
[Figure: timeline from a reference instant through t0 ... t6, alternating between operating properly and repair, with the first, second, third, and fourth failures marked]

Mean time between failures = t2 - t0
                           = (t2 - t1) + (t1 - t0)
                           = operating + repair
                           = MTTF + MTTR

MTBF = MTTF + MTTR

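Simple failure model
Non-repairable component
[Figure: uptime/operation from the reference instant until failure, followed by downtime; the uptime interval is the MTTF]

Renewal failure model
Repairable component
[Figure: alternating uptime/operation and downtime intervals; each downtime lasts MTTR, and one uptime plus one downtime spans the MTBF]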

Dependability
 Module reliability
 Mean time to failure (MTTF)
 Mean time to repair (MTTR)
 Mean time between failures (MTBF) = MTTF + MTTR
 Module Availability
$$\text{Availability} = \frac{\text{MTTF}}{\text{MTBF}} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}}$$
 Module availability:- measure of the service
accomplishment with respect to the alternation between
the two states of accomplishment and interruption.
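
The relations MTBF = MTTF + MTTR and Availability = MTTF / (MTTF + MTTR) can be sketched as follows. This is an illustrative sketch only; the function names and the sample figures (a 1,000,000-hour MTTF and a 24-hour MTTR) are assumptions made for the example.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures for a repairable module."""
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    """Fraction of time the module is in the service-accomplishment state."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)


if __name__ == "__main__":
    # Hypothetical module: fails once per million hours, repaired in a day.
    a = availability(1_000_000, 24)
    print(f"availability = {a:.6f}")
    # Expected downtime per year follows directly from the unavailability.
    print(f"downtime per year = {(1 - a) * 8760:.2f} hours")
```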

Real World Examples

From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15

Performance Metrics
 Response (execution) time:
 The time between the start and the completion of a task
 Measures user perception of the system speed
 Common in reactive and time critical systems, single-user computer, etc.
 Throughput:
 The total number of tasks done in a given time
 Most relevant to batch processing (billing, credit card processing)
 Mainly used for input/output systems (disk access, printer, etc.)

Introduction
Computer Technology
 Performance improvements:
 Improvements in semiconductor technology
 Feature size, clock speed
 Improvements in computer architectures
 Enabled by HLL compilers, UNIX
 Lead to RISC architectures

 Together have enabled:


 Lightweight computers
 Productivity-based managed/interpreted
programming languages



Introduction
Current Trends in Architecture
 Cannot continue to leverage Instruction-Level
parallelism (ILP)
 Single processor performance improvement ended in
2003

 New models for performance:


 Data-level parallelism (DLP)
 Thread-level parallelism (TLP)
 Request-level parallelism (RLP)

 These require explicit restructuring of the application

Classes of Computers
 Personal Mobile Device (PMD)
 e.g. smart phones, tablet computers
 Emphasis on energy efficiency and real-time
 Desktop Computing
 Emphasis on price-performance
 Servers
 Emphasis on availability, scalability, throughput
 Clusters / Warehouse Scale Computers
 Used for “Software as a Service (SaaS)”
 Emphasis on availability and price-performance
 Sub-class: Supercomputers, emphasis: floating-point
performance and fast internal networks
 Embedded Computers
 Emphasis: price

Defining Computer Architecture
 “Old” view of computer architecture:
 Instruction Set Architecture (ISA) design
 i.e. decisions regarding:
 registers, memory addressing, addressing modes,
instruction operands, available operations, control flow
instructions, instruction encoding

 “Real” computer architecture:


 Specific requirements of the target machine
 Design to maximize performance within constraints:
cost, power, and availability
 Includes ISA, microarchitecture, hardware

Trends in Technology
 Integrated circuit logic technology
 Transistor density: 35%/year
 Die size: 10-20%/year
 Integration overall: 40-55%/year

 Semiconductor DRAM capacity: 25-40%/year (slowing)


 Flash capacity: 50-60%/year
 15-20X cheaper/bit than DRAM

 Magnetic disk technology: 40%/year


 15-25X cheaper/bit than Flash
 300-500X cheaper/bit than DRAM

 Network technology: depends on performance of switches and performance of transmission system



Trends in Technology
Bandwidth and Latency
 Bandwidth or throughput
 Total work done in a given time
 10,000-25,000X improvement for processors
 300-1200X improvement for memory and disks

 Latency or response time


 Time between start and completion of an event
 30-80X improvement for processors
 6-8X improvement for memory and disks



Trends in Technology
Scaling of Transistor Performance and Wires

 Feature size
 Minimum size of transistor or wire in x or y
dimension
 10 microns in 1971 to .032 microns in 2011
 Transistor performance scales linearly
 Wire delay does not improve with feature size!
 Integration density scales quadratically



Trends in Technology
Performance trends: Bandwidth over Latency

Log-log plot of bandwidth and latency milestones

Trends in Power and Energy
 Problem: Get power in, get power out

 Thermal Design Power (TDP)


 Characterizes sustained power consumption
 Used as target for power supply and cooling system
 Lower than peak power, higher than average power
consumption

 Clock rate can be reduced dynamically to limit power consumption

 Energy per task is often a better measurement


Trends in Power and Energy
Dynamic Energy and Power
 Dynamic energy
 Transistor switch from 0 -> 1 or 1 -> 0
 ½ x Capacitive load x Voltage²

 Dynamic power
 ½ x Capacitive load x Voltage² x Frequency switched

 Reducing clock rate reduces power, not energy
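
The last point can be checked with a small sketch of the two formulas above. The function names and the sample values (a 1 nF capacitive load, 1.0 V, 1 GHz switching frequency, and a task of one billion transitions) are assumptions made for illustration, not figures from the slide.

```python
def dynamic_energy(cap_load_farads, voltage):
    """Energy of one 0->1 or 1->0 transition: 1/2 * C * V^2 (joules)."""
    return 0.5 * cap_load_farads * voltage ** 2


def dynamic_power(cap_load_farads, voltage, freq_hz):
    """Power when switching at freq_hz: 1/2 * C * V^2 * f (watts)."""
    return dynamic_energy(cap_load_farads, voltage) * freq_hz


if __name__ == "__main__":
    cap, volts, transitions = 1e-9, 1.0, 1_000_000_000  # hypothetical task
    for freq in (1e9, 0.5e9):
        power = dynamic_power(cap, volts, freq)
        seconds = transitions / freq           # slower clock, longer run
        energy = power * seconds               # same total energy either way
        print(f"{freq / 1e9:.1f} GHz: {power:.2f} W, {energy:.2f} J")
```

Halving the frequency halves the power but doubles the run time, so the energy for the task is unchanged. Lowering the voltage is what reduces energy, because it enters the formula squared.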



Trends in Power and Energy
Power
 Intel 80386 consumed ~ 2 W
 3.3 GHz Intel Core i7 consumes 130 W
 Heat must be dissipated from 1.5 x 1.5 cm chip
 This is the limit of what can be cooled by air



Trends in Power and Energy
Reducing Power
 Techniques for reducing power:
 Do nothing well
 Dynamic Voltage-Frequency Scaling
 Low power state for DRAM, disks
 Overclocking, turning off cores



Trends in Power and Energy
Static Power
 Static power consumption
 Current_static x Voltage
 Scales with number of transistors
 To reduce: power gating

Measuring Performance
 Typical performance metrics:
 Response time
 Throughput

 Speedup of X relative to Y
 Execution time_Y / Execution time_X

 Execution time
 Wall clock time: includes all system overheads
 CPU time: only computation time

 Benchmarks
 Kernels (e.g. matrix multiply)
 Toy programs (e.g. sorting)
 Synthetic benchmarks (e.g. Dhrystone)
 Benchmark suites (e.g. SPEC06fp, TPC-C)



Principles
Principles of Computer Design
 Take Advantage of Parallelism
 e.g. multiple processors, disks, memory banks,
pipelining, multiple functional units

 Principle of Locality
 Reuse of data and instructions

 Focus on the Common Case


 Amdahl’s Law



Principles
Principles of Computer Design
 The Processor Performance Equation



Principles
Principles of Computer Design
 Different instruction types having different
CPIs



A 400-MHz processor was used to execute a
benchmark program with the following instruction
mix and clock cycle counts:

Instruction type       Instruction count    Clock cycle count
Integer arithmetic     450000               1
Data transfer          320000               2
Floating point         150000               2
Control transfer       80000                2

Determine the effective CPI, MIPS rate, and execution time for this program.
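
One way to work this exercise is sketched below, assuming the usual definitions: effective CPI = total clock cycles / total instruction count, MIPS rate = clock rate / (CPI × 10^6), and execution time = total clock cycles / clock rate. The variable and function names are illustrative assumptions.

```python
CLOCK_HZ = 400e6  # 400-MHz processor from the exercise

# instruction type -> (instruction count, clock cycles per instruction)
INSTRUCTION_MIX = {
    "Integer arithmetic": (450_000, 1),
    "Data transfer": (320_000, 2),
    "Floating point": (150_000, 2),
    "Control transfer": (80_000, 2),
}


def totals(mix):
    """Return (total instructions, total clock cycles) for the mix."""
    instructions = sum(count for count, _ in mix.values())
    cycles = sum(count * cpi for count, cpi in mix.values())
    return instructions, cycles


def effective_cpi(mix):
    instructions, cycles = totals(mix)
    return cycles / instructions


def mips_rate(clock_hz, cpi):
    return clock_hz / (cpi * 1e6)


def execution_time(mix, clock_hz):
    _, cycles = totals(mix)
    return cycles / clock_hz


if __name__ == "__main__":
    cpi = effective_cpi(INSTRUCTION_MIX)
    print(f"CPI = {cpi:.2f}")
    print(f"MIPS rate = {mips_rate(CLOCK_HZ, cpi):.2f}")
    print(f"Execution time = {execution_time(INSTRUCTION_MIX, CLOCK_HZ) * 1e3:.3f} ms")
```

With these inputs the program has 1,000,000 instructions and 1,550,000 clock cycles, giving a CPI of 1.55, about 258 MIPS, and an execution time of 3.875 ms.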

Question: Suppose that we want to enhance the
processor used for web serving. The new
processor is 10 times faster on computation in
the web serving application than the original
processor. Assuming that the original
processor is busy with computation 40% of the
time and is waiting I/O 60% of the time, what is
the overall speedup gained by incorporating
the enhancement?
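
This question is a direct application of Amdahl's Law. A minimal sketch follows, assuming the standard form Speedup = 1 / ((1 - f) + f / s), where f is the fraction of time that benefits from the enhancement and s is the speedup of that fraction; the function name is an illustrative assumption.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is improved."""
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / speedup_enhanced)


if __name__ == "__main__":
    # Computation is 40% of the time and becomes 10 times faster;
    # the 60% spent waiting for I/O is unchanged.
    print(f"overall speedup = {amdahl_speedup(0.4, 10):.4f}")
```

The result is 1 / (0.6 + 0.04) ≈ 1.56. Even an infinitely fast processor could not push the speedup past 1 / 0.6 ≈ 1.67, because the I/O wait is untouched.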

Questions

Q1. Find the number of dies per 300 mm (30 cm) wafer for a die that is 1.5 cm on a side.

Q2. Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a side, assuming a defect density of 0.4 per cm² and α = 4.
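
Both questions can be worked with a short sketch. It assumes the dies-per-wafer estimate Dies per wafer = π × (Wafer diameter / 2)² / Die area − π × Wafer diameter / √(2 × Die area), and, because Q2 supplies α, the yield model Die yield = Wafer yield × (1 + Defect density × Die area / α)^(−α) with an assumed wafer yield of 100%. The function names are illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Wafer area over die area, minus the partial dies lost at the edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield_alpha(defect_density, die_area_cm2, alpha, wafer_yield=1.0):
    """Yield model parameterised by alpha (assumed form, see lead-in)."""
    return wafer_yield * (1.0 + defect_density * die_area_cm2 / alpha) ** (-alpha)


if __name__ == "__main__":
    # Q1: 30 cm wafer, die 1.5 cm on a side (area 2.25 cm^2).
    print(f"dies per wafer = {dies_per_wafer(30.0, 1.5 * 1.5):.1f}")
    # Q2: defect density 0.4 per cm^2, alpha = 4.
    for side in (1.5, 1.0):
        y = die_yield_alpha(0.4, side * side, 4)
        print(f"side {side} cm: die yield = {y:.2f}")
```

Under these assumptions Q1 gives roughly 270 dies per wafer, and Q2 gives yields of about 0.44 for the 1.5 cm die and 0.68 for the 1.0 cm die.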

f1 = 500 MHz               f2 = 2.5 GHz
T1 = 12x seconds           T2 = x seconds
MIPS Rate1 = 100 MIPS      MIPS Rate2 = 1800 MIPS
CPI1 = ?                   CPI2 = ?
Ic = ?                     Ic = ?
Throughput1 = ?            Throughput2 = ?



Ex1. The execution times (in seconds) of four programs on
three computers are given below. Assume that 10^9 instructions
were executed in each of the four programs. Calculate the
MIPS rating of each program on each of the three machines.
Based on these ratings, can you draw a clear conclusion
regarding the relative performance of the three computers?
Give reasons if you find a way to rank them statistically.

             Execution time (in seconds)
Program      Computer A    Computer B    Computer C
Program 1    1             10            20
Program 2    1000          100           20
Program 3    500           1000          50
Program 4    100           800           100
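
A sketch of one way to approach Ex1 is given below. It assumes MIPS = Instruction count / (Execution time × 10^6) and compares two candidate summary statistics, the arithmetic mean and the harmonic mean of the MIPS ratings; the choice of those two statistics and all names in the code are assumptions made for illustration.

```python
INSTRUCTIONS = 1e9  # instructions executed by each program

# Execution times in seconds for programs 1..4 on each computer.
EXEC_TIMES = {
    "A": [1, 1000, 500, 100],
    "B": [10, 100, 1000, 800],
    "C": [20, 20, 50, 100],
}


def mips(instructions, seconds):
    return instructions / (seconds * 1e6)


def arithmetic_mean(values):
    return sum(values) / len(values)


def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)


RATINGS = {
    name: [mips(INSTRUCTIONS, t) for t in times]
    for name, times in EXEC_TIMES.items()
}

if __name__ == "__main__":
    for name, ratings in RATINGS.items():
        print(
            f"Computer {name}: MIPS = {ratings}, "
            f"arithmetic mean = {arithmetic_mean(ratings):.2f}, "
            f"harmonic mean = {harmonic_mean(ratings):.2f}, "
            f"total time = {sum(EXEC_TIMES[name])} s"
        )
```

The two means disagree: the arithmetic mean of the MIPS ratings favours Computer A, while the harmonic mean favours Computer C. Since every program executes the same number of instructions, the harmonic mean is inversely proportional to total execution time, so it ranks the machines the same way total time does (C, then A, then B).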



Q1. Assume a disk subsystem with the following components and MTTF
10 disks, each rated at 1,000,000-hour MTTF
1 SCSI controller, 500,000-hour MTTF

1 power supply, 200,000-hour MTTF

1 fan, 200,000-hour MTTF

1 SCSI cable, 1,000,000-hour MTTF

Using the simplifying assumptions that the lifetimes are exponentially distributed and that
failures are independent, compute the MTTF of the system as a whole.
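$$\text{Failure rate}_{\text{system}} = 10 \times \frac{1}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000} + \frac{1}{200{,}000} + \frac{1}{1{,}000{,}000}$$

$$= \frac{10 + 2 + 5 + 5 + 1}{1{,}000{,}000 \text{ hours}} = \frac{23}{1{,}000{,}000 \text{ hours}} = \frac{23 \times 1000}{1{,}000{,}000{,}000 \text{ hours}} = \frac{23{,}000}{1 \text{ billion hours}}$$

or 23,000 FIT.

$$\text{MTTF}_{\text{system}} = \frac{1}{\text{Failure rate}_{\text{system}}} = \frac{1{,}000{,}000{,}000 \text{ hours}}{23{,}000} \approx 43{,}500 \text{ hours}$$

1 year = 365 × 24 hours = 8760 hours

$$\text{Therefore, } \text{MTTF}_{\text{system}} = \frac{43{,}500}{8760} \approx 4.97 \text{ years}$$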
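
The same calculation can be sketched in code. Under the stated assumptions (exponentially distributed lifetimes, independent failures), the system failure rate is the sum of the component failure rates. The function names and the data layout are illustrative assumptions.

```python
HOURS_PER_YEAR = 365 * 24

# (number of units, MTTF of one unit in hours)
COMPONENTS = [
    (10, 1_000_000),  # disks
    (1, 500_000),     # SCSI controller
    (1, 200_000),     # power supply
    (1, 200_000),     # fan
    (1, 1_000_000),   # SCSI cable
]


def system_failure_rate(components):
    """Failures per hour: sum of count / MTTF over all components."""
    return sum(count / mttf for count, mttf in components)


def to_fit(failures_per_hour):
    """Failures per one billion hours of operation."""
    return failures_per_hour * 1e9


if __name__ == "__main__":
    rate = system_failure_rate(COMPONENTS)
    mttf_hours = 1.0 / rate
    print(f"failure rate = {to_fit(rate):.0f} FIT")
    print(f"MTTF = {mttf_hours:.0f} hours = {mttf_hours / HOURS_PER_YEAR:.2f} years")
```

Without rounding the intermediate result, the MTTF comes out near 43,478 hours, just under five years.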
Q2. Availability is the most important consideration
for designing servers, followed closely by scalability
and throughput.
(a) We have a single processor with a failures-in-time (FIT) rating of
100. What is the mean time to failure (MTTF) for this system?
(b)If it takes 1 day to get the system running again, what is the
availability of the system?
(c)Imagine that the government, to cut costs, is going to build a
supercomputer out of inexpensive computers rather than
expensive, reliable computers. What is the MTTF for a system
with 1000 processors? Assume that if one fails, they all fail.

a)
$$\text{MTTF}_{\text{system}} = \frac{1}{\text{Failure rate}_{\text{system}}} = \frac{1{,}000{,}000{,}000 \text{ hours}}{100} = 10^7 \text{ hours}$$



b) MTTR = 24 hours, MTTF = 10^7 hours

$$\text{System availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}} = \frac{10^7}{10^7 + 24} = \frac{10{,}000{,}000}{10{,}000{,}024} \approx 0.999998 \approx 1$$

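c)
$$\text{Failure rate}_{\text{system}} = 1000 \times \frac{1}{10^7} = \frac{1000 \times 100}{10^7 \times 100} = \frac{100{,}000}{10^9 \text{ hours}} = \frac{100{,}000}{1 \text{ billion hours}}$$

or 100,000 FIT.

$$\text{MTTF}_{\text{system}} = \frac{1{,}000{,}000{,}000}{100{,}000} = 10{,}000 \text{ hours}$$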
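
The three parts of Q2 can be checked with a brief sketch. It assumes FIT means failures per 10^9 hours, so MTTF = 10^9 / FIT, and that for part (c) the FIT values of the 1000 processors simply add because any single failure brings the system down. The function names are illustrative assumptions.

```python
BILLION_HOURS = 1e9


def mttf_from_fit(fit):
    """MTTF in hours for a failure rate given in FIT."""
    return BILLION_HOURS / fit


def combined_fit(fit_per_unit, units):
    """FIT of a system that fails whenever any one unit fails."""
    return fit_per_unit * units


if __name__ == "__main__":
    mttf_single = mttf_from_fit(100)                       # part (a)
    availability = mttf_single / (mttf_single + 24)        # part (b), 1-day repair
    mttf_cluster = mttf_from_fit(combined_fit(100, 1000))  # part (c)
    print(f"(a) MTTF = {mttf_single:.0f} hours")
    print(f"(b) availability = {availability:.7f}")
    print(f"(c) MTTF with 1000 processors = {mttf_cluster:.0f} hours")
```

Part (c) shows the cost of scale: a single processor would be expected to run for more than a thousand years between failures, but the 1000-processor system fails about every 10,000 hours, a little over a year.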
