
Computer Architecture 4304


Fall 2021
Dr. Jeanne K. Pitz

About me
 Dr. Jeanne Pitz, PhD from SMU, 1989
 TI Fellow, retired after 30 years as a circuit designer
 Last 5 years: automotive sensors with embedded custom processors
 Started teaching part time at UTD after retiring in 2013
 Prior to retiring, taught part time in the evenings, 1983-1996
 First in the CS department: taught C, Pascal, and Operating Systems
 Then in the EE department: Electrical Networks, Electronics, Computer Architecture
 Took a break while traveling a lot for TI, 1996-2013
 Office in ECS 4.312 (hallway to the EE office, next to the Open Lab)
 Email: jxp133430@utdallas.edu
 Office hours: T, Th 1-2 pm, or email to set up a remote Teams meeting


Schedule, Fall 2021
week  Mon     Wed     topics
1     23-Aug  25-Aug  Intro, prerequisites, logistics, outline of course, performance analysis
2     30-Aug  1-Sep   Instruction set architecture, what's inside, program execution
3     6-Sep   8-Sep   Instructions, addressing for computers, memory vs. registers, memory organization
4     13-Sep  15-Sep  Instruction execution, data ops, data transfer, sequencing instructions, ifs and loops, signed binary numbers
5     20-Sep  22-Sep  Exam 1 A / Exam 1 B
6     27-Sep  29-Sep  Large constants, sign extension, addressing modes summary, immediates, procedure calls, recursion, stack
7     4-Oct   6-Oct   Processor; muxes, demuxes, encoders, decoders; memory; combinational and sequential logic; counters; speed; single-cycle design
8     11-Oct  13-Oct  Single cycle: fetch, creating the datapath, register file connected to the ALU, datapath options
9     18-Oct  20-Oct  Control signals for the datapath, decoding instructions, simple single-cycle datapath, processor pipelining
10    25-Oct  27-Oct  Exam 2 A / Exam 2 B
11    1-Nov   3-Nov   Control unit design, pipelining for performance, pipeline control, pipeline hazards, MIPS pipeline, strategies for speed and eliminating hazards
12    8-Nov   10-Nov  Structural, stalling, branch, and data hazards; forwarding; branch prediction, dynamic branch prediction; memory strategies, large/fast, caching
13    15-Nov  17-Nov  Memory hierarchy, caching, caching continued, caching methods
14    22-Nov  24-Nov  Thanksgiving break
15    29-Nov  1-Dec   Exam 3 A / Exam 3 B
16    6-Dec           Last day of class
Comprehensive final exam in the scheduled final exam slot.

Computer Architecture Topics


 CPU performance analysis
 Instruction set design
 illustrated by the MIPS instruction set architecture and others
 Systems-level view of computer arithmetic
 Design of the data path
 Control for a simple processor
 Pipelining
 Hierarchical memory
 I/O systems
 I/O performance analysis
 Multiprocessing


Prerequisite: 3320 Digital Circuits


 Topics from prerequisites:
 Combinational logic circuits
 Basic logic gates
 Building blocks like multiplexers and ROMs (see the short mux sketch after this list)
 Latches and flip-flops
 Synchronous state machines
 State minimization
 State assignment
 Datapath components:
 Adders
 Multipliers
 Registers
 Shifters
 Counters
 Electrical properties of logic gates
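As a quick refresher on one of these building blocks, here is a minimal sketch of a 2-to-1 multiplexer in Python (an illustration only, not part of the course materials), built from the basic AND/OR/NOT operations on single bits:

    # Minimal refresher sketch: a 2-to-1 multiplexer built from AND, OR,
    # and NOT, operating on single bits (0 or 1).
    def mux2(sel, a, b):
        """Return a when sel == 0, b when sel == 1."""
        return (a & (1 - sel)) | (b & sel)

    # Exhaustive check against the expected truth table.
    for sel in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                assert mux2(sel, a, b) == (b if sel else a)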

Classes of Computing Applications

 Personal computers (PCs)
 Single user
 Low cost
 Execute third-party software
 Servers
 Larger computers
 Larger workloads
 Many small jobs
 Widest range of cost and capability


Classes of Computing Applications, cont'd


 Embedded computers
 Largest class
 Widest range of applications
 From cars, TVs, and airplanes to cargo ships
 Have the most stringent requirements:
 max performance, low cost, low power

Post-PC era
 Personal mobile devices (PMDs)
 Battery operated
 Wireless connectivity
 Cost $100 or more
 Users can download apps
 No longer have a keyboard
 Smartphone or tablet
 Cloud computing
 Giant data centers


Find out what type of computer you own:


 Dell Inspiron 13
 5000 series
 Windows 10
 Under Settings -> About
 Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz 2.70 GHz
 8.00 GB (7.87 GB usable)
 64-bit operating system, x64-based processor
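As an alternative to clicking through Settings -> About, a short Python sketch like the one below (standard library only; the exact output strings vary by machine and operating system) reports similar information:

    # Quick sketch: query basic machine information from Python's
    # standard library, as an alternative to Settings -> About.
    import platform

    print("System:   ", platform.system(), platform.release())  # e.g. Windows 10
    print("Machine:  ", platform.machine())                     # e.g. AMD64 (x64)
    print("Processor:", platform.processor())                   # CPU identification string
    print("Pointer:  ", platform.architecture()[0])             # e.g. 64bit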

Intel(R) Core(TM) i5-7200U CPU


 Lithography: 14 nm
 # of Cores: 2
 Processor Base Frequency: 2.50 GHz
 Max Turbo Frequency: 3.10 GHz
 Cache: 3 MB Intel® Smart Cache
 Bus Speed: 4 GT/s
 Intel® Turbo Boost Technology 2.0 Frequency: 3.1 GHz
 Max Memory Size (dependent on memory type): 32 GB
 Memory Types: DDR4-2133, LPDDR3-1866, DDR3L-1600
 Max # of Memory Channels: 2
 Max Memory Bandwidth: 34.1 GB/s
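The 34.1 GB/s peak bandwidth figure is consistent with the fastest listed memory type: DDR4-2133 performs 2133 million transfers per second, each transfer is 8 bytes wide, and there are 2 channels. A quick check of that arithmetic in Python:

    # Check the peak-bandwidth figure from the spec sheet:
    # DDR4-2133 = 2133 MT/s, 8 bytes per transfer, 2 memory channels.
    transfers_per_sec = 2133e6
    bytes_per_transfer = 8
    channels = 2

    peak_bw = transfers_per_sec * bytes_per_transfer * channels  # bytes/second
    print(f"{peak_bw / 1e9:.1f} GB/s")   # -> 34.1 GB/s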


Moore’s law
• “Cramming more components onto integrated circuits.”
  - G.E. Moore, Electronics, 1965
– Observation: DRAM transistor density doubles annually
• Became known as “Moore’s Law”
• Actually, a bit off:
– Density doubles every 18 months (now more like 24)
– (in 1965 they only had 4 data points!)
– Corollaries:
• Cost per transistor halves annually (now more like every 18 months)
• Power per transistor decreases with scaling
• Speed increases with scaling
• Memory capacity doubles every 18-24 months
– Of course, it depends on how small you try to make things
  (i.e., no exponential lasts forever)
Remember these!
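To make the doubling-period corollary concrete, the sketch below (illustrative only; the starting count of 2,300 transistors is roughly an early-1970s microprocessor) projects how transistor counts grow under 18-month versus 24-month doubling:

    # Sketch: project transistor counts under Moore's-law-style doubling
    # every 18 vs. 24 months, starting from a hypothetical baseline.
    def projected_count(start_count, years, doubling_months):
        return start_count * 2 ** (years * 12 / doubling_months)

    start = 2_300   # roughly an early-1970s microprocessor (illustrative baseline)
    for years in (10, 20, 30):
        n18 = projected_count(start, years, 18)
        n24 = projected_count(start, years, 24)
        print(f"after {years:2d} years: {n18:.3g} (18-month) vs {n24:.3g} (24-month)")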

Computer Advances 1950s - Present Day

 1951 Vacuum tubes
 1965 Transistor
 1975 Integrated circuit: 900
 1995 Very large-scale integrated circuit (VLSI): 2.4 × 10^6 = 2,400,000
 2013 Ultra large-scale integrated circuit (ULSI): 2.5 × 10^11 = 250,000,000,000
 (The numbers are relative performance per unit cost.)


What we will cover:

 Instruction Set Architecture


 Arithmetic for Computers
 The Processor
 Exploiting Memory Hierarchy
 Parallel Processors from Client to Cloud


What is computer architecture?


 A modern example: big.LITTLE



Why is ARM doing this?


How can they do this?



What is computer architecture?


 ARM
 MIPS

ARM Architecture



ARM Block Diagram


What is this big.LITTLE thing?


Example MIPS architecture

 MIPS developed at Stanford University in the early 1980s
 Dr. John Hennessy started the development
 RISC vs. CISC
 Studied compilers
 Smaller, simpler instruction set
 Each instruction ran in a single clock cycle
 Processor used a technique called pipelining
 32-bit registers, 32-bit word
 111 instructions

MIPS architecture, cont'd


 Example format: add $r12, $r7, $r8  (destination $r12, sources $r7 and $r8: $r12 = $r7 + $r8)
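As a minimal illustration of what this instruction does (a sketch only, ignoring instruction encoding and the rest of the datapath), the destination register simply receives the sum of the two source registers:

    # Minimal sketch of what add $r12, $r7, $r8 does: destination $r12
    # receives the sum of sources $r7 and $r8. The register numbers are
    # just the example's; real MIPS code usually uses names like $t0, $s0.
    regs = [0] * 32          # 32 general-purpose 32-bit registers
    regs[7], regs[8] = 5, 9  # sample source values

    def add(rd, rs, rt):
        regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF  # keep result to 32 bits

    add(12, 7, 8)
    print(regs[12])          # -> 14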

 21 arithmetic instructions (+, -, *, /, %)
 8 logic instructions (&, |, ~)
 8 bit-manipulation instructions
 12 comparison instructions (>, <, =, >=, <=, ≠)
 25 branch/jump instructions
 15 load instructions
 10 store instructions
 8 move instructions
 4 miscellaneous instructions


This is computer architecture


Basic Metrics


Response Time and Throughput

• Throughput: work per unit time
– = (1 / latency) when there is NO OVERLAP
– > (1 / latency) when there is overlap
• in real processors there is always overlap
– good metric for a fixed amount of time (maximize work)
(Figure: each job takes 10 time units, but with overlap one job finishes every time unit.)
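A quick worked example of the two cases, using the figure's numbers (each job has a latency of 10 time units):

    # Worked example of throughput vs. latency, using the figure's numbers:
    # each job takes 10 time units (latency = 10).
    latency = 10
    jobs = 100

    # No overlap: jobs run back to back.
    time_serial = jobs * latency
    print(jobs / time_serial)          # 0.1 jobs per time unit  (= 1 / latency)

    # Full overlap (ideal pipeline): after the first job fills the pipeline,
    # one job finishes every time unit.
    time_overlapped = latency + (jobs - 1)
    print(jobs / time_overlapped)      # ~0.92 jobs per time unit (> 1 / latency)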

Instructions: Language of the Computer


 Operands of the Computer Hardware
 Signed and Unsigned Numbers
 Representing Instructions in the Computer
 Logical Operations
 Instructions for Making Decisions
 Supporting Procedures in Computer Hardware
 Communicating with People
 MIPS Addressing for 32-Bit Immediates and Addresses
 Parallelism and Instructions: Synchronization
 Translating and Starting a Program
 A C Sort Example to Put It All Together
 Arrays versus Pointers


Arithmetic for Computers

 Addition and Subtraction


 Multiplication
 Division
 Floating Point
 Parallelism and Computer Arithmetic: Subword Parallelism
 Streaming SIMD Extensions and Advanced Vector Extensions in x86
 Going Faster: Subword Parallelism and Matrix Multiply

The Processor
 Logic Design Conventions
 Building a Datapath
 A Simple Implementation Scheme
 An Overview of Pipelining
 Pipelined Datapath and Control
 Data Hazards: Forwarding versus Stalling
 Control Hazards
 Exceptions
 Parallelism via Instructions
 The ARM Cortex-A8 and Intel Core i7 Pipelines
 Going Faster: Instruction-Level Parallelism and Matrix Multiply
 Advanced Topic: Using a Hardware Design Language


Exploiting Memory Hierarchy


 Memory Technologies
 The Basics of Cache
 Measuring and Improving Cache Performance
 Dependable Memory Hierarchy
 Virtual Machines
 Virtual Memory
 Using a Finite-State Machine to Control a Simple Cache
 Cache Coherence
 Parallelism and Memory Hierarchies: Redundant Arrays of Inexpensive Disks
 Implementing Cache Controllers
 The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies
 Cache Blocking and Matrix Multiply

Parallel Processors from Client to Cloud


 SISD, MIMD, SIMD, SPMD, and Vector
 Hardware Multithreading
 Multicore and Other Shared Memory Multiprocessors
 Introduction to Graphics Processing Units
 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors
 Introduction to Multiprocessor Network Topologies
 Communicating to the Outside World: Cluster Networking
 Multiprocessor Benchmarks and Performance Models
 Benchmarking Intel Core i7 versus NVIDIA Tesla GPU
 Multiple Processors and Matrix Multiply
