Module1 CA PDF Final
Introduction
• The program stored in the memory determines the processing steps. The
computer translates a source program into an object program, i.e. into
machine language.
• Finally, the results are sent to the outside world through an output device.
• Addresses are numbers that identify memory locations. The number of bits in
each word is called the word length of the computer. Programs must reside in
the memory during execution.
• Instructions and data can be written into the memory or read out under the
control of processor.
• Memory in which any location can be reached in a short and fixed amount of
time after specifying its address is called random-access memory (RAM).
• The time required to access one word is called the memory access time.
• Memory that can only be read, and whose contents cannot be altered by the
user, is called read-only memory (ROM). It typically holds part of the
operating system.
Secondary memory
• It is used where large amounts of data & programs must be stored,
particularly information that is accessed infrequently.
• Examples:
Magnetic disks and tapes, optical disks (e.g. CD-ROMs), floppy disks, etc.
Arithmetic logic unit (ALU)
• Most computer operations, such as addition, subtraction, multiplication,
and division, are executed in the ALU of the processor. The operands are
brought into the ALU from memory and stored in high-speed storage
elements called registers.
• The actual timing signals that govern the transfer of data between
input unit, processor, memory and output unit are generated by the
control unit.
Hardware – Software Interface
[Figure: layers between the user and the hardware. Application software (programs the
user writes and runs, e.g. +, C=5, -) sits above systems software (operating system,
compiler, assembler), which sits above the hardware (ADD A B, LOAD, SUB, ...).]
Instruction Set Architecture (ISA)
✓A set of assembly language instructions (ISA) provides a
link between software and hardware. Ex.: C = a + b
✓Given an instruction set, software programmers and
hardware engineers work more or less independently.
✓ISA is designed to extract the most performance out of
the available hardware technology.
✓Defines data transfer modes between registers, memory
and I/O
✓Types of ISA: RISC, CISC, VLIW, Superscalar
✓Examples:
• IBM370/X86/Pentium/K6 (CISC), PowerPC (Superscalar)
• Alpha (Superscalar), MIPS (RISC and Superscalar)
• Sparc (RISC), UltraSparc (Superscalar)
RISC: REDUCED INSTRUCTION SET COMPUTERS
Historical Background
IBM System/360, 1964
- The real beginning of modern computer architecture
- Distinction between Architecture and Implementation
- Architecture: The abstract structure of a computer
seen by an assembly-language programmer
[Figure: High-Level Language → (Compiler) → Instruction Set → (-program) → Hardware.
The instruction set is labeled Architecture; the hardware side is labeled Implementation.]
COMPLEX INSTRUCTION SET COMPUTERS: CISC
• Most common microprocessor designs, such as the Intel 80x86 and Motorola 68K series, followed
the CISC philosophy.
• Changes in software and hardware technology have since forced a re-examination of CISC,
and many modern CISC processors are hybrids that implement many RISC principles.
• CISC was developed to make compiler development simpler. It shifts most of the burden of
generating machine instructions to the processor. For example, instead of requiring the
compiler to emit a long sequence of machine instructions to calculate a square root, a CISC
processor would have a built-in ability to do this.
CHARACTERISTICS OF RISC
RISC Characteristics
- Relatively few instructions
- Relatively few addressing modes
- Memory access limited to load and store instructions
- All operations done within the registers of the CPU
- Fixed-length, easily decoded instruction format
- Single-cycle instruction execution
- Hardwired rather than microprogrammed control
The main characteristics of CISC microprocessors are:
• Extensive instruction set.
• Complex and efficient machine instructions.
• Microencoding of the machine instructions.
• Extensive addressing capabilities for memory operations.
• Relatively few registers.
In comparison, RISC processors are more or less the opposite of the above:
• Reduced instruction set.
• Less complex, simple instructions.
• Hardwired control unit and machine instructions.
• Few addressing schemes for memory operands, with only two basic memory-access
instructions, LOAD and STORE.
• Many symmetric registers, organised into a register file.
Performance
✓Processor time to execute a program depends on the
hardware involved in the execution of individual machine
instructions.
[Figure: main memory, cache memory, and processor connected by a bus.]
⚫ The processor and a relatively small cache
memory can be fabricated on a single
integrated circuit chip.
⚫ Speed
⚫ Cost
⚫ Memory management
Processor Clock
✓ Clock, clock cycle, and clock rate
✓ The execution of each instruction is divided
into several steps, each of which completes
in one clock cycle.
✓ Hertz – cycles per second
Basic Performance Equation
⚫ T – processor time required to execute a program that has been
prepared in high-level language
⚫ N – number of actual machine language instructions needed to
complete the execution (note: loop)
⚫ S – average number of basic steps needed to execute one
machine instruction. Each step completes in one clock cycle
⚫ R – clock rate
⚫ Note: N, S, and R are not independent of each other.
$$T = \frac{N \times S}{R}$$
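As a minimal sketch, the basic performance equation can be evaluated directly. The function name `processor_time` and the workload figures are hypothetical and chosen only for illustration; T, N, S, and R are the symbols defined above.

```python
def processor_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Return the processor time T = (N * S) / R in seconds."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return n_instructions * steps_per_instruction / clock_rate_hz


# Hypothetical workload: N = 1 million instructions, S = 4 steps, R = 500 MHz.
baseline = processor_time(1_000_000, 4, 500e6)

# Doubling R while N and S stay fixed halves T. In practice the three
# values are not independent, so this is an idealized comparison.
faster_clock = processor_time(1_000_000, 4, 1000e6)

print(f"baseline T = {baseline * 1e3:.1f} ms")
print(f"with doubled clock rate T = {faster_clock * 1e3:.1f} ms")
```

The sketch treats N, S, and R as free inputs, which is convenient for what-if comparisons but ignores the coupling between them.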
Clock Rate
⚫ Increase clock rate
➢ Improve the integrated-circuit (IC) technology to make
the circuits faster
➢ Reduce the amount of processing done in one basic step
(however, this may increase the number of basic steps
needed)
⚫ Increases in R that are entirely caused by
improvements in IC technology affect all
aspects of the processor’s operation equally
except the time to access the main memory.
Compiler
⚫ A compiler translates a high-level language program
into a sequence of machine instructions.
⚫ To reduce N, we need a suitable machine instruction
set and a compiler that makes good use of it.
⚫ Goal – reduce N×S
⚫ A compiler may not be designed for a specific
processor; however, a high-quality compiler is
usually designed for, and with, a specific processor.
Performance Measurement
⚫ T is difficult to compute.
⚫ Measure computer performance using benchmark programs.
⚫ System Performance Evaluation Corporation (SPEC) selects and publishes
representative application programs for different application domains, together
with test results for many commercially available computers.
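A minimal sketch of how benchmark results are commonly summarized. It assumes the usual SPEC-style convention: each program's rating is the running time on a reference computer divided by the running time on the computer under test, and the overall rating is the geometric mean of the per-program ratings. The function names and sample timings are hypothetical.

```python
import math


def spec_rating(reference_time, measured_time):
    """Rating for one benchmark program (assumed: reference / measured)."""
    return reference_time / measured_time


def overall_rating(ratings):
    """Geometric mean of the per-program ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return math.prod(ratings) ** (1.0 / len(ratings))


# Hypothetical timings in seconds: (reference computer, computer under test).
timings = [(100.0, 50.0), (240.0, 30.0)]
ratings = [spec_rating(ref, measured) for ref, measured in timings]
print(ratings, overall_rating(ratings))
```

A rating above 1 means the computer under test ran that program faster than the reference machine.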
Amdahl's Law
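Amdahl's law states that the performance improvement to be gained by using
a faster mode of execution is limited by the fraction of time the faster
mode can be used. Using this law, the performance gain that can be
obtained by improving some portion of the computer can be calculated
using the following formula, where f is the fraction of execution time that
can use the faster mode and s is the speedup of that mode:
$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - f) + \dfrac{f}{s}}$$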
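A minimal sketch of Amdahl's law in its usual form. The function name and the sample numbers are hypothetical: `fraction_enhanced` is the fraction of the original execution time that can use the faster mode, and `speedup_enhanced` is how much faster that mode is.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - f) + f / s)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical case: 40% of the time can use a mode that is 10 times faster.
print(amdahl_speedup(0.4, 10))

# Even an extremely fast mode cannot push the overall gain past 1 / (1 - f).
print(amdahl_speedup(0.4, 1e9))
```

The second call shows the limit the law describes: with 40% of the time enhanced, the overall speedup stays below 1 / 0.6.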
• To add two numbers in registers R1 and R2 and to place their sum in register
R3
Three-Address Instructions:
Program to evaluate X = (A + B) * (C + D) :
ADD R1, A, B /* R1 ← M[A] + M[B] */
ADD R2, C, D /* R2 ← M[C] + M[D] */
MUL X, R1, R2 /* M[X] ← R1 * R2 */
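The register transfers in the comments of this program can be traced with a small sketch. It assumes memory is modeled as a dictionary keyed by symbolic address; the function name and the sample operand values are hypothetical.

```python
def run_three_address(memory):
    """Trace the three-instruction program for X = (A + B) * (C + D)."""
    regs = {}
    regs["R1"] = memory["A"] + memory["B"]  # ADD R1, A, B
    regs["R2"] = memory["C"] + memory["D"]  # ADD R2, C, D
    memory["X"] = regs["R1"] * regs["R2"]   # MUL X, R1, R2
    return memory["X"]


# Hypothetical operand values stored at symbolic addresses A, B, C, D.
memory = {"A": 2, "B": 3, "C": 4, "D": 5}
print(run_three_address(memory))
```

Each instruction names both of its sources and its destination, which is why three instructions are enough for the whole expression.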
Two-Address Instructions:
Program to evaluate X = (A + B) * (C + D) :
• Register and Absolute modes: the processor registers are used as temporary storage
locations, and the data in a register are accessed using the Register mode. The
Absolute mode can represent global variables in a program. A declaration such as
Integer A, B
• Immediate mode: address and data constants can be represented in
assembly language using the Immediate mode. The operand is given
explicitly in the instruction.
• Example
• Move #200, R0 (places the value 200 in register R0)
Addressing-mode example: address field = 500, XR = 100, result loaded into AC
Memory contents (address: value): 399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300

Addressing Mode      Effective Address   Operation               Content of AC
Direct address       500                 /* AC ← (500) */        800
Immediate operand    -                   /* AC ← 500 */          500
Indirect address     800                 /* AC ← ((500)) */      300
Relative address     702                 /* AC ← (PC+500) */     325
Indexed address      600                 /* AC ← (XR+500) */     900
Register             -                   /* AC ← R1 */           400
Register indirect    400                 /* AC ← (R1) */         700
Autoincrement        400                 /* AC ← (R1)+ */        700
Autodecrement        399                 /* AC ← -(R1) */        450
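The addressing-mode example above can be checked with a short sketch. Two values are assumptions inferred from the rows rather than stated outright: R1 = 400 (from the Register row) and PC = 202 (from the relative address 702 = PC + 500). The function name `load` and the mode strings are hypothetical.

```python
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
ADDRESS_FIELD = 500
XR = 100
R1 = 400  # assumed: inferred from the Register row
PC = 202  # assumed: inferred from relative address 702 = PC + 500


def load(mode):
    """Return (effective_address, value loaded into AC).

    The effective address is None when the operand does not come from memory.
    Register updates for autoincrement/autodecrement are not modeled here.
    """
    if mode == "immediate":
        return None, ADDRESS_FIELD
    if mode == "register":
        return None, R1
    if mode == "direct":
        ea = ADDRESS_FIELD
    elif mode == "indirect":
        ea = MEMORY[ADDRESS_FIELD]
    elif mode == "relative":
        ea = PC + ADDRESS_FIELD
    elif mode == "indexed":
        ea = XR + ADDRESS_FIELD
    elif mode in ("register_indirect", "autoincrement"):
        ea = R1
    elif mode == "autodecrement":
        ea = R1 - 1
    else:
        raise ValueError(f"unknown addressing mode: {mode}")
    return ea, MEMORY[ea]


for mode in ("direct", "immediate", "indirect", "relative", "indexed",
             "register", "register_indirect", "autoincrement", "autodecrement"):
    print(mode, load(mode))
```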
DATA TRANSFER INSTRUCTIONS
Typical Data Transfer Instructions
Name Mnemonic
Load LD
Store ST
Move MOV
Exchange XCH
Input IN
Output OUT
Push PUSH
Pop POP
Data Transfer Instructions with Different Addressing Modes
Mode                 Assembly Convention   Register Transfer
Direct address       LD ADR                AC ← M[ADR]
Indirect address     LD @ADR               AC ← M[M[ADR]]
Relative address     LD $ADR               AC ← M[PC + ADR]
Immediate operand    LD #NBR               AC ← NBR
Index addressing     LD ADR(X)             AC ← M[ADR + XR]
Register             LD R1                 AC ← R1
Register indirect    LD (R1)               AC ← M[R1]
Autoincrement        LD (R1)+              AC ← M[R1], R1 ← R1 + 1
Autodecrement        LD -(R1)              R1 ← R1 - 1, AC ← M[R1]
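The autoincrement and autodecrement rows differ from the others because they also change R1, and the order of the two steps matters. A minimal sketch, assuming registers and memory are plain dictionaries (the function names are hypothetical; the memory values reuse addresses 399 and 400 from the addressing-mode example):

```python
def load_autoincrement(memory, regs, reg):
    """LD (reg)+ : AC <- M[reg], then reg <- reg + 1."""
    regs["AC"] = memory[regs[reg]]
    regs[reg] += 1


def load_autodecrement(memory, regs, reg):
    """LD -(reg) : reg <- reg - 1, then AC <- M[reg]."""
    regs[reg] -= 1
    regs["AC"] = memory[regs[reg]]


memory = {399: 450, 400: 700}

regs = {"R1": 400, "AC": 0}
load_autoincrement(memory, regs, "R1")
print(regs)  # AC holds M[400]; R1 has moved on to 401

regs = {"R1": 400, "AC": 0}
load_autodecrement(memory, regs, "R1")
print(regs)  # R1 stepped back to 399 first; AC holds M[399]
```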
DATA MANIPULATION INSTRUCTIONS
Three Basic Types: Arithmetic instructions
Logical and bit manipulation instructions
Shift instructions
Arithmetic Instructions
Name Mnemonic
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with Borrow SUBB
Negate (2's Complement) NEG
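Add with Carry exists so that numbers wider than the ALU can be added piece by piece. The sketch below assumes an 8-bit ALU adding 16-bit values; the width and the helper names `add8` and `add16` are illustrative assumptions, not part of the instruction list above.

```python
MASK = 0xFF  # assumed 8-bit ALU width


def add8(a, b, carry_in=0):
    """Add two 8-bit values plus a carry-in; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & MASK, total >> 8


def add16(x, y):
    """Add two 16-bit values using an ADD followed by an ADDC."""
    low, carry = add8(x & MASK, y & MASK)      # ADD: low bytes, no carry-in
    high, carry = add8(x >> 8, y >> 8, carry)  # ADDC: high bytes plus carry
    return (high << 8) | low, carry


print(add16(0x01FF, 0x0001))  # the carry out of the low byte reaches the high byte
```

Subtract with Borrow plays the same role for multi-word subtraction.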
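[Figure: register set with a common ALU. Registers R1 to R7 share a clock input and
receive 7 load lines from a 3x8 decoder driven by SELD. Two multiplexers, selected by
SELA and SELB, place register contents on the A bus and B bus feeding the ALU, whose
operation is selected by OPR and whose output returns to the registers.]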
OPERATION OF CONTROL UNIT
The control unit directs the information flow through ALU by:
- Selecting various Components in the system
- Selecting the Function of ALU
Example: R1 ← R2 + R3
[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus
Control word fields (width in bits): SELA (3), SELB (3), SELD (3), OPR (5)
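A minimal sketch of packing the four fields into one 14-bit control word. The field order and widths follow the slide. The binary encodings are assumptions for illustration: register Rn is encoded as the number n, and ADD is given the OPR code 00010.

```python
# Field widths from the slide: SELA, SELB, SELD are 3 bits each; OPR is 5 bits.
FIELD_WIDTHS = (("SELA", 3), ("SELB", 3), ("SELD", 3), ("OPR", 5))

OPR_ADD = 0b00010  # assumed encoding for the ADD operation


def control_word(sela, selb, seld, opr):
    """Pack the fields, most significant first, into a 14-bit integer."""
    word = 0
    values = {"SELA": sela, "SELB": selb, "SELD": seld, "OPR": opr}
    for name, width in FIELD_WIDTHS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


# R1 <- R2 + R3: source A is R2, source B is R3, destination is R1.
word = control_word(sela=2, selb=3, seld=1, opr=OPR_ADD)
print(format(word, "014b"))
```

Under these assumed encodings the word reads as the groups 010 011 001 00010.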
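[Figure: register stack with locations 0 to 4 shown. Items A, B, and C occupy
locations 1, 2, and 3, the stack pointer SP points to C, FULL and EMPTY are
one-bit flags, and DR is the data register.]
Push, Pop operations
/* Initially, SP = 0, EMPTY = 1, FULL = 0 */
PUSH                            POP
SP ← SP + 1                     DR ← M[SP]
M[SP] ← DR                      SP ← SP - 1
If (SP = 0) then (FULL ← 1)     If (SP = 0) then (EMPTY ← 1)
EMPTY ← 0                       FULL ← 0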
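The push and pop micro-operations above can be sketched as a small class. It assumes SP wraps around modulo the stack size, which is what makes the test SP = 0 signal a full stack after a push. The class name, the default size of 64 words, and the exceptions raised are assumptions.

```python
class RegisterStack:
    """Register stack with FULL and EMPTY flags; SP wraps modulo size."""

    def __init__(self, size=64):  # size is an assumed value
        self.size = size
        self.mem = [None] * size
        self.sp = 0
        self.empty = True
        self.full = False

    def push(self, dr):
        if self.full:
            raise OverflowError("stack is full")
        self.sp = (self.sp + 1) % self.size  # SP <- SP + 1
        self.mem[self.sp] = dr               # M[SP] <- DR
        if self.sp == 0:                     # wrapped around: last free word used
            self.full = True
        self.empty = False

    def pop(self):
        if self.empty:
            raise IndexError("stack is empty")
        dr = self.mem[self.sp]               # DR <- M[SP]
        self.sp = (self.sp - 1) % self.size  # SP <- SP - 1
        if self.sp == 0:                     # back at the initial position
            self.empty = True
        self.full = False
        return dr


stack = RegisterStack(size=4)
for item in "ABC":
    stack.push(item)
print(stack.sp, stack.pop(), stack.sp)
```

Checking the flags before each operation is left to the caller in the micro-operation listing; the sketch raises an exception instead.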
MEMORY STACK ORGANIZATION
[Figure: memory with program, data, and stack segments. PC points into the program
(instructions) segment, AR into the data (operands) segment, and SP into the stack
segment. Addresses shown: 1000, 3000, and 3997 to 4001. DR is the data register.]
- A portion of memory is used as a stack with a
processor register as a stack pointer
- PUSH: SP ← SP - 1
        M[SP] ← DR
- POP:  DR ← M[SP]
        SP ← SP + 1
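A minimal sketch of the memory stack, which grows toward lower addresses. It assumes SP starts at 4001, the last address in the figure, and models memory as a dictionary. The class name is hypothetical, and no overflow or underflow checks are modeled.

```python
class MemoryStack:
    """Stack kept in main memory; SP decreases on push, increases on pop."""

    def __init__(self, initial_sp=4001):  # assumed starting value of SP
        self.memory = {}
        self.sp = initial_sp

    def push(self, dr):
        self.sp -= 1               # SP <- SP - 1
        self.memory[self.sp] = dr  # M[SP] <- DR

    def pop(self):
        dr = self.memory[self.sp]  # DR <- M[SP]
        self.sp += 1               # SP <- SP + 1
        return dr


stack = MemoryStack()
stack.push(10)
stack.push(20)
print(stack.sp, stack.pop(), stack.sp)
```

Unlike the register stack, there are no FULL and EMPTY flags here, so stack limits would have to be checked separately.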
Problems
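• For the following processor, obtain the processor time T needed to execute the program.
• Clock rate R = 800 MHz
• No. of instructions executed N = 1000
• Average no. of steps needed per machine instruction S = 20
Solution:
• T = (N × S) / R = (1000 × 20) / (800 × 10^6) s = 25 × 10^-6 s = 25 microseconds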
Example 1:
• Registers R1 and R2 of a computer contain the decimal values 1200
and 4600. What is the effective address of the memory operand in
each of the following instructions?
• (a) Load 20(R1),R5
• (b) Move #3000,R5
• (c) Store R5,30(R1,R2)
• (d) Add -(R2),R5
Solution
• (a) EA = [R1] + Offset = 1200 + 20 = 1220
• (b) Immediate mode: the operand 3000 is part of the instruction, so no memory operand address is computed
• (c) EA = [R1] + [R2] + Offset = 1200 + 4600 + 30 = 5830
• (d) EA = [R2] - 1 = 4599
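The arithmetic for the indexed and autodecrement cases can be checked with a short sketch. The helper names are hypothetical, and the decrement step of 1 follows the solution above. Case (b) is an immediate operand and is left out.

```python
R1, R2 = 1200, 4600  # register contents given in the problem


def indexed_ea(offset, *base_registers):
    """Effective address = offset + contents of each base or index register."""
    return offset + sum(base_registers)


def autodecrement_ea(register_value, step=1):
    """Effective address after the register is decremented (step assumed 1)."""
    return register_value - step


print("(a)", indexed_ea(20, R1))
print("(c)", indexed_ea(30, R1, R2))
print("(d)", autodecrement_ea(R2))
```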