Professional Documents
Culture Documents
SEM SEM
IV COSS II
SEM
III
Hardware Data
Software
Reference Books:
(R1) Patterson, David A & J L Hennenssy, Computer Organization and
Design – The Hardware/Software Interface, Elsevier, 5th Ed., 2014.
(R2) Randal E. Bryant, David R. O’Hallaron, Computer Systems – A
Programmer’s Perspective, Pearson, 3rd Ed, 2016.
(R3)Tanenbaum, Modern Operating Systems: Pearson New International
Edition, Pearson Education, 2013 (Pearson Online)
(R4)Stallings, Operating Systems: Internals and Design Principles :
International Edition, Pearson Education, 2013 (Pearson Online)
4
11/9/2020
BITS Pilani, Pilani Campus
Evaluation Scheme
5 unit course.
Sl Evaluation Duration Weightage % Nature of
No. Component Component
1 Mid Sem Exam 90 min 30% CB
2 Comprehensive 180 min 40% OB
Examination
3 Quiz ---- 5% OB
4 Assignments --- 25% OB
5
BITS Pilani, Pilani Campus
Assignments
• Two assignments:
• One pre-midsem exam : 13%
• One post-midsem : 12%
• Lab based
• Simulator to be used : CPU-OS simulator
• Open source tool
• Virtual lab (Platify)
Email : krpruthvi@wilp.bits-pilani.ac.in
• Is a complex system
• Is a programmable device
11
• Hardware
• Central Processing Unit (CPU)
• Memory
• I/O devices
• Software
• System Software
• System Management Software
• Tools and Utilities for Developing the software
• Application Software
• General Purpose Software
• Specific Purposed Software
• Processor - memory
– Data transfer between CPU and main memory
• Processor - I/O
– Data transfer between CPU and I/O module
• Data processing
– Some arithmetic or logical operation on data
• Control
– Alteration of sequence of operations
– e.g. jump
• Combination of above
• Classes interrupts:
• Program
• e.g. overflow, division by zero
• Timer
• Generated by internal processor timer
• Used in pre-emptive multi-tasking
• I/O
• from I/O controller
• Hardware failure
• e.g. memory parity error
▪ We need a mechanism to
✓Load the program into main memory
✓Run the program in processor
✓Store the result in persistent storage and
✓Unload the program to release memory [for the
next program to use]
• Process Management
• Memory Management
• Storage Management
• Protection and security
• Is a sequence of bytes
• I/O device such as disks, keyboards, displays, and even
networks, is modeled as a file
• Reading and writing requires set of system calls
Last Class…
Contact List of Topic Title Text/Ref
Hour Book/external
resource
Performance Assessment Class Slides
3 1.5.1. MIPS Rate
1.5.2 Amdahl’s Law
Memory Organization T1, R2
4 Storage Technologies
Random Access Memory
Disk Storage
Solid State Disks
Storage Technology Trends
Example-
A program needs 20 hours using a single processor core, and a particular part of the
program which takes one hour to execute cannot be parallelized.
The remaining 19 hours (p = 0.95) of execution time can be parallelized, then regardless of
how many processors are devoted to a parallelized execution of this program, the
minimum execution time cannot be less than that critical one hour.
1/(1-p)
Hence, the theoretical speedup is limited to at most 20 times
. For this reason, parallel computing with many processors is useful only for highly
parallelizable programs.
Amdahl’s Law
S=Speedup,
f=fraction of time enhancement,
k=speedup of the faster component
Amdahl’s Law
On a large system CPU upgrade makes it faster by 50% for INR 10,000. A disk
drive upgrade of INR 7000 speeds it up by 150%. Evaluate the speedups?
Processes spend 70% in CPU and 30% waiting Disk drives.
Processor upgrade Disk Drive upgrade
=1.304 =1.219
Each 1% of improvement for the processor costs INR333, and for the disk a
1% improvement costs INR318. “Is cost/performance the most important
metric?”
Memory Organization
BITS Pilani
Pilani Campus
Semiconductor Memory
• Key features
– RAM is traditionally packaged as a chip.
– Basic storage unit is normally a cell (one bit per
cell).
– Multiple RAM chips form a memory.
• RAM comes in two varieties:
– SRAM (Static RAM)
– DRAM (Dynamic RAM)
• SRAM and DRAM are volatile memories
– Lose information if powered off.
Main memory
I/O bridge 0
A
Bus interface x A
ALU
R4
Main
memory
I/O bridge x 0
Bus interface x A
Register file
ALU
R4 x
Main memory
I/O bridge 0
Bus interface x A
ALU
R4 y
Main memory
I/O bridge 0
A
Bus interface A
Main memory reads data word y from the bus and stores
it at address A.
ALU
R4 y
main memory
I/O bridge 0
Bus interface y A
Actuator
Electronics
(including a
processor
and memory!)
SCSI
connector
Cylinder k
Surface 0
Platter 0
Surface 1
Surface 2
Platter 1
Surface 3
Surface 4
Platter 2
Surface 5
Spindle
Tracks
Surface
Track k Gaps
Spindle
Sectors
spindle
spindle
spindle
spindle
Read/write heads
move in unison(simultaneous
Performance)
from cylinder to cylinder
Arm
Spindle
After BLUE
read
After BLUE
read
After BLUE Seek for RED Rotational latency After RED read
read
After BLUE Seek for RED Rotational latency After RED read
read
Important points:
– Access time dominated by seek time and rotational
latency.
– First bit in a sector is the most expensive, the
rest are free.
– SRAM access time is about 4 ns/doubleword,
DRAM about 60 ns
• Disk is about 40,000 times slower than SRAM,
• 2,500 times slower then DRAM.
Register file
ALU
System bus Memory bus
I/O Main
Bus interface
bridge memory
Main
Bus interface
memory
I/O bus
Main
Bus interface
memory
I/O bus
Main
Bus interface
memory
I/O bus
Sequential read tput 550 MB/s Sequential write tput 470 MB/s
Random read tput 365 MB/s Random write tput 303 MB/s
Avg seq read time 50 us Avg seq write time 60 us
• Cache Memories
• Generic Cache Memory Organization
• Direct-Mapped Caches
• Fully Associative Caches
• Registers
– In CPU
• Internal or Main memory
– May include one or more levels of cache
– “RAM”
• External memory
– Backing store
Locality of Reference
During the course of the execution of a
program, memory references tend to
cluster
• Temporal locality: Locality in time
• If an item is referenced, it will tend
to be referenced again soon
• Spatial locality: Locality in space
• If an item is referenced, items
whose addresses are close by will
tend to be referenced soon.
Data :
• Access array elements in succession – spatial locality
• Reference to “product” in each iteration – Temporal locality
Instructions :
• Reference instructions in sequence : Spatial locality
• Looping through : Temporal locality
• 16 Bytes main
memory
– How many
address bits are
required?
• Memory block size
is 4 bytes cache is 2 lines
• Cache of 8 Byte ( 4 bytes per Line)
– How many cache
lines?
Main Memory B0
B1
L0 B2
L1 |
|
L2
|
L3 |
|
L4 B12
B15
Cache Memory B23
B63
• Simple
• Inexpensive
• Fixed location for given block
– If a program accesses 2 blocks that map to the
same line repeatedly, cache misses are very high
Find out
d) Number of bits required to identify a word (byte) in a
block?
2 bits
e) Number of bits required to identify block
22 bits
4/13=30.76%
20 bits
a) If the cache hit ratio is 95%, what is the average
access time for a memory reference?
B1
L0 B2
L1 |
|
L2
|
L3 |
|
L4 B12
B15
Cache Memory B23
B63
20 bits
a) If the cache hit ratio is 95%, what is the average
access time for a memory reference?
B1
L0 B2
L1 |
|
L2
|
L3 |
|
L4 B12
B15
Cache Memory B23
B63
i = j modulo v Set #
0%2
1%2
2%2
3%2
4%2
5%2
6%2
7%2
• Random
BITS Pilani, Pilani Campus
Problem 2
Ref 0 4 0 2 1 8 0 1 2 3 0 4
time 0 1 2 3 4 5 6 7 8 9 10 11
L0
L1
L2
L3
H/M
Ref 0 4 0 2 1 8 0 1 2 3 0 4
L0
L1
L2
L3
H/M
Ref 0 4 0 2 1 8 0 1 2 3 0 4
time 0 1 2 3 4 5 6 7 8 9 10 11
L0
L1
L2
L3
H/M
• Associativity :
• higher associativity ➔ more complex hardware
• Higher Associativity ➔ Lower miss rate
• Higher Associativity ➔ reduces average memory
access time (AMAT)
• Cache Size
• Larger the cache size ➔ Lower miss rate
• Larger the cache size ➔ reduces average memory
access time (AMAT)
• Block Size:
• Smaller blocks do not take maximum advantage of
spatial locality.
BITS Pilani, Pilani Campus
Revisiting Locality of reference
N=8
Address 0 1 2 3 4 5 6 7
Contents v0 v1 v2 v3 v4 v5 v6 V7
Access Order 1 2 3 4 5 6 7 8
BITS Pilani, Pilani Campus
Byte Addressable
Stride k – memory and word
length is 1 byte
reference pattern
i Address i Address
0 0000 0 0000
Stride 1
1 0001 1 0001 Stride 2
2 0002 2 0002
3 0003 3 0003
4 0004 4 0004
Address difference
5 0005 5 0005
Stride=----------------------
6 0006 Word Length 6 0006
7 0007 7 0007
8 0008 8 0008
9 0009 9 0009
10 000A 10 000A
11 000B 11 000B
12 000C 12 000C
BITS Pilani, Pilani Campus
Stride k – reference pattern
Byte Addressable
memory and word
length is 2 bytes
i Address i Address
0 0000 0 0000
Stride 1
1 0002 1 0002 Stride 2
2 0004 2 0004
3 0006 3 0006
4 0008 4 0008
Address difference
5 000A 5 000A
Stride=----------------------
6 000C Word Length 6 000C
7 000E 7 000E
8 0010 8 0010
9 0012 9 0012
10 0014 10 0014
11 0016 11 0016
12 0018 12 0018
BITS Pilani, Pilani Campus
Revisiting Locality of reference
M = 2, N=3
Address 0 1 2 3 4 5
Contents a00 a01 a02 a10 a11 a12
Access Order 1 2 3 4 5 6
M = 2, N=3
Address 0 1 2 3 4 5
Contents a00 a01 a02 a10 a11 a12
Access Order 1 3 5 2 4 6
i=0
i=1
i=2
I=3
BITS Pilani, Pilani Campus
Example 2:
i=0
i=1
i=2
I=3
BITS Pilani, Pilani Campus
Home Work – Which one is
better ?
Program 1: Program 2:
for (int i = 0; i < n; i++) { for (int i = 0; i < n; i++) {
z[i] = x[i] – y[i]; z[i] = x[i] – y[i];
z[i] = z[i] * z[i]; }
} for (int i = 0; i < n; i++) {
z[i] = z[i] * z[i];
}
• 3 addresses
– Result, Operand 1, Operand 2
– c = a + b; add c, a, b
– May be a forth - next instruction (usually implicit)
– Needs very long words to hold everything
• 2 addresses
– One address doubles as operand and result
– a = a + b : add a, b
– Reduces length of instruction
– The original value of a is lost.
3
BITS Pilani, Pilani Campus
MIPS Register Set
Types of Registers
• 32 general-purpose registers ($0 – $31)
• Program Counter (PC)
• Two special Purpose register (HI and LO)
– Used to hold the results of integer multiply and
divide instruction.
– integer multiply operation, HI and LO register hold
the 64-bit result.
– integer divide operation, the 32-bit quotient is
stored in the LO and the remainder in the HI
register.
4
BITS Pilani, Pilani Campus
General-Purpose Register >>
Usage Convention
Register Number Intended usage
name
Zero 0 Constant 0
at 1 Reserved for assembler
v0,v1 2,3 Values for function results and Exp evaluation
a0,a1,a2,a3 4-7 Arguments 1-4
t0-t7 8-15 Temporary (not preserved across call)
s0-s7 16-23 Saved temporary( preserved across call)
t8,t9 24,25 Temporary (not preserved across call)
k0,k1 26,27 Reserved for OS kernel
gp 28 Pointer to Global area
sp 29 Stack Pointer
fp 30 Frame pointer(if needed);Otherwise, a saved register $s8
5
ra 31 Return address(used by a procedure call) BITS Pilani, Pilani Campus
Memory Usage
• A program’s address space
consists of three parts:
• Code/Text segment
• Data segment
• Stack segment
12/17/2020
Lecture
9:032(Lucy
PM J. Gudino) 6
BITS Pilani, Pilani Campus
Instruction Format
➢ Fixed-length instruction format
➢ 32-bits long
➢ Three different instruction formats:
1. Immediate (I-type)
2. Jump (J-type)
3. Register (R-type)
➢ Meaning of various fields in the instruction format
– op: Opcode
– rs: The first register source operand
– rt: The second register source operand
– rd: The register destination operand
– shamt / sa: shift amount
– funct: function code 7
BITS Pilani, Pilani Campus
Instruction Format
R-Type:
I-Type:
J-Type:
8
BITS Pilani, Pilani Campus
Addressing modes
9
BITS Pilani, Pilani Campus
Addressing modes
10
BITS Pilani, Pilani Campus
MIPS Instruction Set
11
BITS Pilani, Pilani Campus
A Basic MIPS Implementation
s2 s1
s2 s1
j : jump unconditional
– format : j label
Summary
• lw $s1, 100($s2) • or $s1, $s2, $s3
Examples:
add, sub, and, or, slt ➔ rs, rt, rd
lw $s1, 100($s2)
sw $s1, 100($s2)
lw $s1, 100($s2)
s2 s1
sw $s1, 100($s2)
s2 s1
R-Type I-Type
12/17/2020
Lecture
9:03
12(Lucy
PM J. Gudino) 37
BITS Pilani, Pilani Campus
Datapath
t = max{ti} + tL
Pipeline processor frequency f = 1/ t
= n*k*t
[k+(n-1)]t
= n*k
[k+(n-1)]
n.k
=
k .k + (n − 1)
n
=
k + (n − 1)
n nf
Hk = = = ηf
k + (n − 1)τ k + (n − 1)
I2 FI DI FO EI WO
I3 FI DI FO EI WO
I4 FI DI FO EI WO
I5 FI DI FO EI WO
I6 FI DI FO EI WO
Time 0 1 2 3 4 5 6 7 8 9
Ref 2 3 6 4 3 2 5 7 6 5
L0
L1
L2
L3
M/H
Time 0 1 2 3 4 5 6 7 8 9
Ref 2 3 6 4 3 2 5 7 6 5
L0
L1
L2
L3
M/H
5ns=3h+(1-h)*33
5ns =3h+33-33h
5ns =-30h+33
30h=28
H=28/30=>93%
t = max{ti} + tL
Pipeline processor frequency f = 1/ t
= n*k*t
[k+(n-1)]t
= n*k
[k+(n-1)]
n.k
=
k .k + (n − 1)
n
=
k + (n − 1)
n nf
Hk = = = ηf
k + (n − 1)τ k + (n − 1)
I2 FI DI FO EI WO
I3 FI DI FO EI WO
I4 FI DI FO EI WO
I5 FI DI FO EI WO
I6 FI DI FO EI WO
Time 0 1 2 3 4 5 6 7 8 9
Ref 2 3 6 4 3 2 5 7 6 5
L0
L1
L2
L3
M/H
Time 0 1 2 3 4 5 6 7 8 9
Ref 2 3 6 4 3 2 5 7 6 5
L0
L1
L2
L3
M/H
5ns=3h+(1-h)*33
5ns =3h+33-33h
5ns =-30h+33
30h=28
H=28/30=>93%
⚫ Performance Assessment
⚫ Execution Time
⚫ CPI
⚫ MIPS
⚫ Disk Assessment
⚫ Access Time, Seek Time, Transfer Time
⚫ Cache Memory
⚫ Hit and Miss Ratio
⚫ Cache Mapping Techniques
⚫ Direct Mapping
⚫ Associative Mapping
⚫ CPU – OS Simulator
• Example :
• Suppose Machine X runs a program in 100 seconds and
Machine Y runs the same program in 125 seconds.
a) Which machine is faster?
b) How many times its is faster?
Instruction Performance
Given Data :
• Clock speed of the processor = 40MHz
• No. of Instructions the executed program consists of =
100000
Given Data:
▪ Number of surfaces = 16
▪ Number of tracks per surface = 128
▪ Number of sectors per track = 256
▪ Number of bytes per sector = 512 bytes
▪ Seek time = 11.5 msec
Main Memory B0
B1
L0 B2
L1 |
|
L2 |
|
L3 |
B28
L4
B29
B31
Solution-1
• Cache Size = 8192 KB (Given)
• Total No. of Cache Lines = Cache size / Line size
= 8192 KB/128B
= (213.210) / 27 = 216 = 64K lines
• Total bits to represent line field
= 16 bits
F E E D F 0 0 D
11111110111011011111000000001101
Tag bits (9) Line bits (16) Block offset (7)
B0
B1
L0
B2
L1
B3
B4
Cache Memory
B5
B6
B7
0011001
B8
B9
B10
B11
B12
B13
0011001-Miss
B14
B15
25 bits
BITS Pilani, Pilani Campus
Problem 8 (Home Work)
Cache of 64kByte, Cache block of 4 bytes and 16 M Bytes main
memory and associative mapping.
for n = 0 to 10
a(n) = n
next
key = 10
found = 0
for n = 0 to 20
temp = a(n)
if temp = key then
found = 1
writeln("Key Found",temp)
break
end if
next
if found <> 1 then
writeln("Key Not Found")
end if
end
BITS Pilani, Pilani Campus
How to Simulate and Run the
program
Simulator Teaching Language (STL)
a) Execute the above program by setting block size to 2, 4, 8,
16 and 32 for cache size = 8, 16 and 32. Record the
observation in the following table.
3
BITS Pilani, Pilani Campus
MIPS Register Set
Types of Registers
• 32 general-purpose registers ($0 – $31)
• Program Counter (PC)
• Two special Purpose register (HI and LO)
– Used to hold the results of integer multiply and
divide instruction.
– integer multiply operation, HI and LO register hold
the 64-bit result.
– integer divide operation, the 32-bit quotient is
stored in the LO and the remainder in the HI
register.
4
BITS Pilani, Pilani Campus
General-Purpose Register >>
Usage Convention
Register Number Intended usage
name
Zero 0 Constant 0
at 1 Reserved for assembler
v0,v1 2,3 Values for function results and Exp evaluation
a0,a1,a2,a3 4-7 Arguments 1-4
t0-t7 8-15 Temporary (not preserved across call)
s0-s7 16-23 Saved temporary( preserved across call)
t8,t9 24,25 Temporary (not preserved across call)
k0,k1 26,27 Reserved for OS kernel
gp 28 Pointer to Global area
sp 29 Stack Pointer
fp 30 Frame pointer(if needed);Otherwise, a saved register $s8
5
ra 31 Return address(used by a procedure call) BITS Pilani, Pilani Campus
Memory Usage
• A program’s address space
consists of three parts:
• Code/Text segment
• Data segment
• Stack segment
12/17/2020
Lecture
9:032(Lucy
PM J. Gudino) 6
BITS Pilani, Pilani Campus
Instruction Format
➢ Fixed-length instruction format
➢ 32-bits long
➢ Three different instruction formats:
1. Immediate (I-type)
2. Jump (J-type)
3. Register (R-type)
➢ Meaning of various fields in the instruction format
– op: Opcode
– rs: The first register source operand
– rt: The second register source operand
– rd: The register destination operand
– shamt / sa: shift amount
– funct: function code 7
BITS Pilani, Pilani Campus
Instruction Format
R-Type:
I-Type:
J-Type:
8
BITS Pilani, Pilani Campus
Addressing modes
9
BITS Pilani, Pilani Campus
Addressing modes
10
BITS Pilani, Pilani Campus
MIPS Instruction Set
11
BITS Pilani, Pilani Campus
A Basic MIPS Implementation
s2 s1
s2 s1
j : jump unconditional
– format : j label
Summary
• lw $s1, 100($s2) • or $s1, $s2, $s3
Examples:
add, sub, and, or, slt ➔ rs, rt, rd
lw $s1, 100($s2)
sw $s1, 100($s2)
lw $s1, 100($s2)
s2 s1
sw $s1, 100($s2)
s2 s1
R-Type I-Type
12/17/2020
Lecture
9:03
12(Lucy
PM J. Gudino) 37
BITS Pilani, Pilani Campus
Datapath