Professional Documents
Culture Documents
Muhamed Mudawar
Computer Engineering Department
King Fahd University of Petroleum and Minerals
Outline of this Presentation
Shared Memory Multiprocessor Organizations
Cache Coherence Problem
Cache Coherence through Bus Snooping
2-state Write-Through Invalidation Protocol
$ $
Interleaved
Cache
Interconnection network
Interleaved
Mem Mem
Main memory
P1 Pn P1 Pn
$ $ $ $
Mem Mem
Crossbar switch
(Interleaved)
Hit time increases Main memory
2. Writes to the same location X are serialized; two writes to same location
X by any two processors are seen in the same order by all processors
Two properties
Write propagation: writes become visible to other processors
Write serialization: writes are seen in the same order by all processors
Cache-memory
Mem I/O devices
transaction
$ $
Multiple simultaneous readers of Bus
block, but write invalidates them
Mem I/O devices
u :5 u :5 u = 7
I/O devices
1
2
u :5
u=7
Memory
States
Invalid, Valid (clean), Modified (dirty) M
Bus Transactions
Bus Read (BusRd), Write-Back (BusWB) PrWr/—
Replace/BusWB
Only cache-block are transfered
Can be adjusted for cache coherence PrWr/BusRd V
Invalidate block I
Flush data on bus if block is modified
Cache Coherence - 23 Muhamed Mudawar – CSE 661
Example on MSI Write-Back Protocol
P1 P2 P3
u S
I 75 u S 7 u M
S 5
7
u: 5 7
This is the case even when a block is private to a process and not shared
A block in the exclusive state can be written without accessing the bus
Shared signal S
wired-OR
I/O devices
Memory
S
Will invalidate block PrRd/
BusRdX/C2C
Replace/—
BusRd(~S)
Will cause a modified block to be flushed BusRdX/—
BusUpgr/—
PrRd/— Replace/—
Cache-to-Cache (C2C) Sharing PrRd/
BusRd/C2C
BusRd(S)
Supported by original Illinois version
Cache rather than memory supplies data
I
PrWr/
Bus Update transaction BusUpd(~S)
BusUpd/
If any other cache has a copy Replace/
BusWB
Update
Replace/
BusWB
It asserts the shared signal S BusRd/C2C
Updates its block Sm M
PrWrMiss/ PrWrMiss/
Goto Sc state BusRd(S); PrWr/BusUpd(~S) BusRd(~S)
BusUpd
Issuing cache goes to PrRd/— PrRd/—
PrWr/BusUpd(S) PrWr/—
Sm state if block is shared BusRd/C2C
Replace/ Update
If the block is not found BusWB Replace/
BusWB
Block is loaded in M state BusRd/C2C
M or Sm state only
Cache Coherence - 36 Muhamed Mudawar – CSE 661
Dragon’s Lower-level Design Choices
Shared-modified state can be eliminated
Main memory is updated on every Bus Update transaction
DEC Firefly multiprocessor
However, Dragon protocol does not update main memory on Bus Update
Only caches are updated
DRAM memory is slower to update than SRAM memory in caches