Professional Documents
Culture Documents
2> for each part of this exercise, assume the initial cache
and memory state as illustrated in Figure 5.35. Each part of this exercise specifies a
sequence of one or more CPU operations of the form:
P#: <op> <address> [<value>] where P# designates the CPU (e.g., P0), <op> is the
CPU operations (e.g., read or write), <address> denotes the memory address, and
<value> indicates the new word to be assigned on a write operation. Treat each action
below as independently applied to the initial state as given in Figure 5.35. What is the
resulting state (i.e., coherence state, tags, and data) of the caches and memory after
the given action?
Show only the blocks that change; for example, P0.B0: (I, 120, 00 01) indicates that
CPU P0s block B0 has the final state of I, tag of 120, and data words 00 and 01. Also,
what value is returned by each read operation?
5.3 [20] <5.2> Many snooping coherence protocols have additional states, state
transitions, or bus transactions to reduce the overhead of maintaining cache
coherency. In Implementation 1 of Exercise 5.2, misses are incurring fewer stall cycles
when they are supplied by cache than when they are supplied by memory. Some
coherence protocols try to improve performance by increasing the frequency of this
case. A common protocol optimization is to introduce an Owned state (usually denoted
O). The Owned state behaves like the Shared state in that nodes may only read
Owned blocks, but it behaves like the Modified state in that nodes must supply data on
other nodes read and write misses to Owned blocks. A read miss to a block in either
the Modified or Owned states supplies data to the requesting node and transitions to
the Owned state. A write miss to a block in either state Modified or Owned supplies
data to the requesting node and transitions to state Invalid. This optimized MOSI
protocol only updates memory when a node replaces a block in state Modified or
Owned. Draw new protocol diagrams with the additional state and transitions.
Protocol diagrams
Shared
Invalid CPU READ
CPU WRITE
5.9 [10/10/10/10/15/15/15/15] <5.4> for each part of this exercise, assume the initial
cache and memory state in Figure 5.38. Each part of this exercise specifies a
sequence of one or more CPU operations of the form:
P#: <op> <address> [ <value> ]
Figure 5.38 Cache and memory states in the multichip, multicore multiprocessor
Where P# designates the CPU (e.g., P0,0), <op> is the CPU operation (e.g., read or
write), <address> denotes the memory address, and <value> indicates the new word to
be assigned on a write operation. What is the final state (i.e., coherence state,
sharers/owners, tags, and data) of the caches and memory after the given sequence of
CPU operations has completed? Also, what value is returned by each read operation?
L1 miss, will replace B0 in L1 of P0,0 which has address 100, and return 0x0020.
L2 miss, will replace B0 in L2 which has address 100.
P1,0: write 120 80
L1 miss, will replace B0 in L1 of P1,0 which has address 130. Invalidate other copies.
L2 hit.
P0,0 L1 will have 120 in B0 (invalid)
P0,0 L1 will have 120 in B0 (modified)
P3,1 L1 will have 120 in B0 (invalid)
L2 S,0will have 120 in B0 (DM, P0,0)
L2 S,1will have 120 in B0 (D I, -)
Memory directory entry for 120 will become <D M, C0 >
L1 miss, will replace B0 in L1 of P0,0 which has address 100. Invalidate other copies.
L2 miss, will replace B0 in L2 which has address 100.
P1,0: read 120
L1 miss, change to shared state and return 0x0080
L2 hit.
P0,0 L1 will have 120 in B0 (shared)
P0,0 L1 will have 120 in B0 (shared)
P3,1 L1 will have 120 in B0 (invalid)
L2 S,0 will have 120 in B0 (DS, P0,0;P1,0)
L2 S,1 will have 120 in B0 (DI, -)
Memory directory entry for 120 will become <DS, C0>
h. [15] <5.4> P0,0: write 120 80
P1,0: write 120 90
L1 miss, will replace B0 in L1 of P0,0 which has address 100. Invalidate other copies.
L2 miss, will replace B0 in L2 which has address 100.
P1,0: write 120 90
L1 miss, will replace B0 in L1 of P1,0 which has address 130. Invalidate other copies.
L2 hit.
P0,0 L1 will have 120 in B0 (invalid)
P0,0 L1 will have 120 in B0 (modified)
P3,1 L1 will have 120 in B0 (invalid)
L2 S,0 will have 120 in B0 (DM, P1,0)
L2 S,1will have 120 in B0 (D I, -)
Memory directory entry for 120 will become <DM, C0>