
Multiprocessors

Processors are connected to a shared memory via a bus.

Multicomputers

Each processor has its own local memory. Interprocess communication is via message passing.

Distributed Shared Memory


Practice shows that programming multicomputers is much harder than programming multiprocessors. Requirement : the best of both worlds. There has been considerable research on emulating shared memory on multicomputers, which has led to page-based distributed shared memory.

Distributed Shared Memory


In a DSM system, the address space is divided up into pages (typically 4 KB or 8 KB), with the pages being spread over all the processors in the system. When a processor references an address that is not present locally, a trap occurs, and the operating system fetches the page containing the address and restarts the faulting instruction, which now completes successfully.

[Figure: Chunks of address space distributed among four machines. The shared global address space consists of chunks 0 to 15, which are spread over the memories of CPU1, CPU2, CPU3 and CPU4.]

DSM
It is essentially normal paging, except that remote RAM is being used as the backing store instead of the local disk.

DSM operation
When a process on a node wants to access some data from a memory block of the shared memory space, the local memory-mapping manager takes charge of its request. If the memory block containing the accessed data is resident in the local memory, the request is satisfied locally. Otherwise, a network block fault is generated and control is passed to the operating system. The OS then sends a message to the node on which the desired memory block is located to get the block.

DSM operation
The missing block is migrated from the remote node to the client process's node, and the OS maps it into the application's address space. Data blocks keep migrating from one node to another on demand, but no communication is visible to the user processes. DSM allows replication and migration of data blocks.
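The fault-and-migrate cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not a real DSM implementation: the class names `Node` and `PageBasedDSM`, the page contents and the assignment of pages to machines are assumptions made for the example, and the lookup loop stands in for the message the OS would send over the network.

```python
class Node:
    """One machine with its own local memory (hypothetical model)."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = dict(pages)  # page number -> contents held locally


class PageBasedDSM:
    """Single-copy, migrate-on-fault DSM over a fixed set of nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.faults = 0  # number of network block faults so far

    def _current_holder(self, page):
        # Stand-in for the message the OS sends to locate the block.
        for node in self.nodes:
            if page in node.pages:
                return node
        raise KeyError(f"page {page} is not in the shared address space")

    def access(self, node, page):
        """Return the contents of `page` as seen by `node`."""
        if page not in node.pages:
            # Network block fault: migrate the page, then retry the access.
            self.faults += 1
            holder = self._current_holder(page)
            node.pages[page] = holder.pages.pop(page)
        return node.pages[page]


# Assumed layout: which pages live where is made up for this example.
machine1 = Node("machine 1", {0: "p0", 2: "p2", 5: "p5", 9: "p9"})
machine2 = Node("machine 2", {1: "p1", 3: "p3", 10: "p10"})
dsm = PageBasedDSM([machine1, machine2])
dsm.access(machine1, 2)    # resident locally, no fault
dsm.access(machine1, 10)   # fault: page 10 migrates from machine 2
```

After the second access, page 10 is held only by machine 1, so further references to it from machine 1 complete without a fault.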

DSM
In this example, if processor 1 references instructions or data in pages 0, 2, 5, or 9, the references are done locally. References to other pages cause traps.
A reference to an address in page 10 will cause a trap to the operating system, which then moves page 10 from machine 2 to machine 1.

[Figure: Situation after CPU 1 references chunk 10. Chunk 10 now resides in the memory of CPU1.]

[Figure: Situation if chunk 10 is read only and replication is used. Both CPU1 and CPU2 hold a copy of chunk 10.]

In this way, processors 1 and 2 can both reference page 10 as often as needed without causing traps to fetch missing memory.

DSM
Possible variations :
1. No replication : Exactly one copy of each page. Consistency is easy to maintain.
2. Replicate read-only copies, single write copy. Consistency is easy to maintain.
3. Replicate read-write copies. Possible consistency problems.

Issues in Design and Implementation of DSM(Granularity)

1. Granularity : Refers to the size of a block, i.e. the unit of data transfer across the network when there is a network block fault. Possible units are a few words, a page or a few pages.

Issues in Design and Implementation of DSM(Granularity)

Factors that influence the size of the block/page : 1. Paging overhead : Because of locality of reference (when a word within a page is accessed, words in and around the requested word are likely to be accessed in the near future), a small block size increases paging overhead, since a larger number of blocks needs to be transferred.

Issues in Design and Implementation of DSM(Granularity)

2. Directory size : The larger the block size, the smaller the directory (the information about the blocks in the system). Hence directory management overhead is reduced.

Issues in Design and Implementation of DSM(Granularity)

3. Thrashing : Processes spend more time paging than executing. Larger block sizes make thrashing more likely, since data items in the same data block may be updated by multiple nodes at the same time, causing a large number of block transfers without any progress in execution.

Issues in Design and Implementation of DSM(Granularity)

4. False sharing : Two processes on different processors access different data items that happen to lie in the same block, hence the block is repeatedly transferred between the two processors even though no data is actually shared. Having data belonging to 2 different processes in the same block is called FALSE SHARING. False sharing leads to thrashing.
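The effect of block size on false sharing can be illustrated with a small counting sketch. It assumes a single-copy, migrating-block scheme in which the first touch of a block is free; the function name `count_transfers`, the addresses and the block sizes are all made up for the example.

```python
def count_transfers(accesses, block_size):
    """Count block migrations for a list of (node, address) accesses.

    Assumes one copy per block that migrates to whichever node touches it,
    and that the very first touch of a block costs no transfer.
    """
    holder = {}      # block number -> node currently holding the block
    transfers = 0
    for node, address in accesses:
        block = address // block_size
        if block in holder and holder[block] != node:
            transfers += 1
        holder[block] = node
    return transfers


# Node A only ever touches address 0, node B only ever touches address 64.
accesses = [("A", 0), ("B", 64)] * 4

large_block = count_transfers(accesses, block_size=4096)  # same block
small_block = count_transfers(accesses, block_size=64)    # different blocks
```

With the 4096-byte block both variables share block 0, so the block bounces between A and B on almost every access; with the 64-byte block each variable sits in its own block and nothing is transferred.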

Issues in Design and Implementation of DSM(Granularity)

The relative advantages and disadvantages of small and large block sizes make it difficult to decide on an optimum block size.
Therefore a suitable compromise is USING THE PAGE SIZE AS THE BLOCK SIZE.

Issues in Design and Implementation of DSM(consistency)

2. Memory Consistency : In a DSM that allows replication of shared data items, copies of shared data items may simultaneously be available in the main memory of 2 or more nodes.
Advantages of replication :
1. Improves reliability in case of failure of a copy.
2. Improves performance in terms of access time.

Issues in Design and Implementation of DSM(consistency)


Price to be paid for replication :
1. Modifications need to be carried out on all the copies to ensure consistency.
2. More network bandwidth is consumed to keep all replicas up to date.
3. A read operation performed at any copy should always return the same result, i.e. when an update operation is performed on one copy, the update should be propagated to all copies before any subsequent operation takes place, NO MATTER AT WHICH COPY THAT OPERATION IS PERFORMED. Hence the update operation at all copies should be viewed as a SINGLE ATOMIC OPERATION.

Issues in Design and Implementation of DSM(consistency)


This requires global synchronization, which takes a lot of communication time when replicas are spread across a wide area network. Solution : Loosen the consistency constraints, i.e. relax the requirement that updates need to be executed as atomic operations. Price paid : Copies may not always be the same everywhere. To what extent consistency can be loosened depends on the application.

Issues in Design and Implementation of DSM(consistency)

Hence we have a number of consistency models. Data-centric consistency models : Assumptions :
1. The data store is a distributed shared memory.
2. Each process can access data from its local copy of the data store.
3. A data operation is classified as a write when it changes the data; otherwise it is classified as a read.

Issues in Design and Implementation of DSM(consistency)

A consistency model is a contract between processes and the data store. It says that if processes agree to obey certain rules then the store promises to work correctly.

Issues in Design and Implementation of DSM(consistency)

1. Strict Consistency :
Any read on a data item x returns a value corresponding to the result of the most recent write on x. All writes are instantaneously visible to all processes. Strict consistency is what is observed in a uniprocessor system : after a=1; a=2; print(a); the value displayed for a is 2.

Issues in Design and Implementation of DSM(consistency)

[Figure: (a) A strictly consistent store. (b) A store that is not strictly consistent.]


All writes should be instantaneously visible to all processes, which is very difficult when copies are spread wide apart.


The problem with strict consistency is that it relies on absolute global time and is impossible to implement in a distributed system.

Issues in Design and Implementation of DSM(consistency)


2. Sequential Consistency : Weaker than strict consistency. A data store is said to be sequentially consistent if it satisfies the following condition : The result of any execution is the same as if the read and write operations by all processes were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. Any valid interleaving is legal, but all processes must see the same interleaving.

Issues in Design and Implementation of DSM


For example, to improve query performance, a bank may place copies of an account database in two different cities, say New York and San Francisco. A query is always forwarded to the nearest copy. Assume a customer in San Francisco wants to add $ 100 to his account (account number 559), which currently contains $ 1000. At the same time, a bank employee in New York initiates an update by which the customer's account is to be increased with 1 percent interest. Both updates should be carried out at both copies of the database. However, due to communication delays in the underlying network, the updates may arrive at the two copies in different orders.

Issues in Design and Implementation of DSM(consistency)


Example :

Issues in Design and Implementation of DSM(consistency)


The customer's update operation is performed in San Francisco before the interest update. In contrast, the copy of the account in New York is first updated with the 1 percent interest and after that with the $ 100 deposit.

Issues in Design and Implementation of DSM(consistency)

The 2 updates should have been performed in the same order at each copy to achieve (sequential) consistency.
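The divergence of the two replicas can be checked with a little arithmetic. The sketch below uses whole dollars and integer arithmetic to avoid rounding noise; the function names `deposit` and `add_interest` are chosen only for this illustration.

```python
def deposit(balance):
    """The customer's update: add $100."""
    return balance + 100


def add_interest(balance):
    """The employee's update: add 1 percent interest (integer dollars)."""
    return balance * 101 // 100


initial = 1000

# San Francisco applies the deposit first, then the interest.
san_francisco = add_interest(deposit(initial))

# New York applies the interest first, then the deposit.
new_york = deposit(add_interest(initial))
```

San Francisco ends up with $1111 and New York with $1110: the same two updates, applied in different orders, leave the replicas inconsistent.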

Issues in Design and Implementation of DSM(consistency)

(a) A sequentially consistent data store.

(b) A data store that is not sequentially consistent : P3 and P4 disagree on the order of the writes.

Issues in Design and Implementation of DSM(consistency)


3. Causal consistency
Weaker than sequential consistency since it makes a distinction between events that are potentially causally related and those that are not.

If event B is caused or influenced by an earlier event A, then causality requires that everyone first see A and then see B.
Events that are not causally related are called concurrent events.

Issues in Design and Implementation of DSM(consistency)


For a data store to be considered causally consistent, it is necessary that the store obeys the following condition: Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.

Issues in Design and Implementation of DSM(consistency)


4. FIFO consistency (PRAM, Pipelined RAM) : Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes. In other words, all writes generated by different processes are treated as concurrent.

Issues in Design and Implementation of DSM(consistency)

P1 : W(x)1
P2 :        R(x)1  W(x)2
P3 :                      R(x)2  R(x)1
P4 :                      R(x)1  R(x)2

A valid sequence of events for FIFO consistency

Issues in Design and Implementation of DSM(consistency)


Ex : If w11 and w12 are 2 write operations performed by a process P1 in that order and if w21 and w22 are 2 write operations performed by a process P2 in that order then P3 can see them in order : [(w11,w12), (w21,w22)] and P4 can see them in order [(w21,w22),(w11,w12)]

Issues in Design and Implementation of DSM(consistency)


Note : In sequential consistency, all processes agree on the same order of operations : either [(w11,w12),(w21,w22)] or [(w21,w22),(w11,w12)] is acceptable, but not both. In FIFO consistency, processes need not agree on the order of memory operations issued by different processes, so different processes may see different ones of these orders.
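The FIFO condition can be expressed as a small check: for every writer, the writes an observer sees must appear in the order that writer issued them. The sketch below is a simplified checker under the assumption that each observer eventually sees every write exactly once; the function name `is_fifo_view` and the data layout are hypothetical.

```python
def is_fifo_view(view, issued):
    """Check one observer's view against the FIFO consistency condition.

    view   : list of (writer, write) pairs in the order the observer saw them
    issued : dict mapping each writer to its writes in program order
    """
    for writer, program_order in issued.items():
        seen = [write for w, write in view if w == writer]
        if seen != program_order:
            return False
    return True


issued = {"P1": ["w11", "w12"], "P2": ["w21", "w22"]}

view_p3 = [("P1", "w11"), ("P1", "w12"), ("P2", "w21"), ("P2", "w22")]
view_p4 = [("P2", "w21"), ("P2", "w22"), ("P1", "w11"), ("P1", "w12")]
bad_view = [("P1", "w12"), ("P1", "w11"), ("P2", "w21"), ("P2", "w22")]
```

Both `view_p3` and `view_p4` pass the check even though they differ from each other, which is exactly what FIFO consistency permits and sequential consistency forbids. `bad_view` fails because it reverses the two writes of P1.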

Issues in Design and Implementation of DSM(consistency)


5. Weak consistency : FIFO consistency requires all intermediate writes to be propagated in order to all copies. Alternative : let a process finish its critical section (its operations on shared data items) and make sure that the final results are sent everywhere, without worrying too much about whether all intermediate results have been propagated to all copies in order.

Issues in Design and Implementation of DSM(consistency)


Use a synchronization variable (S), which has only a single associated operation called synchronize. The synchronize operation is used to synchronize memory. When a process does a synchronize operation, all writes done on that machine are propagated outward (to the other machines) and all writes done on other machines are brought in. In other words, all of shared memory is synchronized.

Issues in Design and Implementation of DSM(consistency)


In weak consistency, when a process performs operations on a shared data item, no guarantees are given about when its changes will be visible to other processes. Changes are propagated only when explicit synchronization takes place.

Issues in Design and Implementation of DSM(consistency)

[Figure: (a) A valid sequence of events for weak consistency. (b) An invalid sequence for weak consistency.]

Issues in Design and Implementation of DSM(consistency)


In (a) P1 does 2 writes to a data item and then synchronizes. Since P2 and P3 are not yet synchronized no guarantees are given about what they see. In (b) P2 has been synchronized which means its local copy of the data store is brought up to date. When P2 reads x, it must get the value b. Getting a is not permitted for weak consistency.
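The behaviour of the synchronize operation can be modelled with a toy store. This is a sketch under simplifying assumptions: a single shared dictionary stands in for "all other machines", and the class names `WeakStore` and `Process` are invented for the example.

```python
class WeakStore:
    """Stand-in for the state agreed on by all machines."""

    def __init__(self):
        self.shared = {}


class Process:
    """A process with a local copy that is only reconciled on synchronize."""

    def __init__(self, store):
        self.store = store
        self.local = {}    # what this process currently sees
        self.pending = {}  # local writes not yet propagated

    def write(self, key, value):
        self.local[key] = value
        self.pending[key] = value

    def read(self, key):
        return self.local.get(key)

    def synchronize(self):
        # Push local writes outward, then bring in everyone else's writes.
        self.store.shared.update(self.pending)
        self.pending.clear()
        self.local = dict(self.store.shared)


store = WeakStore()
p1, p2 = Process(store), Process(store)
p1.write("x", "a")
p1.write("x", "b")
p1.synchronize()
before = p2.read("x")   # P2 has not synchronized: no guarantee
p2.synchronize()
after = p2.read("x")    # P2 has synchronized: must see the latest value
```

In this model the intermediate value a is never sent anywhere; only the final value b leaves P1 when it synchronizes.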

Issues in Design and Implementation of DSM(consistency)


6. Release consistency : If it is possible to tell the difference between entering a critical region and leaving it, a more efficient implementation might be possible. To do that, two kinds of synchronization operations are needed : an acquire operation, to tell that a critical region is being entered, and a release operation, to tell that a critical region is about to be exited.

Issues in Design and Implementation of DSM(consistency)


A data store that offers release consistency guarantees that when a process does an acquire, the store will ensure that all the local copies of the protected(shared) data are brought up to date to be consistent with the remote ones if need be. When a release is done, protected data that have been changed are propagated out to the other local copies of the store.

Issues in Design and Implementation of DSM(consistency)

A valid event sequence for release consistency.


P1 does an acquire and changes x twice, then does a release. P2 does an acquire and reads x. It is guaranteed to get b. P3 does not do an acquire before reading the shared data . Hence the data store has no obligation to give it the current value of x. So returning a is allowed.

Issues in Design and Implementation of DSM(consistency)


Release consistency as described so far is also called eager release consistency, since when a release is done, the process doing the release pushes out all the modified data to all other processes that have a copy of the data and thus might potentially need it. There is no way to tell whether they actually will need it, so all of them get everything that has changed. A variation of eager release is lazy release : at the time of a release, nothing is sent anywhere. When an acquire is done, the process trying to do the acquire has to get the most recent values of the data from the process holding them.
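The difference between the eager and lazy variants shows up in when messages are sent. The following sketch counts messages under strong simplifications: every process holds a copy of the protected data, whole copies are shipped rather than only the modified items, and the lazy variant fetches from the most recent releaser. The class name `ReleaseConsistentStore` is an assumption for the example.

```python
class ReleaseConsistentStore:
    """Toy model of eager versus lazy release consistency."""

    def __init__(self, processes, eager):
        self.copies = {p: {} for p in processes}  # per-process local copy
        self.eager = eager
        self.last_releaser = None
        self.messages = 0  # update messages sent over the network

    def acquire(self, p):
        # Lazy: pull the latest values from whoever released last.
        if not self.eager and self.last_releaser not in (None, p):
            self.copies[p].update(self.copies[self.last_releaser])
            self.messages += 1

    def write(self, p, key, value):
        self.copies[p][key] = value

    def release(self, p):
        # Eager: push the data to every other process holding a copy.
        if self.eager:
            for q in self.copies:
                if q != p:
                    self.copies[q].update(self.copies[p])
                    self.messages += 1
        self.last_releaser = p


def run(eager):
    store = ReleaseConsistentStore(["P1", "P2", "P3"], eager)
    store.acquire("P1")
    store.write("P1", "x", "a")
    store.write("P1", "x", "b")
    store.release("P1")
    store.acquire("P2")  # P2 enters the critical region next
    return store


eager_store = run(eager=True)
lazy_store = run(eager=False)
```

In the eager run two messages are sent at the release, one of them to P3, which never asked for the data. In the lazy run a single message is sent, at the moment P2 acquires.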

Issues in Design and Implementation of DSM(consistency)

Eager Release consistency

Issues in Design and Implementation of DSM(consistency)

Lazy Release consistency

Issues in Design and Implementation of DSM(consistency)


7. Entry consistency
Requires each shared data item to be associated with a synchronization variable (Lock). Synchronization variables are used as follows :

1. Each synchronization variable has a current owner i.e the process that last acquired it.

2. The owner may enter and exit critical region (CR) repeatedly without having to send any messages on the network.

Issues in Design and Implementation of DSM(consistency)


3. A process not currently owning a synchronization variable but wanting to acquire it has to send a message to the current owner asking for ownership and the current values of the data associated with that synchronization variable.
4. It is also possible for several processes to simultaneously own a synchronization variable in a non exclusive mode i.e they can read but not write the associated data.

Issues in Design and Implementation of DSM(consistency)


A data store exhibits entry consistency if it meets all of the following conditions :
1. An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process.
2. Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.
3. After an exclusive mode access to a synchronization variable has been performed, any other process's next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable's owner.

Issues in Design and Implementation of DSM(consistency)

A valid event sequence for entry consistency. P1 does an acquire for x, changes x once after which it also does an acquire for y. P2 does an acquire for x but not for y, so it will read a for x but may read NIL for y. P3 first does an acquire for y, hence it will read value b for y

Summary of weak, release and entry consistency

Implementing Sequential Consistency


The most commonly used model is sequential consistency.

The protocols depend on whether the DSM system allows replication and/or migration of shared-memory blocks. The different strategies are :
1. Non-Replicated, Non-Migrating blocks (NRNMBs)
2. Non-Replicated, Migrating blocks (NRMBs)
3. Replicated, Migrating blocks (RMBs)
4. Replicated, Non-Migrating blocks (RNMBs)

Implementing Sequential Consistency


1. Non-Replicated, Non-Migrating blocks (NRNMBs)
This is the simplest strategy. Each block of the shared memory has a single copy whose location is always fixed. All access requests to a block from any node are sent to the owner node of the block, which has the only copy of the block. On receiving a request from a client node, the MMU and OS of the owner node return a response to the client.

Implementing Sequential Consistency


[Figure: The client node sends a request and receives a response. The owner node of the block receives the request, performs the data access and sends the response.]

Data locating in the NRNMB strategy : There is a single copy of each block in the system, and the location of the block never changes. Hence a fixed mapping from blocks to nodes is sufficient to locate a block.

Implementing Sequential Consistency


2. Non replicated , Migrating Blocks (NRMB)
Each block of the shared memory has a single copy in the entire system, however, each access to a block causes the block to migrate from its current node to the node from where it is accessed. The owner node of a block changes as soon as the block is migrated to a new node.

Implementing Sequential Consistency


[Figure: The client node sends a block request to the owner node of the block, which owns the block before its migration. The block migrates to the client node, which becomes the new owner node of the block after the migration.]

Implementing Sequential Consistency


Advantages :
1. No communication costs are incurred when a process accesses data currently held locally.
2. If an application exhibits high locality of reference, the cost of data migration is amortized over multiple accesses.
Disadvantages :
1. Prone to thrashing, i.e. a block may keep migrating frequently from one node to another, resulting in few memory accesses between migrations.
2. Parallelism is not possible, since only the node currently holding a block can access it.

Implementing Sequential Consistency


Data Locating in the NRMB strategy :
There is a single copy of each block, and the location of a block changes dynamically. Strategies used to locate the blocks : 1. Broadcasting :
Each node maintains an owned blocks table that contains an entry for each block for which the node is the current owner. When a fault occurs, the fault handler of the faulting node broadcasts a read/write request on the network. The node currently having the requested block then responds to the broadcast request by sending the block to the requesting node.

Implementing Sequential Consistency


2. Centralized server Algorithm :
A centralized server maintains a block table that contains the location information for all blocks in the shared memory space. The location and identity of the centralized server are well known to all nodes. When a fault occurs, the fault handler of the faulting node (N) sends a request to the centralized server. The centralized server extracts the location information of the requested block, forwards the request to that node (the current owner) and changes the location information in the corresponding entry of the block table to node N. On receiving the request, the current owner transfers the block to node N. Drawbacks : Single point of failure.

Implementing Sequential Consistency


3. Fixed distributed-server algorithm
It is a direct extension of the centralized server scheme. It overcomes the problems of the centralized server scheme by distributing the role of centralized server. It has a block manager on several nodes, and each block manager is given a predetermined subset of data blocks to manage. During a fault, the fault handler of the faulting node finds out the node whose block manager is managing the currently accessed block. Then a request for the block is sent to the block manager of that node. The block manager then handles the request exactly in the same way as the centralized server algorithm.

Implementing Sequential Consistency


4. Dynamic distributed-server algorithm.
It does not use any block manager and attempts to keep track of the ownership information of all blocks in each node. Each node has a block table that contains the ownership information of all blocks. However this ownership information is not correct at all times, but if incorrect it at least provides the beginning of a sequence of nodes to be traversed to reach the true owner node of a block. Hence this field is called the probable owner.
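Following the chain of probable owners can be sketched as a simple loop. The sketch assumes that the true owner is the node whose own probable-owner entry points to itself, that the hints never form a cycle, and that every node visited on the way has its hint updated to the owner that was found; this hint update is a common optimization and an assumption here, not something stated above. The function name `locate_owner` is hypothetical.

```python
def locate_owner(probable_owner, start):
    """Follow probable-owner hints for one block, starting at node `start`.

    probable_owner : dict mapping each node to its hint for this block;
                     the true owner maps to itself.
    Returns (owner, hops), where hops is the number of forwarding steps.
    """
    path = []
    node = start
    while probable_owner[node] != node:
        path.append(node)
        node = probable_owner[node]
    # Assumed optimization: point every visited node straight at the owner.
    for visited in path:
        probable_owner[visited] = node
    return node, len(path)


# N1 believes N2 owns the block, N2 believes N3 does, and N3 really does.
hints = {"N1": "N2", "N2": "N3", "N3": "N3"}
first = locate_owner(hints, "N1")
second = locate_owner(hints, "N1")
```

The first lookup needs two hops because N1's hint is stale; the second needs only one because the hint was corrected on the way.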

Implementing Sequential Consistency


3. Replicated, Migrating Blocks(RMB)
Disadvantage of the non-replicated strategies : lack of parallelism (only the processes on one node can access the data contained in a block at any given time). To increase parallelism, most DSM systems replicate blocks. With replicated blocks, read operations can be carried out in parallel at multiple nodes, so the average cost of a read operation is reduced. However, replication tends to increase the cost of write operations, because for a write to a block all its replicas must be invalidated or updated to maintain consistency.

Implementing Sequential Consistency


Basically there are two protocols for ensuring sequential consistency :
1. Write-Invalidate : All copies except one (the one on which the write is performed) are invalidated before a write can be performed. When a write fault occurs, its fault handler
a) Copies the accessed block from one of the block's current nodes to its own node.
b) Sends an invalidate message to all other nodes having a copy of the block.
c) Proceeds to perform the write operation.

Implementing Sequential Consistency


Here only the node which performs the write operation on the block holds the modified version.
After the write operation, if one of the nodes that had a copy of the block tries to perform a read/write operation, a fault occurs and the fault handler of that node has to fetch the block again from the node having a valid (updated) copy. Therefore the scheme achieves sequential consistency.
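The write-invalidate behaviour for a single block can be sketched as follows. The model is deliberately minimal: a block holds one value, only nodes with a valid copy appear in `copies`, and network messages are replaced by direct dictionary operations. The class name `InvalidatingBlock` is an assumption for the example.

```python
class InvalidatingBlock:
    """One shared block kept consistent with write-invalidate."""

    def __init__(self, owner, value):
        self.copies = {owner: value}  # node -> value, valid copies only
        self.invalidations = 0        # invalidate messages sent so far

    def read(self, node):
        if node not in self.copies:
            # Read fault: fetch a copy from any node holding a valid one.
            self.copies[node] = next(iter(self.copies.values()))
        return self.copies[node]

    def write(self, node, value):
        # Invalidate every other copy before the write is performed.
        for other in list(self.copies):
            if other != node:
                del self.copies[other]
                self.invalidations += 1
        self.copies[node] = value


block = InvalidatingBlock("N1", 0)
block.read("N2")           # N2 obtains a replica
block.write("N2", 7)       # N1's copy is invalidated
value_at_n1 = block.read("N1")  # N1 faults and fetches the new value
```

After the write only N2 holds a valid copy, so the later read at N1 cannot return the stale value 0.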

Implementing Sequential Consistency


[Figure: Write invalidation. The client wants to write : (1) it requests the block, (2) the block is replicated to the client, and (3) the copies of the block on the other nodes are invalidated, so that only the client holds the new copy.]

Implementing Sequential Consistency


2. Write update : Here the write operation is carried out by updating all copies of the data on which the write is performed. When a write fault occurs, its fault handler
a) Copies the accessed block from one of the block's current nodes to its own node.
b) Performs the write operation on the local copy and updates all other copies of the block.
After the write operation completes, all the nodes have a valid copy. Problem : 2 nodes can simultaneously trigger a write operation, whereas sequential consistency requires that all processes agree on the order of writes.

Implementing Sequential Consistency


[Figure: Write update. The client wants to write : (1) it requests the block, (2) the block is replicated to the client, and (3) the update is sent to the copies on the other nodes, so that every node holds the new copy of the block.]

Implementing Sequential Consistency


Solution : Use a global Sequencer
The intended modification is first sent to the global sequencer. The sequencer assigns the next sequence number to the modification and multicasts the modification with this sequence number to all the nodes where a replica of that data block is located. The write operation is processed on each node in the order of the sequence number.
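The sequencer idea can be sketched with two small classes. The sketch assumes that a replica buffers any modification that arrives ahead of its turn and applies it once the earlier sequence numbers have been processed; the class names `Sequencer` and `Replica` and the string-valued modifications are made up for the example, and multicast is replaced by explicit `deliver` calls.

```python
class Sequencer:
    """Hands out consecutive sequence numbers to modifications."""

    def __init__(self):
        self.next_seq = 0

    def sequence(self, modification):
        seq = self.next_seq
        self.next_seq += 1
        return seq, modification


class Replica:
    """Applies modifications strictly in sequence-number order."""

    def __init__(self):
        self.applied = []   # modifications in the order they were applied
        self.expected = 0   # next sequence number this replica may apply
        self.buffer = {}    # early arrivals, keyed by sequence number

    def deliver(self, seq, modification):
        self.buffer[seq] = modification
        while self.expected in self.buffer:
            self.applied.append(self.buffer.pop(self.expected))
            self.expected += 1


sequencer = Sequencer()
m0 = sequencer.sequence("x = 1")
m1 = sequencer.sequence("x = 2")

r1, r2 = Replica(), Replica()
r1.deliver(*m0)
r1.deliver(*m1)
r2.deliver(*m1)   # arrives out of order and is held back
r2.deliver(*m0)
```

Although the network delivered the two modifications to `r2` in the opposite order, both replicas apply them in the same order, which is what sequential consistency requires.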

Implementing Sequential Consistency


[Figure: The client node, which has a replica of the data block, sends its modification to the sequencer. The sequencer sends the sequenced modification to every node having a replica of the block.]

Implementing Sequential Consistency


4. Replicated , Non migrating Blocks
A shared memory block may be replicated at multiple nodes of the system, but the location of each replica remains fixed. A read/write access to a shared memory block is carried out by sending the request to one of the nodes having the replica of the block. All replicas are kept consistent by updating all of them in case of a write access. A protocol similar to write update is used for this purpose. Sequential consistency is ensured by using a global sequencer to sequence all write operations of all the nodes.

Implementing Sequential Consistency


Characteristics of RNMB
The replica locations of a block never change. All replicas of a block are kept consistent. Only a read request can be sent directly to one of the nodes having a replica of the block; all write requests have to be sent to the sequencer.

Replacement Strategy
3. Replacement Strategy
If the available space for the shared data at a node fills up, then the issues that need to be addressed are :
1. Which block to replace to make space for the newly required block ? There are two ways of classifying replacement algorithms :
a) Usage based v/s non-usage based : A usage based algorithm keeps track of the history of usage of a page to make replacement decisions, e.g. LRU. A non-usage based algorithm does not keep track of the use of a page while making replacement decisions, e.g. FIFO or Random.

Replacement Strategy
b) Fixed space v/s variable space : Fixed space algorithms assume that the memory size is fixed, while variable space algorithms are based on the assumption that the memory size can change dynamically depending on the need. With fixed space, a fetch involves the selection of a specific page for replacement. With variable space, a fetch does not imply a replacement. The variable space concept is not suitable for DSM, since each node's memory that acts as a cache for the virtually shared memory is fixed in size.
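A usage based, fixed space policy such as LRU can be sketched with Python's `collections.OrderedDict`. The class name `LRUBlockCache` and the choice of returning the evicted block from `access` are assumptions made for this illustration.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Fixed-size local store for shared blocks with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block first

    def access(self, block):
        """Touch `block`; return the evicted block, or None if none."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # record the recent use
            return None
        victim = None
        if len(self.blocks) >= self.capacity:
            # Fixed space: a fetch into a full store implies a replacement.
            victim, _ = self.blocks.popitem(last=False)
        self.blocks[block] = True
        return victim


cache = LRUBlockCache(capacity=2)
cache.access(1)
cache.access(2)
cache.access(1)            # block 1 becomes the most recently used
evicted = cache.access(3)  # block 2 is now the least recently used
```

A FIFO policy would have evicted block 1 here, since it arrived first; LRU keeps it because it was used more recently than block 2.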

Replacement Strategy
One of the first DSM implementations was IVY (Integrated shared Virtual memory at Yale). Each memory block of a node is classified into one of the following 5 types :
1. Unused : A memory block that is not currently being used.
2. Nil : A block that has been invalidated.
3. Read-only : A block for which the node has only read access right.
4. Read-owned : A block for which the node has only read access right but is also the owner of the block.
5. Writable : A block for which the node has write access permission. Obviously, the node is the owner of the block, since IVY uses the write-invalidate protocol.

Replacement Strategy
Based on the classification of the block, the following replacement priority is used :
1. Both unused and nil blocks have the highest replacement priority.
2. The read-only blocks have the next replacement priority.
3. Read-owned and writable blocks for which replicas exist on some other nodes have the next replacement priority.
4. Read-owned and writable blocks for which only this node has a copy have the lowest priority.
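The priority order above can be written down as a ranking function. This is a sketch of the ordering only, not IVY's actual code: the function names `replacement_rank` and `choose_victim`, the string labels for block types and the tuple layout are all assumptions for the example.

```python
def replacement_rank(kind, replicated_elsewhere):
    """Lower rank means the block is replaced sooner."""
    if kind in ("unused", "nil"):
        return 0
    if kind == "read-only":
        return 1
    if kind in ("read-owned", "writable"):
        return 2 if replicated_elsewhere else 3
    raise ValueError(f"unknown block type: {kind}")


def choose_victim(blocks):
    """Pick the block to replace.

    blocks : list of (name, kind, replicated_elsewhere) tuples
    """
    return min(blocks, key=lambda b: replacement_rank(b[1], b[2]))[0]


blocks = [
    ("b1", "writable", False),   # only copy in the system: keep if possible
    ("b2", "read-owned", True),  # owned, but replicas exist elsewhere
    ("b3", "read-only", False),  # the owner still holds the data
]
victim = choose_victim(blocks)
```

Here `b3` is chosen, since discarding a read-only block loses nothing: the owner still has the data. If a nil or unused block were present, it would be chosen first.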

Replacement Strategy
2. Where to place the replaced block ?
a) Using secondary store
b) Using the memory space of other nodes

Thrashing
4. Thrashing

Thrashing occurs when a system spends more time transferring data blocks than executing on them. The larger a block, the greater the chance of false sharing, which causes thrashing. How to avoid thrashing ?
1. Provide application controlled locks : Locking data prevents other nodes from accessing that data. A lock could be associated with each data block.
2. Nail a block to a node for a minimum amount of time : Disallow a block from being taken away from a node until a minimum amount of time t elapses after its allocation to that node. How to choose the value of t ? Statically or dynamically.

Thrashing
If t is fixed statically : if a process accesses a block only once for writing to it, other processes may be prevented from accessing the block until the time t elapses. On the other hand, a process may access the block for several write operations, and the time t may elapse before the process has finished using the block. Hence tuning the value of t dynamically is the preferred approach : the value of t for a block is decided based on the access patterns of the block.

Structure of shared memory space :


5. Structure of shared memory space : It refers to the layout of the shared data. It depends on the type of application the DSM is going to handle.

Point to note
If a data store is sequentially consistent, then it is always causally consistent and FIFO consistent as well. If a data store is causally consistent, it is also FIFO consistent.