Module 5: Coordination
Time and Global States – Synchronizing Physical Clocks – Logical Time and Logical Clock – Coordination and
Agreement – Distributed Mutual Exclusion – Election Algorithms – Consensus and Related Problems
Module 6: Distributed Transactions
Transactions and Concurrency Control – Nested Transactions – Locks – Optimistic Concurrency Control –
Timestamp Ordering – Distributed Transactions: Flat and Nested – Atomic Commit Protocols – Two Phase
Commit Protocol – Concurrency Control
Module 7: Distributed System Architecture & its Variants
Distributed File System: Architecture – Processes – Communication. Distributed Web-based System:
Architecture – Processes – Communication. Overview of Distributed Computing Platforms.
Module 8: Recent Trends
Text Books
1. Grama, A. (2013). Introduction to parallel computing. Harlow, England: Addison-Wesley.
2. Coulouris, G. (2012). Distributed systems. Boston: Addison-Wesley.
Reference Books
1. Hwang, K., & Briggs, F. (1990). Computer Architecture and Parallel Processing (1st ed.). New York: McGraw-Hill.
2. Sinha, P. K. (2007). Distributed Operating System: Concepts and Design. PHI Learning Pvt. Ltd.
3. Hager, G. (2017). Introduction to High Performance Computing for Scientists and Engineers. CRC Press.
4. Tanenbaum, A. (2007). Distributed Systems: Principles and Paradigms (2nd ed.). Pearson.
“Parallelism is the future of computing”
Motivation
• Transmission speed – the speed of a serial computer ultimately depends on how fast data can move through hardware, which limits single-processor performance
Parallel computing has made a tremendous impact on a variety of areas ranging from computational
simulations for scientific and engineering applications to commercial applications in data mining and
transaction processing.
Scientific Applications
Functional and structural characterization of genes and proteins, sequencing of the human genome,
analysing biological sequences with a view to developing new drugs, etc.
Commercial Applications
Cost-effective servers, Linux clusters, etc.
• Most modern computers, particularly those with graphics processing units (GPUs), employ SIMD
instructions and execution units.
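To make the SIMD idea concrete, here is a minimal sketch (not from the slides) of a data-parallel loop of the kind compilers map onto SIMD instructions, assuming a C compiler with auto-vectorization (e.g. gcc -O3):

#include <stdio.h>

#define N 8

int main(void) {
    float a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[N];

    /* The same operation is applied to many data elements; when
     * vectorized, one instruction processes several array lanes. */
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[0] = %.1f, c[%d] = %.1f\n", c[0], N - 1, c[N - 1]);
    return 0;
}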
Concepts and Terminology
Multiple Instruction, Single Data (MISD)
Synchronization
• The coordination of parallel tasks in real time, very often associated with communications.
• Often implemented by establishing a synchronization point within an application, where a task may not proceed further until one or more other tasks reach the same or a logically equivalent point.
• Synchronization usually involves waiting by at least one task, and can therefore cause a parallel application's wall-clock execution time to increase.
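As a concrete sketch of such a synchronization point, assuming OpenMP (compile with e.g. gcc -fopenmp; the phase labels are illustrative only):

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        printf("Thread %d: finished phase 1\n", id);

        /* No thread proceeds past this point until every thread
         * in the team has reached it; at least one thread waits. */
        #pragma omp barrier

        printf("Thread %d: starting phase 2\n", id);
    }
    return 0;
}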
Some important terms
Multicore Processors - Introduction
A processor core is a processing unit that reads instructions to perform specific operations.
A multi-core processor is a single computing component with two or more independent processing
units, called cores, which read and execute program instructions.
A many-core processor is a system with hundreds or thousands of cores that implements a parallel
architecture.
Core types:
1. Homogeneous (symmetric) cores – all cores are of the same type
2. Heterogeneous (asymmetric) cores – a mix of core types that may run different operating systems
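A small sketch for checking how many cores the operating system exposes, assuming a POSIX-like system (Linux/macOS):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Number of processor cores currently online */
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    printf("Online cores: %ld\n", cores);
    return 0;
}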
Single Core CPU
Why Multi-core?
The performance of a processor depends on:
• No. of instructions executed
• Clock cycles per instruction (CPI)
• Clock frequency / period
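These three factors combine in the standard processor performance equation (a textbook identity, stated here for completeness):

\text{CPU time} = \text{Instruction count} \times \text{CPI} \times T_{\text{clock}} = \frac{\text{Instruction count} \times \text{CPI}}{f_{\text{clock}}}

Because raising the clock frequency further ran into power and heat limits, adding cores became the practical way to keep improving throughput.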
Multi-core architecture
About Multi-core
• First multi-core processor from Intel in 2005
• MIMD architecture
• Uses either a homogeneous or a heterogeneous configuration
Various generations:
• Dual core – e.g. Intel Core Duo
• Quad core – e.g. Intel i5 processor
• Hexa core – e.g. Intel i7 Extreme Edition 980X processor
• Octa core – e.g. Intel Xeon E7-2820 processor
• Deca core – e.g. Intel Xeon E7-2850 processor
Example: Quad-core processor
Homogeneous Multicore Processor
Heterogeneous Multicore Processor
Pros:
• Isolation
• Obsolescence avoidance
• Hardware cost
Cons:
• Non-determinism
Shared vs Distributed Memory
Shared Memory Architecture
• The presence of large, efficient instruction and data caches reduces the load a single processor puts
on the memory bus and memory, allowing these resources to be shared among multiple processors.
• Private data are data items used only by a single processor, while shared data are data values used
by multiple processors
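A minimal sketch of private vs. shared data in a shared-memory program, assuming OpenMP (compile with e.g. gcc -fopenmp):

#include <stdio.h>
#include <omp.h>

int main(void) {
    int shared_sum = 0;   /* shared: one copy visible to all threads */

    #pragma omp parallel
    {
        int private_id = omp_get_thread_num();  /* private: one copy per thread */

        /* Updates to shared data must be synchronized */
        #pragma omp atomic
        shared_sum += private_id;
    }

    printf("Sum of thread ids = %d\n", shared_sum);
    return 0;
}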
Non-Uniform Memory Access (NUMA)
• The shared memory is physically distributed among all the processors; these memories are called local memories.
• The collection of all local memories forms a global address space which can be accessed by all the processors.
Disadvantages:
• Lack of scalability between memory and CPUs. Adding more CPUs can geometrically increase
traffic on the shared memory-CPU path and, for cache-coherent systems, geometrically increase
traffic associated with cache/memory management.
• The programmer is responsible for synchronization constructs that ensure "correct" access of global
memory.
Distributed Memory Architecture
• Processors have their own local memory.
• Memory addresses in one processor do not map to another processor, so there is no concept of global
address space across all processors.
• Changes a processor makes to its local memory have no effect on the memory of other processors. Hence, the
concept of cache coherency does not apply.
• When a processor needs access to data in another processor's memory, it is usually the task of the programmer to
explicitly define how and when data is communicated, as the message-passing sketch below illustrates.
• Synchronization between tasks is likewise the programmer's responsibility.
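A minimal message-passing sketch under the assumption that MPI is the communication library (compile and run with e.g. mpicc and mpirun -np 2; the value 42 is illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;   /* exists only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 cannot read rank 0's memory; the data must be received */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}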
Advantages:
• Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionately.
• Each processor can rapidly access its own memory without interference and without the overhead incurred in trying to maintain global cache coherency.
• Cost effectiveness: can use commodity, off-the-shelf processors and networking.
Disadvantages:
• The programmer is responsible for many of the details associated with data communication between processors.
• It may be difficult to map existing data structures, based on global memory, to this memory organization.
• Non-uniform memory access times: data residing on a remote node takes longer to access than node-local data.
Distributed Shared Memory Architecture
The largest and fastest computers in the world today employ both shared and distributed memory
architectures.
The shared memory component can be a shared memory machine and/or graphics processing
units (GPUs).
The distributed memory component is the networking of multiple shared memory/GPU machines,
which know only about their own memory - not the memory on another machine. Therefore,
network communications are required to move data from one machine to another.
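A sketch of this hybrid model, assuming MPI between nodes and OpenMP within each node (compile with e.g. mpicc -fopenmp; production codes would use MPI_Init_thread when making MPI calls from threads):

#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Threads share this node's memory... */
    #pragma omp parallel
    printf("Node (rank) %d, thread %d\n", rank, omp_get_thread_num());

    /* ...but data on other nodes must move over the network via MPI. */
    MPI_Finalize();
    return 0;
}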