Distributed Computing: Beakal Gizachew Assefa

Distributed Computing
Beakal Gizachew Assefa

(beakalassefa@gmail.com)
Course Objectives and Prerequisite
Distributed systems: a significant field of computer science and
an active area of research.
Objectives:
• Understanding of the concepts that underlie distributed computing
systems along with design and implementation issues
• Key mechanisms, algorithms and models of distributed systems
• Practical work: mechanisms such as client/server and peer-to-peer
communication, remote procedure call (RPC/RMI), multicasting,
consistency, P2P algorithms, cloud computing, and platforms such as
AWS, PlanetLab, Cassandra, Redis, MS Azure.
Prerequisite: None (Operating Systems)
Knowledge of Java, C/C++ programming
Textbook and Course Material
Distributed Systems: Principles and Paradigms, 3/e,
Andrew S. Tanenbaum, Maarten van Steen,
ISBN 0-13-239227-5, Prentice Hall, 2017.
Get your personalized digital copy of the book here
e-learning platform
Assignment and Project submissions
Learning outcomes
Students who complete this course successfully are
expected to:
• gain extensive knowledge of the principles and concepts of
distributed computing systems, including middleware,
peer-to-peer distributed algorithms, communication,
coordination, consistency, replication and fault tolerance.
• understand key mechanisms and models for distributed
system design and analysis.
• learn how to implement and evaluate distributed services
using software tools and distributed platforms.
• practice the design and implementation of distributed
cloud services and algorithms.
Course Components
4 Programming Assignments 50%

Term Project 25%
Final Exam 25%
Attendance and Policy

Students taking this course are required to attend classes on time.
Attendance will be taken at random.
PlanetLab is a global research network platform that supports
the development of new distributed services. Since the beginning of
2003, more than 1,000 researchers have used PlanetLab to develop
new technologies for distributed storage, network mapping, peer-to-
peer systems, distributed hash tables, and query processing.
PlanetLab currently consists of 1353 nodes at 717 sites.
MS Azure is an open and flexible cloud platform that enables you
to quickly build, deploy and manage applications across a global
network of Microsoft-managed datacenters. You can build
applications using any language, tool or framework.
Cloud computing, Azure Global Data Centers

Amazon Web Services (AWS)
On-demand cloud computing platforms.
Highly available clusters of virtual machines throughout the world.
In this course: Amazon EC2 (Elastic Compute Cloud)

Enables distributed applications/services to be executed on a set of
distributed virtual machines across the globe.
Motivation
• Distributed systems:
Significant field of computer science
Evolving research area
• Forms a good knowledge and practical base
• Well-prepared course projects can be
submitted to a conference for publication
• Successful project work can be extended to be
presented at a conference
Contents
• Introduction: definition, middleware, characteristics, and goals of a distributed
system, scalability dimensions and techniques
• Distributed algorithms and features, case study: logical clocks, happens-before
relation, Lamport’s algorithm, total ordering of events
• Decentralized architectures, P2P systems: structured, unstructured, hybrid
• DHT (Distributed Hash Tables): Chord system, logical ring, finger table, lookup
• Threads in distributed systems, virtualization, server clusters, AWS, PlanetLab
• Multicast communication: basic multicast, reliable multicast, ordered multicast,
types of ordering, implementation algorithms
• Vector timestamps, causality, guaranteeing causal message delivery, case
study
• Global state, cuts, distributed snapshot algorithm, example usage
Contents (cont’d)
• Election algorithms: assumptions and goal, bully algorithm, ring algorithm
• Distributed mutual exclusion: centralized, distributed and token ring algorithms
• Distributed deadlocks: deadlock detection algorithms: centralized and
distributed; deadlock prevention algorithms: wait-die and wound-wait
• RPC/RMI (Remote Procedure Call/Remote Method Invocation)
communication
• Fault tolerance: failure models, failure masking by redundancy, failure masking
and replication, agreement in faulty systems
• Distributed commit: two-phase commit, three-phase commit
• Distributed recovery: checkpointing, message logging
• Distributed transactions and concurrency control
• Consistency and replication, consistency protocol
• Cloud computing
Introduction
Slides adapted from the instructional material of the textbook
Distributed Systems: Principles and Paradigms by Andrew S. Tanenbaum and Maarten van Steen.
Distributed System: Definition (1)
Autonomous computing elements
Coherent system
Middleware: the OS of distributed systems
The middleware layer extends over multiple machines, and offers each application the same interface.
Distributed System: Definition (2)
• A network is the infrastructure that links the computers,
workstations, terminals, servers, etc.
• A network application is one that fetches needed data from
servers over the network.
• A distributed system is more complex software
designed to run on a network. Such a system has multiple
processes that cooperate to do some task.
How do distributed systems differ
from network applications?
• Distributed systems may have many components but are
often designed to mimic a single, non-distributed process
running at some single place.
• State is spread around in a distributed system
• Network application is free-standing and self-centered.
• Distributed system is spread out and decentralized.
Design Goals
Design Goal 1: Sharing resources
Design Goal 2: Distribution transparency
Degree of distribution transparency
Observation
Aiming at full distribution transparency may be too much:
There are communication latencies that cannot be hidden
Completely hiding failures of networks and nodes is (theoretically and
practically) impossible
You cannot distinguish a slow computer from a failing one
You can never be sure that a server actually performed an operation
before a crash
Full transparency will cost performance, exposing distribution of the
system
Keeping replicas exactly up-to-date with the master takes time
Immediately flushing write operations to disk for fault tolerance
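The point that a slow computer cannot be distinguished from a failing one can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function `classify` and its parameters are hypothetical names introduced here, and the delays are simulated values rather than real network measurements.

```python
def classify(response_delay_s, timeout_s):
    """Return what a client can conclude after waiting timeout_s seconds.

    response_delay_s is None when the server has crashed and never replies
    (an assumption of this simulation).
    """
    if response_delay_s is not None and response_delay_s <= timeout_s:
        return "alive"
    # No reply within the timeout: the client cannot tell why.
    return "suspected"


slow_server = classify(response_delay_s=5.0, timeout_s=1.0)
crashed_server = classify(response_delay_s=None, timeout_s=1.0)
print(slow_server, crashed_server)
```

From the client's side both cases look identical, which is why failures cannot be hidden completely.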
Design Goal 3: Openness of Distributed
Systems
Policies vs Mechanisms
On Strict Separation between Policy and Mechanism
Design Goal 4: Scale in Distributed
Systems
Size Scalability
Formal Analysis of the Size Scaling Problem
Problem with Geographical Scaling
Problem with Administrative Scaling
Techniques for scaling
Hide communication latencies

Make use of asynchronous communication
Have separate handler for incoming response
Problem: not every application fits this model
Facilitate solution by moving computations to client
[Figure: the difference between letting (a) a server or (b) a client check a form (FIRST NAME, LAST NAME, E-MAIL) as it is being filled in. In (a) the client sends each keystroke to the server, which checks and processes the form. In (b) the client checks the form and sends only the completed form to the server for processing.]
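The asynchronous-communication technique listed above (issue the request, keep working, and let a separate handler deal with the incoming response) might be sketched as follows. The names `remote_call` and `handle_response` are hypothetical, and a sleeping thread stands in for a real remote server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(x):
    """Stand-in for a slow remote service (hypothetical)."""
    time.sleep(0.1)  # simulated network latency
    return x * 2


results = []


def handle_response(future):
    """Separate handler, invoked when the response arrives."""
    results.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(remote_call, 21)
    future.add_done_callback(handle_response)
    local_work = sum(range(1000))  # the caller keeps working meanwhile
# Leaving the with-block waits for the outstanding call to finish.
print(local_work, results)
```

The caller is never blocked on the reply itself, which hides the latency only if there is other useful work to do, as the slide notes.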

Partition data and computations across multiple machines

Move computations to clients (Java applets)
Decentralized naming services (DNS)
Decentralized information systems (WWW)
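One common way to partition data across machines is to hash each key to a server. The sketch below is an illustration of that idea only, not the scheme used by DNS or the Web; the server names and the `server_for` helper are assumptions made for the example.

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical machine names


def server_for(key):
    """Map a key to one server using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


placement = {}
for key in ["alice", "bob", "carol", "dave"]:
    placement.setdefault(server_for(key), []).append(key)
print(placement)
```

Every client computes the same placement independently, so no central directory has to be consulted for each lookup.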

Replication and caching: Make copies of data available at different machines

Replicated file servers and databases
Mirrored Web sites

Web caches (in browsers and proxies)
File caching (at server and client)
Scaling: The problem with replication
Applying replication is easy, except for one thing

Having multiple copies (cached or replicated), leads to inconsistencies:
modifying one copy makes that copy different from the rest.
Always keeping copies consistent and in a general way requires global
synchronization on each modification.
Global synchronization precludes large-scale solutions.
Observation
If we can tolerate inconsistencies, we may reduce the need for global
synchronization, but tolerating inconsistencies is application dependent.
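The trade-off can be made concrete with two copies of one data item. This is a minimal sketch with hypothetical names (`Replica`, `write_local`, `write_synchronized`); plain dictionaries stand in for copies held on different machines.

```python
class Replica:
    """One copy of the data, as held by a single machine."""

    def __init__(self):
        self.data = {}


primary, cache = Replica(), Replica()


def write_local(key, value):
    """Update only the primary copy: cheap, but the copies diverge."""
    primary.data[key] = value


def write_synchronized(key, value):
    """Update every copy before returning: consistent, but each write
    now has to reach all replicas."""
    for replica in (primary, cache):
        replica.data[key] = value


write_synchronized("x", 1)
write_local("x", 2)
stale = cache.data["x"] != primary.data["x"]
print(primary.data, cache.data, stale)
```

With only two copies the synchronized write is cheap; the cost grows with the number and distance of replicas, which is the scalability problem described above.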
Architecture Styles
(Self Study)
Architectural styles
Basic idea
A style is formulated in terms of
(replaceable) components with well-defined interfaces
the way that components are connected to each other
the data exchanged between components
how these components and connectors are jointly configured into a
system.
Connector
A mechanism that mediates communication, coordination, or cooperation
among components. Example: facilities for (remote) procedure call,
messaging, or streaming.
Styles of Architecture Cont.
• Layered Architecture
• Object Oriented and Service Oriented
• RESTful architectures
• Publish-subscribe architectures
Layered architecture
Different layered organizations
[Figure: three layered organizations: (a) pure layered organization, (b) mixed layered organization, (c) layered organization with upcalls. Arrow labels: request/response downcall, one-way call, handle, upcall.]

Application Layering
Traditional three-layered view

Application-interface layer contains units for interfacing to users or
external applications
Processing layer contains the functions of an application, i.e., without
specific data
Data layer contains the data that a client wants to manipulate through the
application components
Observation
This layering is found in many distributed information systems, using traditional
database technology and accompanying applications.
Example: a simple search engine
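[Figure: a simple search engine organized into three levels.
User-interface level: the user interface passes a keyword expression down and receives an HTML page containing a list.
Processing level: a query generator turns the keyword expression into database queries; a ranking algorithm turns the returned Web page titles with meta-information into a ranked list of page titles; an HTML generator builds the page.
Data level: a database with Web pages.]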
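The three levels of the search-engine example can be mimicked in a few functions. This is a toy sketch: the `PAGES` data, the ranking rule (count of matching keywords), and all function names are assumptions for illustration, not part of the textbook example.

```python
# Data level: hypothetical page titles with the keywords they contain.
PAGES = {
    "Distributed Systems Intro": {"distributed", "systems"},
    "Operating Systems Notes": {"operating", "systems"},
    "Cooking Basics": {"cooking"},
}


def query_database(keywords):
    """Data level: return (title, matching keywords) for matching pages."""
    return [(t, kw & keywords) for t, kw in PAGES.items() if kw & keywords]


def rank(matches):
    """Processing level: order titles by number of matching keywords."""
    return [t for t, hit in sorted(matches, key=lambda m: (-len(m[1]), m[0]))]


def to_html(titles):
    """Processing level: turn the ranked list into an HTML page."""
    return "<ul>" + "".join(f"<li>{t}</li>" for t in titles) + "</ul>"


def search(expression):
    """User-interface level: accept a keyword expression, return HTML."""
    keywords = set(expression.lower().split())
    return to_html(rank(query_database(keywords)))


page = search("distributed systems")
print(page)
```

Each function only talks to the level directly below it, so any level could be moved to a different machine without changing the others.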
Object-based style
Essence
Components are objects, connected to each other through procedure calls.
Objects may be placed on different machines; calls can thus execute across a
network.
[Figure: objects, each encapsulating state behind an interface of methods, connected to each other by method calls.]
Encapsulation
Objects are said to encapsulate data and offer methods on that data without
revealing the internal implementation.
RESTful architectures
Essence
View a distributed system as a collection of resources, individually managed by
components. Resources may be added, removed, retrieved, and modified by (remote)
applications.
1. Resources are identified through a single naming scheme
2. All services offer the same interface
3. Messages sent to or from a service are fully self-described
4. After executing an operation at a service, that component forgets everything about
the caller
Basic operations
Operation   Description
PUT         Create a new resource
GET         Retrieve the state of a resource in some representation
DELETE      Delete a resource
POST        Modify a resource by transferring a new state
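The four operations in the table can be tried out against an in-memory stand-in for a service. This is a sketch only: the class `ResourceStore`, its `handle` method, and the use of conventional HTTP status codes as return values are assumptions of the example, and no real HTTP traffic is involved.

```python
class ResourceStore:
    """In-memory stand-in for a service that manages resources by name."""

    def __init__(self):
        self._resources = {}

    def handle(self, operation, name, state=None):
        """Apply one self-contained request; nothing about the caller is kept."""
        if operation == "PUT":        # create a new resource
            self._resources[name] = state
            return 201, None
        if name not in self._resources:
            return 404, None
        if operation == "GET":        # retrieve the state
            return 200, self._resources[name]
        if operation == "POST":       # modify by transferring a new state
            self._resources[name] = state
            return 200, None
        if operation == "DELETE":     # delete the resource
            del self._resources[name]
            return 204, None
        return 405, None


store = ResourceStore()
store.handle("PUT", "/notes/1", "draft")
store.handle("POST", "/notes/1", "final")
print(store.handle("GET", "/notes/1"))
store.handle("DELETE", "/notes/1")
print(store.handle("GET", "/notes/1"))
```

Every request names the resource and carries the full new state, so the service needs no memory of earlier requests from the same caller.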
Example: Amazon’s Simple Storage Service
Essence
Objects (i.e., files) are placed into buckets (i.e., directories). Buckets cannot be
placed into buckets. Operations on ObjectName in bucket BucketName require
the following identifier:
http://BucketName.s3.amazonaws.com/ObjectName
Typical operations
All operations are carried out by sending HTTP requests:
Create a bucket/object: PUT, along with the URI
Listing objects: GET on a bucket name
Reading an object: GET on a full URI
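The typical operations above map onto HTTP requests that can be constructed (but not sent) with Python's standard library. This is a sketch under stated assumptions: the bucket name `my-bucket`, the object name `notes.txt`, and the helper `object_uri` are hypothetical, and a real S3 request would additionally need authentication, which is omitted here.

```python
from urllib.request import Request


def object_uri(bucket, obj=""):
    """Build the identifier http://BucketName.s3.amazonaws.com/ObjectName."""
    return f"http://{bucket}.s3.amazonaws.com/{obj}"


# Create an object: PUT, along with the URI (the body is the object's content).
create_object = Request(object_uri("my-bucket", "notes.txt"),
                        data=b"hello", method="PUT")
# List objects: GET on a bucket name.
list_objects = Request(object_uri("my-bucket"), method="GET")
# Read an object: GET on a full URI.
read_object = Request(object_uri("my-bucket", "notes.txt"), method="GET")

for req in (create_object, list_objects, read_object):
    print(req.get_method(), req.full_url)
```

The requests are only built and printed; nothing is transmitted.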
Publish-Subscribe Architectures
Event-based and Shared data space
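[Figure: (a) event-based architecture: components subscribe to, publish on, and receive notification delivery from an event bus. (b) Shared data-space architecture: components publish, subscribe, and receive data delivery through a shared data space.]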
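The event-based variant can be sketched as a tiny in-process event bus. The class `EventBus`, its method names, and the topic strings are hypothetical; a real system would deliver notifications across the network rather than through direct function calls.

```python
from collections import defaultdict


class EventBus:
    """Connects publishers and subscribers that do not know each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver a notification to every subscriber of the topic."""
        for callback in self._subscribers.get(topic, []):
            callback(event)


bus = EventBus()
received = []
bus.subscribe("temperature", received.append)
bus.publish("temperature", 21.5)
bus.publish("humidity", 40)  # no subscriber: nobody is notified
print(received)
```

The publisher never names its receivers, which is the decoupling this style is built on.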
System Architectures
• Centralized
• Distributed
Centralized Architectures
Multi-tiered Centralized Architectures
Terminal only displays whatever the server sends
Client can send requests
Ex.: database server - application server - web server
Ex.: client - application server - database server
In 3-tiered, separate these two
Being client and server at the same time
Alternative organizations
data sharding
(Reading: Section 2.3 of the textbook)
