Arba Minch University
Institute of Technology
Faculty of Computing & Software Engineering
Intro to Distributed System
(CoSec4038)
Mr. Addisu M. (Asst. Prof) G3 SE - Regular
Chapter 5
Naming System in DS
Introduction
Names play an important role in all computer systems.
They are used to share resources, uniquely identify entities, refer to
locations, and convey further information about the given entities.
An important issue with naming is that a name can be resolved
to the entity it refers to.
Name resolution thus allows a process to access the named
entity.
To resolve names, it is necessary to implement a naming system
In a DS, the implementation of a naming system is itself often
distributed across multiple machines, unlike in a non-distributed system.
How this distribution is done plays a key role in the efficiency
and scalability of the naming system.
Some Terminology
Name - a string of bits/characters used to identify/refer to
entities, locations, resources & more in a context
Simply comparing two names, we might not be able to know if
they refer to the same entity
Can be human-friendly (or not) & location dependent (or not)
• File names, www.amu.edu.et, variable names, etc. are human-friendly
names given to each entity
Textual names (human readable)
used to identify individual services or people, e.g.
email address: Hans.Mair@inf.unibz.it
URL: www.google.com
or groups of people or objects, e.g.
mailing lists: professors@unibz.it
mail domains (if there are several mail exchangers)
Some Terminology
Numeric addresses (identify the location of an object)
locate individual resources, e.g.
193.206.186.100 (IP host address)
special case: group addresses, e.g.
multicast and broadcast addresses: IP Multicast, Ethernet
• Question: how to map/resolve these names to addresses so that we
can access the entities on which we want to operate?
• Solution: have a naming system that maintains name-to-address
bindings!
Purpose: to identify an entity and act as an access point to it
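A minimal sketch of a name-to-address binding, assuming a simple in-memory table. The class NameTable, the name "printer-2", the port 9100 and the second IP address are hypothetical and only serve as illustration; this is not a real naming service.

```python
class NameTable:
    """Toy naming system: maps a name to the addresses of its access points."""

    def __init__(self):
        self._bindings = {}  # name -> list of (ip, port) addresses

    def bind(self, name, address):
        """Add an address for a name; an entity may have several access points."""
        self._bindings.setdefault(name, []).append(address)

    def rebind(self, name, old, new):
        """Replace one address when the entity moves; the name stays the same."""
        addresses = self._bindings[name]
        addresses[addresses.index(old)] = new

    def resolve(self, name):
        """Return all current addresses for a name (raises KeyError if unbound)."""
        return list(self._bindings[name])


table = NameTable()
table.bind("printer-2", ("10.144.5.30", 9100))        # hypothetical binding
before = table.resolve("printer-2")
table.rebind("printer-2", ("10.144.5.30", 9100), ("10.144.7.12", 9100))
after = table.resolve("printer-2")
print(before, after)
```

Clients keep using the name "printer-2" after the move; only the binding inside the table changes.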
Some Terminology
An entity is anything that has a particular task
• E.g., resources such as hosts, printers, disks, files,
processes, users, mailboxes, web pages, messages,
network connections, etc.
• Entities can be operated on. For example:
• a resource such as a printer offers an interface containing
operations for printing a document, requesting the status of a
print job, and the like
• an entity such as a network connection may provide
operations for sending and receiving data, setting quality-of-
service parameters, requesting the status, and so forth
• To operate on an entity, it is necessary to access it through its
access point
• E.g., to use a printer, you first need access to it, for instance through the cable that connects it to the PC
Some Terminology
Access point - a special kind of entity used to access another entity; its
name is called an address (address of the entity, e.g., IP, port #, phone #)
• Address of an access point of an entity is also simply called an
address of that entity
• An entity may have more than one access point/address (similar to
accessing an individual through different telephone numbers)
• An entity may change its access points/addresses over time
• Example: telephone numbers and e-mail addresses change; a computer that moves from
one network to another gets a different IP address, ...
• Using an address as a reference is inflexible and human unfriendly
• A better approach is to use a name that is location independent,
much easier, and flexible to use
Some Terminology
Name vs Address
Name: an identifier permanently associated with an object,
independent of its location within the DS
A name is how an endpoint is referenced.
Typically, no structurally significant hierarchy
“Desta”, “Hawasa”, “Banana”
Address: an identifier associated with the current location
of the object
An address is how you get to an endpoint
Typically, hierarchical (for scaling):
Arba Minch University, Arba Minch City, Ethiopia
10.144.5.30, +251-928 364289,
www.abebe.kebede@amu.edu.et
Some Terminology
Example
Telephone as Access Point to a person
Telephone – access point, telephone number – address of
person, person (entity) can have several telephone numbers
(addresses)
In a DS, a host running a specific server is an access point,
with its address formed by the combination of an IP
address and a port number
In general, we don’t want to use an address as a regular name,
because it changes! So we want a location-independent name!
Some Terminology
An address is thus just a special kind of name: it refers to an access
point of an entity.
• Because an access point is tightly associated with an entity, it would seem
convenient to use the address of an access point as a regular name for
the associated entity
Nevertheless, this is hardly ever done, as such naming is
generally very inflexible and often human unfriendly
• For example, it is not uncommon to regularly reorganize a distributed system,
so that a specific server is now running on a different host than previously
An entity may easily change an access point, or an access point may
be reassigned to a different entity
If an address is used to refer to an entity, we will have an invalid
reference the instant the access point changes or is reassigned to
another entity
Some Terminology
Therefore, it is much better to let a service be known by a
separate name independent of the address of the associated
server
Likewise, if an entity offers more than one access point, it is not
clear which address to use as a reference.
For instance, many organizations distribute their Web service
across several servers
If we used the addresses of those servers as a reference for
the Web service, it would not be obvious which address should be
chosen as the best one
A much better solution is to have a single name for the Web
service, independent of the addresses of the different Web servers
A name for an entity that is independent of its addresses is
often much easier and more flexible to use
Some Terminology
Identifier - a special name that uniquely identifies an entity
refers to only one entity
A true identifier has the following three properties:
• P1: Each identifier refers to at most one entity
• P2: Each entity is referred to by at most one identifier
• P3: An identifier always refers to the same entity (no reuse)
Address: the name of an access point, i.e., the location of an entity
E.g., a phone number, or IP address + port for a service
Addresses and identifiers are important and used for
different purposes, but both are often represented in
machine-readable format (MAC address, memory address)
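The three properties can be checked mechanically. The sketch below assumes the bindings are observed as a sequence of snapshots, each a set of (identifier, entity) pairs; the function name and the sample values are hypothetical.

```python
def violated_properties(snapshots):
    """Return the subset of {"P1", "P2", "P3"} violated by the observed bindings.

    snapshots: list of sets of (identifier, entity) pairs, one set per point in time.
    """
    violated = set()
    first_seen = {}  # identifier -> entity it referred to when first observed
    for bindings in snapshots:
        ids = [ident for ident, _ in bindings]
        entities = [entity for _, entity in bindings]
        if len(entities) != len(set(entities)):
            violated.add("P2")  # one entity is referred to by two identifiers
        if len(ids) != len(set(ids)):
            violated.add("P1")  # one identifier refers to two entities at once
            continue            # snapshot is ambiguous, skip the reuse check
        for ident, entity in bindings:
            if first_seen.setdefault(ident, entity) != entity:
                violated.add("P3")  # identifier was reused for another entity
    return violated


# A phone number that is reassigned to another person is not a true identifier.
reused = violated_properties([{("555-0101", "alice")}, {("555-0101", "bob")}])
print(reused)
```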
Naming System
An important issue with naming is that a name can be
resolved to the entity it refers to; name resolution thus
allows a process to access the named entity
The naming facility of a distributed OS enables users and
programs to assign character-string names to objects and
subsequently use these names to refer to those objects
The locating facility, which is an integral part of the naming
facility, maps an object’s name to the object’s location
The naming and locating facilities jointly form a naming system
that provides users with an abstraction of an object that hides
the details of how and where the object is actually located in
the network
Naming System
A naming system is middleware that assists in name
resolution
• It is one of the most important components of a distributed
OS because it enables objects to be identified and accessed
in a uniform manner
• In principle, a naming system maintains a name-to-address
binding and resolves a name (or identifier) to an address; in its
simplest form it is just a table of (name, address) pairs
Naming systems are classified into three classes based on the type of names used:
Flat/Unstructured naming
Structured naming
Attribute-based naming
Type of Naming System
Flat Naming
Identifiers are convenient to uniquely represent entities.
In many cases, identifiers are simply random bit strings (which
we conveniently refer to as unstructured/flat names)
Property of a flat name: it does not contain any information
whatsoever on how to locate the access point of its associated entity
Good for machines
Example: Internet Address at the Network layer
How do we resolve flat names or, equivalently, how can we
locate an entity when given only its identifier?
Simple Solutions (Broadcasting and Forwarding Pointers)
Home-Based Solutions
Distributed Hash Table (Reading)
Hierarchical Approach
Type of Naming System
Flat Naming - Broadcasting
Consider a DS built on a computer network that offers efficient
broadcasting facilities
Approach:
Broadcast the name/address to the whole network; the entity
associated with the name responds with its current identifier
It requires all processes to listen to incoming location requests
Such facilities are offered by local-area networks in which all
machines are connected to a single cable
Locating an entity is simple
A message containing the identifier of the entity is broadcast to each machine
and each machine is requested to check whether it has that entity
Only the machines that can offer an access point for the entity send a
reply message containing the address of that access point
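The lookup just described can be simulated in a few lines. This is only a sketch: the Machine class, the entity identifiers and the addresses are hypothetical, and the "broadcast" is a plain loop rather than real network traffic.

```python
class Machine:
    def __init__(self, address, entities):
        self.address = address         # address of the access point offered here
        self.entities = set(entities)  # identifiers of the entities hosted here

    def on_request(self, identifier):
        """Every machine must inspect the request; only a host of the entity replies."""
        return self.address if identifier in self.entities else None


def broadcast_locate(machines, identifier):
    """Deliver the request to every machine and collect the replies.

    Returns (replies, delivered) so the cost of broadcasting is visible.
    """
    replies, delivered = [], 0
    for machine in machines:
        delivered += 1
        answer = machine.on_request(identifier)
        if answer is not None:
            replies.append(answer)
    return replies, delivered


lan = [
    Machine("10.0.0.11", {"db-7"}),
    Machine("10.0.0.12", {"printer-2"}),
    Machine("10.0.0.13", set()),
]
replies, delivered = broadcast_locate(lan, "printer-2")
print(replies, delivered)
```

One reply comes back, but all three machines had to process the request, which is why the approach stops scaling as the network grows.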
Type of Naming System
Flat Naming - Broadcasting
Broadcasting becomes inefficient when the network grows
network bandwidth is wasted by request messages
too many hosts may be interrupted by requests they cannot answer
Example: Address Resolution Protocol (ARP)
Resolves an IP address to a MAC address
Identifies the MAC address by broadcasting the IP address to all the nodes
The node to which that IP address belongs replies with its MAC address
[Figure: a host broadcasts "Who has the address 192.168.0.1?"; the other hosts do not answer (x); the owner replies "I am 192.168.0.1. My identifier is 02:AB:4A:3C:59:85"]
Challenges:
Not scalable in large networks
Requires all entities to listen to all requests
Solution: switch to multicasting, by which only a
restricted group of hosts receives the request
Type of Naming System
Flat Naming - Forwarding Pointers
Another popular approach to locating mobile entities is to make
use of forwarding pointers and the principle is simple: when an
entity moves from A to B, it leaves behind in A a reference to its
new location at B
The main advantage of this approach is its simplicity: as soon as
an entity has been located, for example by using a traditional
naming service, a client can look up the current address by
following the chain of forwarding pointers
Example: postal service
One person moves from Arba Minch → Hawassa → Addis Ababa → Adama → …….
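Following a chain of forwarding pointers can be sketched as below, using the city names from the example. The dictionary representation and the max_hops guard are assumptions made for illustration, not part of any particular system.

```python
def locate(start, forwarding, max_hops=10):
    """Follow forwarding pointers from a known old location to the current one.

    forwarding maps a location to the location the entity moved to next;
    the current location has no entry. Returns (location, hops).
    """
    location, hops = start, 0
    while location in forwarding:
        if hops == max_hops:
            raise RuntimeError("chain too long or cyclic")
        location = forwarding[location]
        hops += 1
    return location, hops


# Each move leaves a pointer behind at the old location.
forwarding = {
    "Arba Minch": "Hawassa",
    "Hawassa": "Addis Ababa",
    "Addis Ababa": "Adama",
}
current, hops = locate("Arba Minch", forwarding)

# If one pointer is lost, the chain silently ends at a stale location.
broken = {k: v for k, v in forwarding.items() if k != "Hawassa"}
stale, _ = locate("Arba Minch", broken)
print(current, hops, stale)
```

The hop count grows with every move, and a single lost pointer makes the entity unreachable from older locations.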
Type of Naming System
Flat Naming - Forwarding Pointers
Each forwarding pointer is implemented as a pair:
(client stub, server stub)
server stub contains a local reference to the actual object or a
local reference to another client stub
When an object moves from A (e.g., P2) to B (e.g., P3),
It leaves a client stub at A (i.e., P2)
It installs a server stub at B (i.e., P3)
[Figure: processes P1 to P4, showing the remote object, the caller object, and the server stubs and client stubs that link them]
Type of Naming System
Flat Naming - Home-Based Approach
The use of broadcasting and forwarding pointers imposes scalability
problems:
broadcasting or multicasting is difficult to implement
efficiently in large-scale networks
long chains of forwarding pointers introduce performance problems
and are susceptible to broken links
A popular approach to supporting mobile entities in large-scale
networks is to introduce a home location, which keeps track of the
current location of an entity
An entity’s home address is registered at a naming service
The home registers the foreign address of the entity
Clients always contact the home first, and then continue with the foreign location
Type of Naming System
Flat Naming - Home-Based Approach
[Figure: mobile entity, home node, and client]
1. Update home node about the foreign address
2. Client sends the packet to the mobile entity at its home node
3a. Home node forwards the message to the foreign address of the mobile entity
3b. Home node replies to the client with the current IP address of the mobile entity
4. Client sends all subsequent packets directly to the foreign address of the mobile entity
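The numbered steps can be sketched as follows, assuming an in-memory home node and a client that caches the address it learns. HomeNode, Client, "laptop-7" and the IP address are hypothetical names and values; step 3a (forwarding the first message) is left out for brevity.

```python
class HomeNode:
    """Keeps track of the current (foreign) address of each mobile entity."""

    def __init__(self):
        self._foreign = {}  # entity identifier -> current foreign address

    def update(self, entity, foreign_address):
        # Step 1: the mobile entity reports its foreign address to its home.
        self._foreign[entity] = foreign_address

    def lookup(self, entity):
        # Step 3b: the home replies with the entity's current address.
        return self._foreign[entity]


class Client:
    def __init__(self, home):
        self.home = home
        self.known = {}         # entity -> foreign address learned from the home
        self.home_contacts = 0  # how often the home node had to be asked

    def destination(self, entity):
        """Return the address the next packet for this entity is sent to."""
        if entity not in self.known:
            # Step 2: the first packet goes via the home node.
            self.home_contacts += 1
            self.known[entity] = self.home.lookup(entity)
        # Step 4: later packets go directly to the foreign address.
        return self.known[entity]


home = HomeNode()
home.update("laptop-7", "172.16.4.20")
client = Client(home)
first = client.destination("laptop-7")
second = client.destination("laptop-7")
print(first, second, client.home_contacts)
```

The sketch does not handle the entity moving again after the client has learned its address; a fuller version would need to refresh or invalidate that entry.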
Type of Naming System
Structured Naming
Flat names are good for machines, but are generally not very
convenient for humans to use (difficult for humans to remember)
Structured names are composed of simple, human-readable names
Names are arranged in a specific structure
Examples:
file system naming and host naming on the Internet follow this
approach
File-systems utilize structured names to identify files
/home/userid/work/dist-systems/naming.txt
Websites can be accessed through structured names
www.diet.engg.cse.ce
Names in DS are organized into Name Spaces
Type of Naming System
Structured Naming - Name Spaces
A name space for structured names can be represented as a labeled,
directed graph with two types of nodes
Leaf node: represents a named entity and has no outgoing edges
stores information on the entity it is representing
E.g.: address of an entity (e.g., in DNS) so that a client can access it, or
state of (or the path to) an entity (e.g., in file systems)
Directory node: an entity that refers to other nodes
stores a table in which an outgoing edge is represented as a pair (edge
label, node identifier); such a table is called a directory table
Each path in a naming graph can be referred to by the sequence of
labels (path name) corresponding to the edges in that path, such as
N:<label-1, label-2, ..., label-n>
where N refers to the first node in the path
Type of Naming System
Structured Naming - Name Spaces (Example)
Path name (sequence of edge labels leading from one node to
another) is used to refer to a node in the graph
Absolute path: name starts from root node (e.g., /home/steen/mbox)
Relative path: name does not start at root node (e.g., steen/mbox)
[Figure: naming graph with its root node, looking up the entity with name “/home/steen/mbox”]
Type of Naming System
Structured Naming - Name Resolution
Name spaces offer a convenient mechanism for storing and
retrieving information about entities by means of names.
More generally, given a path name, it should be possible to look up
any information stored in the node referred to by that name
The process of looking up a name is called name resolution
Knowing how and where to start name resolution is generally
referred to as a closure mechanism
Closure mechanism:
Name resolution cannot be accomplished without an initial directory
node
It selects the implicit context from which to start name resolution
Type of Naming System
Structured Naming - Name Resolution
To know how name resolution works, let us consider a path
name such as N:<label1, label2, ..., labeln>
Resolution of this name
starts at node N of the naming graph, where name label1 is looked up in
the directory table, which returns the identifier of the node to which
label1 refers
then continues at the identified node by looking up the name label2 in
its directory table, and so on.
Assuming that the named path actually exists, resolution stops at the
last node referred to by labeln, by returning the content of that node
A name lookup returns the identifier of a node from where the
name resolution process continues
To continue, it is necessary to access the directory table of the identified node
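The resolution procedure above can be sketched over a small naming graph. The node identifiers (n0, n1, ...), the stored contents and the choice of n1 as the default starting node for relative paths are assumptions made for illustration.

```python
# Directory nodes hold a directory table: edge label -> node identifier.
# Leaf nodes hold the stored content (here, a plain string).
graph = {
    "n0": {"home": "n1", "keys": "n5"},  # root directory node
    "n1": {"steen": "n2"},               # directory node for /home
    "n2": {"mbox": "n3"},                # directory node for /home/steen
    "n3": "contents of mbox",            # leaf node
    "n5": "contents of keys",            # leaf node
}


def resolve(graph, start, labels):
    """Resolve the path name start:<label1, ..., labeln>; return the last node's content."""
    node_id = start
    for label in labels:
        table = graph[node_id]
        if not isinstance(table, dict):
            raise LookupError(f"{node_id} is a leaf node, cannot look up {label!r}")
        if label not in table:
            raise LookupError(f"no edge labeled {label!r} in node {node_id}")
        node_id = table[label]  # each lookup returns the next node's identifier
    return graph[node_id]


def resolve_path(graph, path, root="n0", cwd="n1"):
    """Closure mechanism: choose the starting node from the form of the path."""
    start = root if path.startswith("/") else cwd
    labels = [part for part in path.split("/") if part]
    return resolve(graph, start, labels)


absolute = resolve_path(graph, "/home/steen/mbox")
relative = resolve_path(graph, "steen/mbox")  # resolved relative to n1 (/home)
print(absolute, relative)
```

Both calls reach the same leaf node; they differ only in the node at which resolution starts.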
Type of Naming System
Attribute-Based Naming
Flat and structured names generally provide a unique and location-
independent way of referring to entities
Moreover, structured names have been partly designed to provide a
human-friendly way to name entities so that they can be conveniently
accessed
In most cases, it is assumed that the name refers to only a single
entity
However, location independence and human friendliness are not the
only criteria for naming entities.
As more information is being made available, it becomes important to
effectively search for entities
Users must be able to search for an entity by merely providing a
description of that entity.
Type of Naming System
Attribute-Based Naming
There are many ways in which descriptions can be provided, but
a popular one in DSs is to describe an entity in terms of
(attribute, value) pairs
In this approach, an entity is assumed to have an associated
collection of attributes
Each attribute says something about that entity
By specifying which values a specific attribute should have, a
user essentially constrains the set of entities that they are interested
in
Attribute-based naming systems
Directory Services
Hierarchical Implementations: LDAP
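A search by (attribute, value) pairs can be sketched as a filter over a list of directory entries. The entries, attribute names and values below are hypothetical, and the linear scan only stands in for what a real directory service such as LDAP does with indexes and a hierarchy.

```python
directory = [
    {"name": "lp-1", "kind": "printer", "color": "yes", "building": "B2"},
    {"name": "lp-2", "kind": "printer", "color": "no", "building": "B5"},
    {"name": "fs-1", "kind": "file server", "building": "B2"},
]


def search(directory, wanted):
    """Return the entries whose attributes match every (attribute, value) pair."""
    return [
        entry
        for entry in directory
        if all(entry.get(attr) == value for attr, value in wanted.items())
    ]


printers = [e["name"] for e in search(directory, {"kind": "printer"})]
in_b2 = [e["name"] for e in search(directory, {"kind": "printer", "building": "B2"})]
print(printers, in_b2)
```

Each additional (attribute, value) pair narrows the result set, and a query may match several entities or none.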
Desirable Features of A Good Naming System
1. Location Transparency: the name of an object
should not reveal any hint as to the physical location
of the object
That is, an object's name should be independent of the physical
connectivity or topology of the system, or the current location of
the object
2. Location Independency: For performance, reliability,
availability, and security reasons, DSs provide the facility of
object migration, which allows the movement & relocation of
objects dynamically among the various nodes of a system
Location independency means that the name of an object need
not be changed when the object’s location changes
Desirable Features of A Good Naming System
2. Location Independency:
Therefore, the requirement of location independency calls
for a global naming facility with the following two features
An object at any node can be accessed without the
knowledge of its physical location (location
independency of request-receiving objects)
An object at any node can issue an access request
without the knowledge of its own physical location
(location independency of request-issuing objects)
This property is known as user mobility
Desirable Features of A Good Naming System
3. Multiple User-Defined Names for the Same Object: For a shared
object, it is desirable that different users of the object can use their
own convenient names for accessing it.
Therefore, a naming system must provide the flexibility to assign multiple
user-defined names to the same object.
In this case, it should be possible for a user to change or delete his or her
name for the object without affecting those of other users.
4. Scalability: DSs vary in size ranging from one with a few nodes to
one with many nodes
Moreover, DSs are normally open systems, and their size changes
dynamically
Therefore, it is impossible to have an a priori idea about how large the set
of names to be dealt with is liable to get.
Hence, a naming system must be capable of adapting to the dynamically
changing scale of a DS, which normally leads to a change in the size of the
name space.
Desirable Features of A Good Naming System
5. Uniform Naming Convention: In many existing systems,
different ways of naming objects, called naming conventions, are
used for naming different types of objects.
For example, file names typically differ from user names and process
names. Instead of using such nonuniform naming conventions, a
good naming system should use the same naming convention for all
types of objects in the system.
6. Group Naming: A naming system should allow many different
objects to be identified by the same name.
Such a facility is useful to support a broadcast facility or to group
objects for conferencing or other applications.
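Group naming can be sketched as a table that maps one name to a set of member addresses. The group name and the member addresses below are hypothetical.

```python
groups = {}  # group name -> set of member addresses


def join(group, member):
    """Add a member to a group, creating the group on first use."""
    groups.setdefault(group, set()).add(member)


def resolve_group(group):
    """Resolve a group name to all of its members (empty list if unknown)."""
    return sorted(groups.get(group, ()))


join("conference-room", "10.0.0.21")
join("conference-room", "10.0.0.22")
members = resolve_group("conference-room")
print(members)
```

A message addressed to the group name would then be delivered to every address in the resolved list.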
Desirable Features of A Good Naming System
7. Meaningful Names: A name can be simply any character
string identifying some object.
However, for users, meaningful names are preferred to lower-level
identifiers such as memory pointers, disk block numbers,
or network addresses.
This is because meaningful names typically indicate something
about the contents or function of their referents, are easily
transmitted between users, and are easy to remember and use.
Therefore, a good naming system should support at least two
levels of object identifiers, one convenient for human users and
one convenient for machines
Desirable Features of A Good Naming System
8. Performance: The most important performance
measurement of a naming system is the amount of time
needed to map an object's name to its attributes, such as its
location
In a distributed environment, this performance is dominated
by the number of messages exchanged during the name-mapping
operation.
Therefore, a naming system should be efficient in the sense
that the number of messages exchanged in a name-mapping
operation should be as small as possible.
Desirable Features of A Good Naming System
9. Fault Tolerance: A naming system should be capable of
tolerating, to some extent, faults that occur due to the failure of a
node or a communication link in a DS network
That is, the naming system should continue functioning, perhaps in a
degraded form, in the event of these failures.
The degradation can be in performance, functionality, or both but
should be proportional, in some sense, to the failures causing it.
10. Replication Transparency: In a DS, replicas of an object are
generally created to improve performance and reliability.
A naming system should support the use of multiple copies of the
same object in a user-transparent manner
That is, if not necessary, a user should not be aware that multiple
copies of an object are in use
Desirable Features of A Good Naming System
11. Locating the Nearest Replica. When a naming system
supports the use of multiple copies of the same object, it is
important that the object-locating mechanism of the naming
system should always supply the location of the nearest replica of
the desired object
This is because the efficiency of the object accessing operation will
be affected if the object-locating mechanism does not take this point
into consideration
This is illustrated by the example given below, where the desired
object is replicated at nodes N1, N2, and N3 and the object-locating
mechanism is such that it maps to the replica at node N3 instead of
the nearest replica at node N1.
Obviously this is undesirable.
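Choosing the nearest replica can be sketched as a minimum over a distance table. The hop counts below are assumed values for illustration; a real system might measure latency or use topology information instead.

```python
def nearest_replica(replicas, distance_from_client):
    """Return the replica node with the smallest distance to the client."""
    return min(replicas, key=lambda node: distance_from_client[node])


replicas = ["N1", "N2", "N3"]
hops = {"N1": 1, "N2": 3, "N3": 4}  # assumed hop counts from the client's node
best = nearest_replica(replicas, hops)
print(best)
```

With these assumed distances the mechanism returns N1, whereas a mechanism that ignores distance could just as well return N3.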
Desirable Features of A Good Naming System
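11. Locating the Nearest Replica
[Figure: the desired object is replicated at nodes N1, N2, and N3; the object-locating mechanism maps to the replica at N3 instead of the nearest replica at N1]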
The End!
Questions, Ambiguities, Doubts, … ???