Naming Systems in Distributed Systems

1) Naming plays an important role in computer systems by uniquely identifying entities and allowing access to shared resources. 2) In distributed systems, the implementation of a naming system is distributed across multiple machines and how this distribution is done impacts efficiency and scalability. 3) A naming system maps names to addresses, allowing processes to access named entities through name resolution. It provides an abstraction that hides the details of an object's location.

Uploaded by

Addisu

Arba Minch University

Institute of Technology
Faculty of Computing & Software Engineering

Intro to Distributed System
(CoSec4038)
Mr. Addisu M. (Asst. Prof) G3 SE - Regular

Chapter 5

Naming System in DS

Introduction
• Names play an important role in all computer systems.
  • They are used to share resources, uniquely identify entities, refer to locations, and convey further information about the given entities.
• An important issue with naming is that a name can be resolved to the entity it refers to.
• Name resolution thus allows a process to access the named entity.
• To resolve names, it is necessary to implement a naming system.
• In a distributed system (DS), the implementation of a naming system is itself often distributed across multiple machines, unlike in a non-distributed system.
• How this distribution is done plays a key role in the efficiency and scalability of the naming system.

Some Terminology
• Name: a string of bits or characters used to identify or refer to entities, locations, resources, and more in a given context.
  • By simply comparing two names, we might not be able to tell whether they refer to the same entity.
  • A name can be human-friendly (or not) and location-dependent (or not).
  • File names, www.amu.edu.et, variable names, etc. are human-friendly names given to entities.
• Textual names (human-readable) are used to identify:
  • individual services or people
    • email address: Hans.Mair@inf.unibz.it
    • URL: www.google.com
  • groups of people or objects
    • mailing lists: professors@unibz.it
    • mail domains (if there are several mail exchangers)

Some Terminology
• Name: a string of bits or characters used to identify or refer to entities, locations, resources, and more in a given context.
• Numeric addresses (identify the location of an object):
  • locate individual resources, e.g. 193.206.186.100 (IP host address)
  • special case: group addresses, e.g. multicast and broadcast addresses (IP multicast, Ethernet)
• Question: how do we map (resolve) these names to addresses so that we can access the entities on which we want to operate?
• Solution: have a naming system that maintains name-to-address bindings.
  • Purpose: to identify an entity and act as an access point to it.

Some Terminology
• An entity is anything that has a particular task.
  • E.g., resources such as hosts, printers, disks, files, processes, users, mailboxes, web pages, messages, network connections, etc.
• Entities can be operated on. For example:
  • a resource such as a printer offers an interface containing operations for printing a document, requesting the status of a print job, and the like
  • an entity such as a network connection may provide operations for sending and receiving data, setting quality-of-service parameters, requesting the status, and so forth
• To operate on an entity, it is necessary to access it through its access point.
  • Example: to use a printer, you first have to access it, e.g. through the cable that connects it to a PC.

Some Terminology
• Access point: a special kind of entity through which another entity is accessed. The name of an access point is called an address (e.g., IP address, port number, phone number).
  • The address of an access point of an entity is also simply called an address of that entity.
  • An entity may have more than one access point and address (similar to reaching an individual through different telephone numbers).
  • An entity may change its access points and addresses over time.
    • Example: telephone numbers and e-mail addresses change; a computer that moves from one location to another gets a different IP address.
• Using an address as a reference is inflexible and human-unfriendly.
• A better approach is to use a name that is location independent, which is much easier and more flexible to use.

Some Terminology
• Name vs Address
• Name: an identifier permanently associated with an object, independent of its location within the DS.
  • A name is how an endpoint is referenced.
  • Typically, it has no structurally significant hierarchy.
  • Examples: “Desta”, “Hawasa”, “Banana”
• Address: an identifier associated with the current location of the object.
  • An address is how you get to an endpoint.
  • Typically, it is hierarchical (for scaling).
  • Examples: Arba Minch University, Arba Minch City, Ethiopia; 10.144.5.30; +251-928 364289; www.abebe.kebede@amu.edu.et

Some Terminology
• Example: a telephone as an access point to a person
  • The telephone is the access point and the telephone number is the address of the person. A person (entity) can have several telephone numbers (addresses).
• In a DS, a host running a specific server is an access point, with its address formed by the combination of an IP address and a port number.
• In general, we do not want to use an address as a regular name, because it changes. Instead, we want a location-independent name.

Some Terminology
• An address is thus just a special kind of name: it refers to an access point of an entity.
  • Because an access point is tightly associated with an entity, it would seem convenient to use the address of an access point as a regular name for the associated entity.
• Nevertheless, this is hardly ever done, as such naming is generally very inflexible and often human-unfriendly.
  • For example, it is not uncommon to regularly reorganize a distributed system, so that a specific server ends up running on a different host than before.
• An entity may easily change an access point, or an access point may be reassigned to a different entity.
• If an address is used to refer to an entity, we will have an invalid reference the instant the access point changes or is reassigned to another entity.

Some Terminology
• Therefore, it is much better to let a service be known by a separate name that is independent of the address of the associated server.
• Likewise, if an entity offers more than one access point, it is not clear which address to use as a reference.
  • For instance, many organizations distribute their Web service across several servers.
  • If we used the addresses of those servers as a reference for the Web service, it would not be obvious which address should be chosen as the best one.
• A much better solution is to have a single name for the Web service that is independent of the addresses of the different Web servers.
• A name for an entity that is independent of its addresses is often much easier and more flexible to use.

Some Terminology
• Identifier: a special name that uniquely identifies an entity.
  • It refers to only one entity.
• A true identifier has the following three properties:
  • P1: Each identifier refers to at most one entity
  • P2: Each entity is referred to by at most one identifier
  • P3: An identifier always refers to the same entity (no reuse)
• Address: the name of an access point, i.e. the location of an entity.
  • E.g., a phone number, or an IP address plus port for a service
• Addresses and identifiers are important and used for different purposes, but both are often represented in machine-readable format (MAC address, memory address).

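The three identifier properties can be made concrete with a small sketch. The `IdentifierRegistry` class below is a hypothetical helper written for illustration only; it is not part of any particular naming system. It assumes identifiers are drawn from a counter that only ever increases, which is one simple way to avoid reuse.

```python
class IdentifierRegistry:
    """Hypothetical registry that hands out identifiers satisfying P1-P3."""

    def __init__(self):
        self._next_id = 0        # only ever increases, so ids are never reused (P3)
        self._id_to_entity = {}  # P1: each identifier refers to at most one entity
        self._entity_to_id = {}  # P2: each entity has at most one identifier

    def assign(self, entity):
        # An entity that already has an identifier keeps it (P2).
        if entity in self._entity_to_id:
            return self._entity_to_id[entity]
        identifier = self._next_id
        self._next_id += 1
        self._id_to_entity[identifier] = entity
        self._entity_to_id[entity] = identifier
        return identifier

    def retire(self, entity):
        # The retired identifier is dropped but never handed out again (P3).
        identifier = self._entity_to_id.pop(entity)
        del self._id_to_entity[identifier]


registry = IdentifierRegistry()
printer_id = registry.assign("printer")
same_id = registry.assign("printer")     # asking again yields the same identifier
registry.retire("printer")
scanner_id = registry.assign("scanner")  # a fresh identifier, not the retired one
print(printer_id, same_id, scanner_id)
```

An address would fail this test: the same IP address and port can be reassigned to a different entity later, which violates P3.
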
Naming System
• An important issue with naming is that a name can be resolved to the entity it refers to. Name resolution thus allows a process to access the named entity.
• The naming facility of a distributed OS enables users and programs to assign character-string names to objects and subsequently use these names to refer to those objects.
• The locating facility, which is an integral part of the naming facility, maps an object’s name to the object’s location.
• The naming and locating facilities jointly form a naming system that provides users with an abstraction of an object that hides the details of how and where the object is actually located in the network.

Naming System
• A naming system is a middleware that assists in name resolution.
  • It is one of the most important components of a distributed OS because it enables objects to be identified and accessed in a uniform manner.
  • In principle, a naming system maintains a name-to-address binding used to resolve a name (or identifier) to an address. In its simplest form this is just a table of (name, address) pairs.
• Naming systems are classified into three classes based on the type of names used:
  • Flat/Unstructured naming
  • Structured naming
  • Attribute-based naming

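The "table of (name, address) pairs" can be sketched in a few lines. This is a minimal illustration, not a real naming service: the `NameTable` class, the service name `print-service`, the port 631, and the second IP address are all assumptions made up for the example.

```python
class NameTable:
    """Simplest possible naming system: a table of (name, address) pairs."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, address):
        # Create or update the name-to-address binding.
        self._bindings[name] = address

    def resolve(self, name):
        # Name resolution: map the name to its current address.
        return self._bindings[name]


table = NameTable()
table.bind("print-service", ("10.144.5.30", 631))
before = table.resolve("print-service")

# The server moves to another host: only the binding changes, not the name.
table.bind("print-service", ("10.144.5.31", 631))
after = table.resolve("print-service")
print(before, after)
```

Clients that hold only the name keep working after the move, which is the reason for preferring names over addresses as references.
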
Type of Naming System
Flat Naming
• Identifiers are convenient for uniquely representing entities.
• In many cases, identifiers are simply random bit strings (which we conveniently refer to as unstructured or flat names).
• Property of a flat name: it does not contain any information whatsoever on how to locate the access point of its associated entity.
  • Good for machines
  • Example: an Internet address at the network layer
• How do we resolve flat names or, equivalently, how can we locate an entity when given only its identifier?
  • Simple solutions (broadcasting and forwarding pointers)
  • Home-based solutions
  • Distributed hash table (reading)
  • Hierarchical approach

Type of Naming System
Flat Naming - Broadcasting
• Consider a DS built on a computer network that offers efficient broadcasting facilities.
  • Such facilities are offered by local-area networks in which all machines are connected to a single cable.
• Approach:
  • Broadcast the identifier of the entity to the whole network; the entity associated with it responds with its current address.
  • This requires all processes to listen to incoming location requests.
• Locating an entity is simple:
  • A message containing the identifier of the entity is broadcast to each machine, and each machine is requested to check whether it has that entity.
  • Only the machines that can offer an access point for the entity send a reply message containing the address of that access point.

Type of Naming System
Flat Naming - Broadcasting
• Broadcasting becomes inefficient when the network grows:
  • network bandwidth is wasted by request messages
  • too many hosts may be interrupted by requests they cannot answer
• Example: Address Resolution Protocol (ARP)
  • Resolves an IP address to a MAC address
  • Identifies the MAC address by broadcasting the IP address to all nodes
  • The node to which that IP address belongs replies with its MAC address
  • [Figure: request "Who has the address 192.168.0.1?"; reply "I am 192.168.0.1. My identifier is 02:AB:4A:3C:59:85"]
• Challenges:
  • Not scalable in large networks
  • Requires all entities to listen to all requests
• Solution: switch to multicasting, by which only a restricted group of hosts receives the request

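The broadcast lookup and its cost can be simulated in a few lines. This is only an in-memory sketch of the idea, not real ARP: no packets are sent, and the `Host` class, the `broadcast_resolve` function, and the second and third hosts are assumptions added for illustration. The first host uses the IP and MAC address from the ARP example above.

```python
class Host:
    """A machine that listens for location requests (hypothetical model)."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac

    def on_request(self, wanted_ip):
        # Only the host that owns the address replies; all others stay silent.
        return self.mac if wanted_ip == self.ip else None


def broadcast_resolve(hosts, wanted_ip):
    """Deliver the request to every host; return the replies and the message count."""
    replies = []
    messages = 0
    for host in hosts:
        messages += 1  # every host is interrupted, even if it cannot answer
        reply = host.on_request(wanted_ip)
        if reply is not None:
            replies.append(reply)
    return replies, messages


hosts = [
    Host("192.168.0.1", "02:AB:4A:3C:59:85"),
    Host("192.168.0.2", "02:AB:4A:3C:59:86"),  # made-up host
    Host("192.168.0.3", "02:AB:4A:3C:59:87"),  # made-up host
]
replies, messages = broadcast_resolve(hosts, "192.168.0.1")
print(replies, messages)
```

The message count grows with the number of hosts even though only one of them can answer, which is the scalability problem listed under Challenges.
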


Type of Naming System
Flat Naming - Forwarding Pointers
• Another popular approach to locating mobile entities is to make use of forwarding pointers. The principle is simple: when an entity moves from A to B, it leaves behind in A a reference to its new location at B.
• The main advantage of this approach is its simplicity: as soon as an entity has been located, for example by using a traditional naming service, a client can look up the current address by following the chain of forwarding pointers.
• Example: postal service
  • One person moves from Arba Minch → Hawassa → Addis Ababa → Adama → ...

Type of Naming System
Flat Naming - Forwarding Pointers
• Each forwarding pointer is implemented as a pair: (client stub, server stub)
  • A server stub contains a local reference to the actual object or a local reference to another client stub.
• When an object moves from A (e.g., P2) to B (e.g., P3):
  • It leaves a client stub at A (i.e., P2)
  • It installs a server stub at B (i.e., P3)
• [Figure: forwarding pointers across processes P1 to P4, showing the remote object, the caller object, server stubs, and client stubs]

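Following a chain of forwarding pointers can be sketched as below. This is a simplified model under stated assumptions: a plain dictionary stands in for the (client stub, server stub) pairs, and the `move` and `locate` functions are hypothetical names chosen for this example. The locations come from the postal-service example.

```python
def move(forwarding, old, new):
    """Record a move: the old location keeps a pointer to the new one."""
    forwarding[old] = new
    # The entity now lives at `new`, so no pointer may lead away from there.
    forwarding.pop(new, None)


def locate(forwarding, start):
    """Follow forwarding pointers from `start`; return (location, hops)."""
    location = start
    hops = 0
    visited = {start}
    while location in forwarding:
        location = forwarding[location]
        hops += 1
        if location in visited:
            raise RuntimeError("cycle in forwarding pointers")
        visited.add(location)
    return location, hops


forwarding = {}
move(forwarding, "Arba Minch", "Hawassa")
move(forwarding, "Hawassa", "Addis Ababa")
move(forwarding, "Addis Ababa", "Adama")

current, hops = locate(forwarding, "Arba Minch")
print(current, hops)
```

Every move adds one hop for clients that start from the original location, so lookups get slower as the chain grows.
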
Type of Naming System
Flat Naming - Home-Based Approach
• The use of broadcasting and forwarding pointers imposes scalability problems:
  • broadcasting or multicasting is difficult to implement efficiently in large-scale networks
  • long chains of forwarding pointers introduce performance problems and are susceptible to broken links
• A popular approach to supporting mobile entities in large-scale networks is to introduce a home location, which keeps track of the current location of an entity.
  • An entity’s home address is registered at a naming service.
  • The home registers the foreign address of the entity.
  • Clients always contact the home first, and then continue with the foreign location.

Type of Naming System
Flat Naming - Home-Based Approach
[Figure: home-based approach with a mobile entity, its home node, and a client]
1. The mobile entity updates its home node about its foreign address.
2. The client sends the packet to the mobile entity at its home node.
3a. The home node forwards the message to the foreign address of the mobile entity.
3b. The home node replies to the client with the current IP address of the mobile entity.
4. The client sends all subsequent packets directly to the foreign address of the mobile entity.

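The four steps can be traced with a small in-memory sketch. It is an illustration under stated assumptions, not an implementation of any mobile-networking protocol: the `HomeNode` and `Client` classes, the entity name `laptop`, and the address string are hypothetical, and the client's cache is never invalidated, so a later move of the entity is not handled.

```python
class HomeNode:
    """Keeps track of the current (foreign) address of each mobile entity."""

    def __init__(self):
        self._foreign = {}

    def register(self, entity, foreign_address):
        # Step 1: the mobile entity updates its home node.
        self._foreign[entity] = foreign_address

    def lookup(self, entity):
        # Step 3b: the home node tells the client the current address.
        return self._foreign[entity]


class Client:
    def __init__(self, home):
        self._home = home
        self._known = {}  # addresses learned from the home node

    def send(self, entity, packet):
        """Return (destination, route) for the packet."""
        if entity not in self._known:
            # Step 2: the first packet goes via the home node.
            self._known[entity] = self._home.lookup(entity)
            return self._known[entity], "via home"
        # Step 4: later packets go directly to the foreign address.
        return self._known[entity], "direct"


home = HomeNode()
home.register("laptop", "foreign-net:5000")
client = Client(home)
first = client.send("laptop", "packet 1")
second = client.send("laptop", "packet 2")
print(first, second)
```
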
Type of Naming System
Structured Naming
• Flat names are good for machines, but are generally not very convenient for humans to use (they are difficult to remember).
• Structured names are composed of simple, human-readable names.
  • Names are arranged in a specific structure.
• Examples: file system naming and host naming on the Internet follow this approach.
  • File systems use structured names to identify files
    • /home/userid/work/dist-systems/naming.txt
  • Websites can be accessed through structured names
    • www.diet.engg.cse.ce
• Names in a DS are organized into name spaces.

Type of Naming System
Structured Naming - Name Spaces
• A name space for structured names can be represented as a labeled, directed graph with two types of nodes.
• Leaf node: represents a named entity and has no outgoing edges.
  • It stores information on the entity it represents, e.g. the address of an entity (as in DNS) so that a client can access it, or the state of (or the path to) an entity (as in file systems).
• Directory node: an entity that refers to other nodes.
  • It stores a table in which an outgoing edge is represented as a pair (edge label, node identifier). Such a table is called a directory table.
• Each path in a naming graph can be referred to by the sequence of labels (path name) corresponding to the edges in that path, such as N:<label-1, label-2, ..., label-n>
  • where N refers to the first node in the path

Type of Naming System
Structured Naming - Name Spaces (Example)
• A path name (the sequence of edge labels leading from one node to another) is used to refer to a node in the graph.
  • Absolute path: the name starts from the root node (e.g., /home/steen/mbox)
  • Relative path: the name does not start at the root node (e.g., steen/mbox)
• [Figure: naming graph with a root node; looking up the entity with name “/home/steen/mbox”]

Type of Naming System
Structured Naming - Name Resolution
• Name spaces offer a convenient mechanism for storing and retrieving information about entities by means of names.
• More generally, given a path name, it should be possible to look up any information stored in the node referred to by that name.
• The process of looking up a name is called name resolution.
• Knowing how and where to start name resolution is generally referred to as a closure mechanism.
• Closure mechanism:
  • Name resolution cannot be accomplished without an initial directory node.
  • The closure mechanism selects the implicit context from which to start name resolution.

Type of Naming System
Structured Naming - Name Resolution
• To see how name resolution works, consider a path name such as N:<label1, label2, ..., labeln>
• Resolution of this name:
  • starts at node N of the naming graph, where the name label1 is looked up in the directory table, which returns the identifier of the node to which label1 refers
  • then continues at the identified node by looking up the name label2 in its directory table, and so on
  • Assuming that the named path actually exists, resolution stops at the last node, referred to by labeln, by returning the content of that node.
• A name lookup returns the identifier of a node from where the name resolution process continues.
  • To continue, it is necessary to access the directory table of the identified node.

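The resolution steps described above can be sketched on a tiny naming graph. This is an illustrative model only: the `Node` class and the `resolve` and `resolve_path` functions are hypothetical names, the leaf content is a made-up string, and the closure mechanism is reduced to one rule (a leading "/" selects the root, anything else selects the given working node).

```python
class Node:
    """A node in the naming graph; directory nodes have a non-empty table."""

    def __init__(self, content=None):
        self.content = content  # information stored in a leaf node
        self.table = {}         # directory table: edge label -> node


def resolve(start, labels):
    """Resolve N:<label1, ..., labeln>, starting at node N = start."""
    node = start
    for label in labels:
        # Look the label up in the directory table of the current node.
        node = node.table[label]
    return node.content


def resolve_path(root, working_node, path):
    """Closure mechanism: choose the start node, then resolve the labels."""
    start = root if path.startswith("/") else working_node
    labels = [label for label in path.split("/") if label]
    return resolve(start, labels)


# Build the graph for /home/steen/mbox.
root, home, steen = Node(), Node(), Node()
mbox = Node(content="mailbox of steen")
root.table["home"] = home
home.table["steen"] = steen
steen.table["mbox"] = mbox

absolute = resolve_path(root, home, "/home/steen/mbox")
relative = resolve_path(root, home, "steen/mbox")
print(absolute, relative)
```

Both calls end at the same leaf node; they differ only in the node where resolution starts.
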
Type of Naming System
Attribute-Based Naming
• Flat and structured names generally provide a unique and location-independent way of referring to entities.
• Moreover, structured names have been partly designed to provide a human-friendly way to name entities so that they can be conveniently accessed.
• In most cases, it is assumed that the name refers to only a single entity.
• However, location independence and human friendliness are not the only criteria for naming entities.
• As more information becomes available, it becomes important to search for entities effectively.
• Users must be able to search for an entity by merely providing a description of it.

Type of Naming System
Attribute-Based Naming
• There are many ways in which descriptions can be provided, but a popular one in DSs is to describe an entity in terms of (attribute, value) pairs.
• In this approach, an entity is assumed to have an associated collection of attributes.
• Each attribute says something about that entity.
• By specifying which values a specific attribute should have, a user essentially constrains the set of entities that he or she is interested in.
• Attribute-based naming systems:
  • Directory services
  • Hierarchical implementations: LDAP

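Searching by (attribute, value) pairs can be sketched as a filter over a list of entity descriptions. This is a toy in-memory directory, not LDAP: the `search` function and all entries and attribute names are assumptions made up for the example.

```python
def search(directory, constraints):
    """Return the names of all entities whose attributes match every constraint."""
    return [
        entry["name"]
        for entry in directory
        if all(entry.get(attr) == value for attr, value in constraints.items())
    ]


# Each entity is described by a collection of (attribute, value) pairs.
directory = [
    {"name": "printer-1", "type": "printer", "color": True, "building": "A"},
    {"name": "printer-2", "type": "printer", "color": False, "building": "A"},
    {"name": "scanner-1", "type": "scanner", "color": True, "building": "B"},
]

printers = search(directory, {"type": "printer"})
color_printers = search(directory, {"type": "printer", "color": True})
print(printers, color_printers)
```

Each additional constraint narrows the result set. Unlike flat or structured names, a description may match several entities or none.
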
Desirable Features of A Good Naming System
1. Location Transparency: the name of an object should not reveal any hint as to the physical location of the object.
  • That is, an object's name should be independent of the physical connectivity or topology of the system, or the current location of the object.
2. Location Independency: for performance, reliability, availability, and security reasons, DSs provide the facility of object migration, which allows objects to be moved and relocated dynamically among the various nodes of a system.
  • Location independency means that the name of an object need not be changed when the object’s location changes.

Desirable Features of A Good Naming System
2. Location Independency:
  • Therefore, the requirement of location independency calls for a global naming facility with the following two features:
    • An object at any node can be accessed without knowledge of its physical location (location independency of request-receiving objects).
    • An object at any node can issue an access request without knowledge of its own physical location (location independency of request-issuing objects).
  • This property is known as user mobility.

Desirable Features of A Good Naming System
3. Multiple User-Defined Names for the Same Object: for a shared object, it is desirable that different users of the object can use their own convenient names for accessing it.
  • Therefore, a naming system must provide the flexibility to assign multiple user-defined names to the same object.
  • In this case, it should be possible for a user to change or delete his or her name for the object without affecting those of other users.
4. Scalability: DSs vary in size, ranging from a few nodes to many nodes.
  • Moreover, DSs are normally open systems, and their size changes dynamically.
  • Therefore, it is impossible to know a priori how large the set of names to be dealt with is liable to get.
  • Hence, a naming system must be capable of adapting to the dynamically changing scale of a DS, which normally leads to a change in the size of the name space.

Desirable Features of A Good Naming System
5. Uniform Naming Convention: in many existing systems, different ways of naming objects, called naming conventions, are used for naming different types of objects.
  • For example, file names typically differ from user names and process names. Instead of using such nonuniform naming conventions, a good naming system should use the same naming convention for all types of objects in the system.
6. Group Naming: a naming system should allow many different objects to be identified by the same name.
  • Such a facility is useful to support broadcasting or to group objects for conferencing or other applications.

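Group naming can be sketched as a table that maps one name to a set of members. The `GroupNames` class and the member names below are hypothetical; only the group name `professors@unibz.it` comes from the mailing-list example earlier in the chapter.

```python
from collections import defaultdict


class GroupNames:
    """Maps one group name to many member objects (illustrative only)."""

    def __init__(self):
        self._members = defaultdict(set)

    def join(self, group, member):
        self._members[group].add(member)

    def resolve(self, group):
        # One name resolves to every member of the group.
        return sorted(self._members.get(group, set()))

    def send(self, group, message):
        # Deliver the same message to each member of the group.
        return {member: message for member in self.resolve(group)}


groups = GroupNames()
groups.join("professors@unibz.it", "prof-a")  # made-up member
groups.join("professors@unibz.it", "prof-b")  # made-up member

members = groups.resolve("professors@unibz.it")
delivered = groups.send("professors@unibz.it", "meeting at 10:00")
print(members, delivered)
```
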
Desirable Features of A Good Naming System
7. Meaningful Names: a name can be simply any character string identifying some object.
  • However, for users, meaningful names are preferred to lower-level identifiers such as memory pointers, disk block numbers, or network addresses.
  • This is because meaningful names typically indicate something about the contents or function of their referents, are easily transmitted between users, and are easy to remember and use.
  • Therefore, a good naming system should support at least two levels of object identifiers, one convenient for human users and one convenient for machines.

Desirable Features of A Good Naming System
8. Performance: the most important performance measurement of a naming system is the amount of time needed to map an object's name to its attributes, such as its location.
  • In a distributed environment, this performance is dominated by the number of messages exchanged during the name-mapping operation.
  • Therefore, a naming system should be efficient in the sense that the number of messages exchanged in a name-mapping operation should be as small as possible.

Desirable Features of A Good Naming System
9. Fault Tolerance: a naming system should be capable of tolerating, to some extent, faults that occur due to the failure of a node or a communication link in a DS network.
  • That is, the naming system should continue functioning, perhaps in a degraded form, in the event of these failures.
  • The degradation can be in performance, functionality, or both, but should be proportional, in some sense, to the failures causing it.
10. Replication Transparency: in a DS, replicas of an object are generally created to improve performance and reliability.
  • A naming system should support the use of multiple copies of the same object in a user-transparent manner.
  • That is, unless necessary, a user should not be aware that multiple copies of an object are in use.

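Degraded but continued operation under failures (feature 9) can be illustrated with a failover lookup across replicated name servers. This is a sketch under stated assumptions: the `NameServer` class, the `resolve_with_failover` function, and the binding of `www.amu.edu.et` to `10.144.5.30` are made up for the example, and a failed server is modeled simply by raising `ConnectionError`.

```python
class NameServer:
    """A name server replica that may be down (hypothetical model)."""

    def __init__(self, bindings, up=True):
        self.bindings = bindings
        self.up = up

    def resolve(self, name):
        if not self.up:
            raise ConnectionError("name server unreachable")
        return self.bindings[name]


def resolve_with_failover(servers, name):
    """Try each replica in turn; return (address, number of attempts)."""
    for attempts, server in enumerate(servers, start=1):
        try:
            return server.resolve(name), attempts
        except ConnectionError:
            continue  # this replica failed, try the next one
    raise ConnectionError("no name server reachable")


bindings = {"www.amu.edu.et": "10.144.5.30"}  # made-up binding
servers = [NameServer(bindings, up=False), NameServer(bindings, up=True)]

address, attempts = resolve_with_failover(servers, "www.amu.edu.et")
print(address, attempts)
```

The lookup still succeeds with one server down, at the cost of one extra attempt: a degradation in performance that is proportional to the number of failures.
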
Desirable Features of A Good Naming System
10. Replication Transparency:
[Figure: replication transparency]

Desirable Features of A Good Naming System
11. Locating the Nearest Replica: when a naming system supports the use of multiple copies of the same object, it is important that the object-locating mechanism of the naming system always supplies the location of the nearest replica of the desired object.
  • This is because the efficiency of the object-accessing operation suffers if the object-locating mechanism does not take this point into consideration.
  • This is illustrated by the example given below, where the desired object is replicated at nodes N1, N2, and N3, and the object-locating mechanism maps to the replica at node N3 instead of the nearest replica at node N1.
  • Obviously, this is undesirable.

Desirable Features of A Good Naming System
11. Locating the Nearest Replica
[Figure: an object replicated at nodes N1, N2, and N3, with the nearest replica at node N1]

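Selecting the nearest replica can be sketched as choosing the minimum over some distance metric. The `nearest_replica` function and the distance values below are assumptions for illustration; a real system would have to measure or estimate distance, for example in hops or latency.

```python
def nearest_replica(distances):
    """Return the replica node with the smallest distance to the client."""
    return min(distances, key=distances.get)


# Made-up distances from the client to each replica node.
distances = {"N1": 1, "N2": 3, "N3": 5}

chosen = nearest_replica(distances)
print(chosen)
```

With these values the mechanism picks N1. A mechanism that returned N3 would still be correct, but every access would pay the longer distance.
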
The End!
Questions, Ambiguities, Doubts, ... ???
