
CSE352: Parallel and Distributed Systems

Distributed File Systems (DFS)


Learning objectives

 Understand the requirements that affect the design of distributed services
 Understand how a relatively simple, widely-used service (NFS) is designed
 Obtain a knowledge of file systems, both local and networked
 Caching as an essential design technique
 Security requires special consideration
 Recent advances in Distributed File Systems
Introduction

 Why do we need a DFS?
 The primary purpose of a Distributed System is connecting users and resources.
 Resources…
  … can be inherently distributed
  … can actually be data (files, databases, …)
  … and their availability becomes a crucial issue for the performance of a Distributed System
Introduction

 A Case for DS

[Cartoon: several users store their documents on a single Server A: one wants to store a thesis, one a book that must always be available, one reports for the boss, and one analyses that must be kept safe.]
Introduction

 A Case for DS

[Cartoon: with Servers A, B and C the users can store many more documents, but now nobody remembers which server (A, B or C?) holds which files.]
Introduction

 A Case for DS

[Cartoon: a Distributed File System spanning Servers A, B and C: users can access their folders from anywhere, without having to remember which server the data is stored on.]
Storage Systems and Their Properties

 In the first generation of distributed systems (1974-95), file systems (e.g. NFS) were the only networked storage systems.
 With the advent of distributed object systems (CORBA, Java) and the web (1995-2007), the picture became more complex.
 The current focus (2007-now) is on large-scale, scalable storage:
  Google File System
  Amazon S3 (Simple Storage Service)
What is a File System?

 Persistent stored data sets
 Hierarchic name space visible to all processes
 API with the following characteristics:
  access and update operations on persistently stored data sets
  sequential access model (with additional random-access facilities)
 Sharing of data between users, with access control
 Concurrent access:
  certainly for read-only access
  what about updates?
 Other features:
  mountable file stores
  more? ...
What is a File System?

UNIX file system operations

filedes = open(name, mode)
    Opens an existing file with the given name.
filedes = creat(name, mode)
    Creates a new file with the given name.
    Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)
    Closes the open file filedes.
count = read(filedes, buffer, n)
    Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)
    Transfers n bytes to the file referenced by filedes from buffer.
    Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)
    Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)
    Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)
    Adds a new name (name2) for a file (name1).
status = stat(name, buffer)
    Gets the file attributes for file name into buffer.
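A minimal C sketch of how these calls compose; the file name and byte counts here are illustrative, not from the slides:

    #include <fcntl.h>     /* open, O_* flags */
    #include <unistd.h>    /* read, write, lseek, close, unlink */
    #include <sys/stat.h>  /* stat */
    #include <stdio.h>

    int main(void)
    {
        char buf[64];
        struct stat st;

        /* Create (or truncate) a file and write some bytes to it. */
        int fd = open("example.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        write(fd, "hello, file system\n", 19);

        /* Move the read-write pointer back to the start, then read the data back. */
        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, sizeof buf);
        printf("read %zd bytes\n", n);
        close(fd);

        /* Fetch the attribute record (length, timestamps, owner, ...). */
        if (stat("example.txt", &st) == 0)
            printf("length = %lld bytes\n", (long long)st.st_size);

        /* Remove the name; with no other links, the file is deleted. */
        unlink("example.txt");
        return 0;
    }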

What is a File System?

File attribute record structure


Updated by the system:
 File length
 Creation timestamp
 Read timestamp
 Write timestamp
 Attribute timestamp
 Reference count

Updated by the owner:
 Owner
 File type
 Access control list (e.g. for UNIX: rw-rw-r--)
File Service Architecture

 An architecture that offers a clear separation of the main concerns in providing access to files is obtained by structuring the file service as three components:
  A flat file service, residing in the file system (centralized or distributed): deals with the blocks of the file system in a flat way.
  A directory service: deals with files hierarchically, as folders.
  A client module: implements the interfaces exported by the flat file and directory services on the server side.
Model File Service Architecture
[Diagram: the client computer runs application programs over a client module; the server computer runs a directory service offering Lookup, AddName, UnName and GetNames, alongside a flat file service offering Read, Write, Create, Delete, GetAttributes and SetAttributes. The client module invokes both server-side services on behalf of the application programs.]
Responsibilities of Various Modules
 Flat file service
  Concerned with the implementation of operations on the contents of files. Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. UFIDs are long sequences of bits chosen so that each file has a unique ID among all of the files in a distributed system.
 Directory service
  Provides a mapping between text names for files and their UFIDs. Clients may obtain the UFID of a file by quoting its text name to the directory service. The directory service supports the functions needed to generate directories and to add new files to directories.
 Client module
  Runs on each computer and provides an integrated service (flat file and directory) as a single API to application programs. For example, in UNIX hosts, a client module emulates the full set of UNIX file operations.
  It holds information about the network locations of the flat file and directory server processes, and achieves better performance through a cache of recently used file blocks at the client.
Server operations/interfaces for the model
file service
Flat file service:
    Read(FileId, i, n) -> Data        (i = position of first byte)
    Write(FileId, i, Data)            (i = position of first byte)
    Create() -> FileId
    Delete(FileId)
    GetAttributes(FileId) -> Attr
    SetAttributes(FileId, Attr)

Directory service:
    Lookup(Dir, Name) -> FileId
    AddName(Dir, Name, File)
    UnName(Dir, Name)
    GetNames(Dir, Pattern) -> NameSeq

FileId: a unique identifier for files anywhere in the network.
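As a sketch only, the two interfaces could be transcribed into C roughly as below; the operation names come from the slide, while the concrete types (UFID width, attribute fields, error returns) are assumptions:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint8_t bits[16]; } FileId;  /* UFID: a long unique bit string */
    typedef FileId Dir;                           /* a directory is itself a file */
    typedef struct { long length; long modified; /* ... */ } Attr;

    /* Flat file service: operates on file contents via UFIDs. */
    int    Read (FileId f, long i, long n, void *data);       /* i = first byte */
    int    Write(FileId f, long i, long n, const void *data);
    FileId Create(void);
    int    Delete(FileId f);
    int    GetAttributes(FileId f, Attr *attr);
    int    SetAttributes(FileId f, const Attr *attr);

    /* Directory service: maps text names to UFIDs. */
    int    Lookup (Dir d, const char *name, FileId *out);
    int    AddName(Dir d, const char *name, FileId file);
    int    UnName (Dir d, const char *name);
    int    GetNames(Dir d, const char *pattern, char **names, size_t max);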

File Group

 A collection of files that can be located on any server, or moved between servers, while maintaining the same names.
  Similar to a UNIX filesystem
  Helps with distributing the load of file serving between several servers
  File groups have identifiers which are unique throughout the system (and hence, for an open system, they must be globally unique)
  Used to refer to file groups and files
 To construct a globally unique ID we use some unique attribute of the machine on which the group is created (e.g. its IP address), even though the file group may move subsequently.
 File group ID layout: 32 bits (IP address) + 16 bits (date)
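A sketch of constructing such an identifier, assuming the 32-bit field is an IPv4 address and packing the 48-bit result into a 64-bit integer:

    #include <stdint.h>

    /* Pack a 32-bit IPv4 address and a 16-bit date stamp into the 48-bit
     * file group identifier shown on the slide. The exact date encoding
     * is an assumption for illustration. */
    uint64_t make_file_group_id(uint32_t ipv4_addr, uint16_t date)
    {
        return ((uint64_t)ipv4_addr << 16) | date;
    }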
Case Study: Sun (Oracle) NFS

 An industry standard for file sharing on local networks since the 1980s
 An open standard with clear and simple interfaces
 Supports many of the design requirements already mentioned:
 transparency
 heterogeneity
 efficiency
 fault tolerance
 Limited achievement of:
 concurrency
 replication
 consistency
 security

NFS Architecture
[Diagram: an application program on the client computer issues UNIX system calls to the UNIX kernel. A virtual file system layer routes operations on local files to the local UNIX file system and operations on remote files to the NFS client module. The NFS client communicates with the NFS server on the server computer using the NFS protocol (remote operations); the server's virtual file system maps the requests onto its own UNIX file system.]
NFS Architecture

 Does the implementation have to be in the system kernel?
 No:
  there are examples of NFS clients and servers that run at application level, as libraries or processes (e.g. early Windows and MacOS implementations)
 But for a UNIX implementation there are advantages:
  Binary code compatibility: no need to recompile applications
  Standard system calls that access remote files can be routed through the NFS client module by the kernel
  Shared cache of recently used blocks at the client
  A kernel-level server can access i-nodes and file blocks directly
  Security of the encryption key used for authentication
NFS Server Operations

• read(fh, offset, count) -> attr, data
• write(fh, offset, count, data) -> attr
• create(dirfh, name, attr) -> newfh, attr
• remove(dirfh, name) -> status
• getattr(fh) -> attr
• setattr(fh, attr) -> attr
• lookup(dirfh, name) -> fh, attr
• rename(dirfh, name, todirfh, toname)
• link(newdirfh, newname, dirfh, name)
• readdir(dirfh, cookie, count) -> entries
• symlink(newdirfh, newname, string) -> status
• readlink(fh) -> string
• mkdir(dirfh, name, attr) -> newfh, attr
• rmdir(dirfh, name) -> status
• statfs(fh) -> fsstats

fh = file handle: filesystem identifier + i-node number + i-node generation number.

These operations implement the model flat file service (Read, Write, Create, Delete, GetAttributes, SetAttributes) and the model directory service (Lookup, AddName, UnName, GetNames) shown earlier.
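The three file-handle components could be sketched in C as below; note that in the real NFS protocol the handle is an opaque byte array interpreted only by the server, so the field widths here are assumptions:

    #include <stdint.h>

    /* The components of an NFS file handle as listed on the slide. */
    struct nfs_file_handle {
        uint32_t filesystem_id;     /* which exported filesystem */
        uint32_t inode_number;      /* which file within it */
        uint32_t inode_generation;  /* detects reuse of the i-node number */
    };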
NFS Access Control and Authentication

 Stateless server, so the user's identity and access rights must be checked by the server on each request.
 Every client request is accompanied by the userID and groupID.
 Kerberos has been integrated with NFS to provide a stronger and more comprehensive security solution.
Architecture Components 
(UNIX / Linux)
 Server:
  nfsd: NFS server daemon that services requests from clients.
  mountd: NFS mount daemon that carries out the mount request passed on by nfsd.
  rpcbind: RPC port mapper used to locate the nfsd daemon.
  /etc/exports: configuration file that defines which portions of the file systems are exported through NFS, and how (see the example below).
 Client:
  mount: standard file system mount command.
  /etc/fstab: file system table file.
  nfsiod: (optional) local asynchronous NFS I/O server.
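For illustration, a minimal /etc/exports might look like the following; the paths, host patterns and options are hypothetical examples, not taken from the slides:

    # /etc/exports: each line names a directory to export and who may mount it
    /export/people   *.cs.example.edu(rw,sync)
    /export/archive  192.168.1.0/24(ro)

After editing the file, running exportfs -ra on a Linux server re-reads it and re-exports the listed directories.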


Mount Service

 Mount operation:
  mount(remotehost, remotedirectory, localdirectory)
 The server maintains a table of the clients that have mounted filesystems at that server.
 Each client maintains a table of mounted file systems holding:
  <IP address, port number, file handle>
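For example, mounting Server 1's exported directory from the next slide at the client (the server's hostname is illustrative):

    mount -t nfs server1:/export/people /usr/students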

Local and Remote File Systems Accessible on an
NFS Client

[Diagram: Server 1 exports the directory /export/people (containing big, jon, bob); Server 2 exports /nfs/users (containing jim, ann, jane, joe). The client remote-mounts people at its /usr/students and users at its /usr/staff, so, for example, jim on Server 2 appears on the client as /usr/staff/jim.]
Automounter

 The NFS client catches attempts to access 'empty' mount points and routes them to the Automounter.
  The Automounter has a table of mount points, with multiple candidate servers for each.
  It sends a probe message to each candidate server and then uses the mount service to mount the filesystem at the first server to respond.
 Keeps the mount table small.
 Provides a simple form of replication for read-only filesystems.
  E.g. if there are several servers with identical copies of /usr/lib, then each server will have a chance of being mounted at some clients.
NFS Optimization - Server Caching

 Similar to UNIX file caching for local files:
  pages (blocks) from disk are held in a main memory buffer cache until the space is required for newer pages; read-ahead and delayed-write optimizations.
  For local files, writes are deferred to the next sync event (30-second intervals).
  This works well in the local context, where files are always accessed through the local cache, but in the remote case it doesn't offer the necessary synchronization guarantees to clients.
 NFS v3 servers offer two strategies for updating the disk (sketched below):
  write-through: altered pages are written to disk as soon as they are received at the server.
  delayed commit: pages are held only in the cache until a commit() call is received for the relevant file. This is the default mode used by NFS v3 clients. A commit() is issued by the client whenever a file is closed.
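A sketch of the two strategies in C, assuming hypothetical helpers cache_store and flush_to_disk; the real NFS v3 protocol additionally lets each write request carry its own stability flag:

    /* Hypothetical helpers (assumed, not part of NFS): */
    void cache_store(int fh, long offset, long count, const void *data);
    void flush_to_disk(int fh);

    enum update_mode { WRITE_THROUGH, DELAYED_COMMIT };

    /* Handle a client write under one of the two strategies. */
    void handle_write(int fh, long offset, long count, const void *data,
                      enum update_mode mode)
    {
        cache_store(fh, offset, count, data);  /* buffer the pages in memory */
        if (mode == WRITE_THROUGH)
            flush_to_disk(fh);                 /* persist before replying */
    }

    /* Issued by the client when it closes the file (delayed commit mode). */
    void handle_commit(int fh)
    {
        flush_to_disk(fh);                     /* persist all cached pages now */
    }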

NFS Optimization - Client Caching

 Server caching does nothing to reduce the traffic between client and server:
  further optimization is essential to reduce server load in large networks
  the NFS client module caches the results of read, write, getattr, lookup and readdir operations
  synchronization of file contents (one-copy semantics) is not guaranteed when two or more clients are sharing the same file
 Timestamp-based validity check:
  reduces inconsistency, but doesn't eliminate it
  validity condition for a cache entry at the client:
   (T - Tc < t) ∨ (Tm_client = Tm_server)
  where T is the current time, Tc the time when the cache entry was last validated, Tm the time when the block was last updated (as recorded at the client and at the server), and t the freshness interval
  t is configurable (per file) but is typically set to 3 seconds for files and 30 seconds for directories
  it remains difficult to write distributed applications that share files with NFS
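The validity check can be sketched in C as follows; obtaining Tm_server requires a getattr call to the server, which is passed in here as a parameter for simplicity:

    #include <stdbool.h>
    #include <time.h>

    struct cache_entry {
        time_t Tc;         /* when this entry was last validated */
        time_t Tm_client;  /* server's last-modification time, as cached */
        double t;          /* freshness interval: ~3 s files, ~30 s dirs */
    };

    bool entry_is_valid(struct cache_entry *e, time_t Tm_server)
    {
        time_t T = time(NULL);                 /* current time */
        if (difftime(T, e->Tc) < e->t)         /* (T - Tc < t): still fresh, */
            return true;                       /* no server round trip needed */
        if (e->Tm_client == Tm_server) {       /* (Tm_client = Tm_server) */
            e->Tc = T;                         /* revalidated: reset Tc */
            return true;
        }
        return false;                          /* stale: refetch the block */
    }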
Summary

 Distributed File Systems provide the illusion of a local file system and hide complexity from end users.
 Sun NFS is an excellent example of a distributed service designed to meet many important design requirements.
 Effective client caching can produce file service performance equal to or better than that of local file systems.
 Consistency versus update semantics versus fault tolerance remains an issue.
 Most client and server failures can be masked.
