
CSE352: Parallel and Distributed Systems

Distributed File Systems (DFS)


Learning objectives

 Understand the requirements that affect the design of distributed services
 Understand how a relatively simple, widely-used service (NFS) is designed
 Obtain a knowledge of file systems, both local and networked
 Caching as an essential design technique
 Security requires special consideration
 Recent advances in Distributed File Systems
Introduction

 Why do we need a DFS?
 The primary purpose of a Distributed System is connecting users and resources.
 Resources…
  … can be inherently distributed
  … can actually be data (files, databases, …)
  … and their availability becomes a crucial issue for the performance of a Distributed System
Introduction

 A Case for DS

[Cartoon: several users store their documents on a single Server A: one wants to store a thesis, one a book that must always be available, one reports for the boss, and one analyses that must be kept safe.]
Introduction

 A Case for DS

[Cartoon: with Servers A, B and C the users can store many more documents, but now nobody remembers which server (A, B or C?) holds which files.]
Introduction

 A Case for DS

[Cartoon: a Distributed File System spanning Servers A, B and C: users can access their folders from anywhere, without having to remember which server the data is stored on.]
Storage Systems and Their Properties

 In the first generation of distributed systems (1974-95), file systems (e.g. NFS) were the only networked storage systems.
 With the advent of distributed object systems (CORBA, Java) and the web (1995-2007), the picture became more complex.
 The current focus (2007-now) is on large-scale, scalable storage:
  Google File System
  Amazon S3 (Simple Storage Service)
What is a File System?

 Persistent stored data sets
 Hierarchic name space visible to all processes
 API with the following characteristics:
  access and update operations on persistently stored data sets
  sequential access model (with additional random-access facilities)
 Sharing of data between users, with access control
 Concurrent access:
  certainly for read-only access
  what about updates?
 Other features:
  mountable file stores
  more? ...
What is a File System?

UNIX file system operations

filedes = open(name, mode)
    Opens an existing file with the given name.
filedes = creat(name, mode)
    Creates a new file with the given name.
    Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)
    Closes the open file filedes.
count = read(filedes, buffer, n)
    Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)
    Transfers n bytes to the file referenced by filedes from buffer.
    Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)
    Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)
    Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)
    Adds a new name (name2) for a file (name1).
status = stat(name, buffer)
    Gets the file attributes for file name into buffer.
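A minimal C sketch of how these calls compose; the file name and byte counts here are illustrative, not from the slides:

    #include <fcntl.h>     /* open, O_* flags */
    #include <unistd.h>    /* read, write, lseek, close, unlink */
    #include <sys/stat.h>  /* stat */
    #include <stdio.h>

    int main(void)
    {
        char buf[64];
        struct stat st;

        /* Create (or truncate) a file and write some bytes to it. */
        int fd = open("example.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        write(fd, "hello, file system\n", 19);

        /* Move the read-write pointer back to the start, then read the data back. */
        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, sizeof buf);
        printf("read %zd bytes\n", n);
        close(fd);

        /* Fetch the attribute record (length, timestamps, owner, ...). */
        if (stat("example.txt", &st) == 0)
            printf("length = %lld bytes\n", (long long)st.st_size);

        /* Remove the name; with no other links, the file is deleted. */
        unlink("example.txt");
        return 0;
    }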

What is a File System?

File attribute record structure


Updated by the system:
 File length
 Creation timestamp
 Read timestamp
 Write timestamp
 Attribute timestamp
 Reference count

Updated by the owner:
 Owner
 File type
 Access control list (e.g. for UNIX: rw-rw-r--)
File Service Architecture

 An architecture that offers a clear separation of the main concerns in providing access to files is obtained by structuring the file service as three components:
  A flat file service, residing in the file system (centralized or distributed): deals with the blocks of the file system in a flat way.
  A directory service: deals with files hierarchically, as folders.
  A client module: implements the interfaces exported by the flat file and directory services on the server side.
Model File Service Architecture
[Diagram: the client computer runs application programs over a client module; the server computer runs a directory service offering Lookup, AddName, UnName and GetNames, alongside a flat file service offering Read, Write, Create, Delete, GetAttributes and SetAttributes. The client module invokes both server-side services on behalf of the application programs.]
Responsibilities of Various Modules
 Flat file service
  Concerned with the implementation of operations on the contents of files. Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. UFIDs are long sequences of bits chosen so that each file has a unique ID among all of the files in a distributed system.
 Directory service
  Provides a mapping between text names for files and their UFIDs. Clients may obtain the UFID of a file by quoting its text name to the directory service. The directory service supports the functions needed to generate directories and to add new files to directories.
 Client module
  Runs on each computer and provides an integrated service (flat file and directory) as a single API to application programs. For example, in UNIX hosts, a client module emulates the full set of UNIX file operations.
  It holds information about the network locations of the flat file and directory server processes, and achieves better performance through a cache of recently used file blocks at the client.
Server operations/interfaces for the model
file service
Flat file service:
    Read(FileId, i, n) -> Data        (i = position of first byte)
    Write(FileId, i, Data)            (i = position of first byte)
    Create() -> FileId
    Delete(FileId)
    GetAttributes(FileId) -> Attr
    SetAttributes(FileId, Attr)

Directory service:
    Lookup(Dir, Name) -> FileId
    AddName(Dir, Name, File)
    UnName(Dir, Name)
    GetNames(Dir, Pattern) -> NameSeq

FileId: a unique identifier for files anywhere in the network.
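As a sketch only, the two interfaces could be transcribed into C roughly as below; the operation names come from the slide, while the concrete types (UFID width, attribute fields, error returns) are assumptions:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint8_t bits[16]; } FileId;  /* UFID: a long unique bit string */
    typedef FileId Dir;                           /* a directory is itself a file */
    typedef struct { long length; long modified; /* ... */ } Attr;

    /* Flat file service: operates on file contents via UFIDs. */
    int    Read (FileId f, long i, long n, void *data);       /* i = first byte */
    int    Write(FileId f, long i, long n, const void *data);
    FileId Create(void);
    int    Delete(FileId f);
    int    GetAttributes(FileId f, Attr *attr);
    int    SetAttributes(FileId f, const Attr *attr);

    /* Directory service: maps text names to UFIDs. */
    int    Lookup (Dir d, const char *name, FileId *out);
    int    AddName(Dir d, const char *name, FileId file);
    int    UnName (Dir d, const char *name);
    int    GetNames(Dir d, const char *pattern, char **names, size_t max);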

File Group

 A collection of files that can be located on any server, or moved between servers, while maintaining the same names.
  Similar to a UNIX filesystem
  Helps with distributing the load of file serving between several servers
  File groups have identifiers which are unique throughout the system (and hence, for an open system, they must be globally unique)
  Used to refer to file groups and files
 To construct a globally unique ID we use some unique attribute of the machine on which the group is created (e.g. its IP address), even though the file group may move subsequently.
 File group ID layout: 32 bits (IP address) + 16 bits (date)
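A sketch of constructing such an identifier, assuming the 32-bit field is an IPv4 address and packing the 48-bit result into a 64-bit integer:

    #include <stdint.h>

    /* Pack a 32-bit IPv4 address and a 16-bit date stamp into the 48-bit
     * file group identifier shown on the slide. The exact date encoding
     * is an assumption for illustration. */
    uint64_t make_file_group_id(uint32_t ipv4_addr, uint16_t date)
    {
        return ((uint64_t)ipv4_addr << 16) | date;
    }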
Case Study: Sun (Oracle) NFS

 An industry standard for file sharing on local networks since the 1980s
 An open standard with clear and simple interfaces
 Supports many of the design requirements already mentioned:
 transparency
 heterogeneity
 efficiency
 fault tolerance
 Limited achievement of:
 concurrency
 replication
 consistency
 security

NFS Architecture
[Diagram: an application program on the client computer issues UNIX system calls to the UNIX kernel. A virtual file system layer routes operations on local files to the local UNIX file system and operations on remote files to the NFS client module. The NFS client communicates with the NFS server on the server computer using the NFS protocol (remote operations); the server's virtual file system maps the requests onto its own UNIX file system.]
NFS Architecture

 Does the implementation have to be in the system kernel?
 No:
  there are examples of NFS clients and servers that run at application level, as libraries or processes (e.g. early Windows and MacOS implementations)
 But for a UNIX implementation there are advantages:
  Binary code compatibility: no need to recompile applications
  Standard system calls that access remote files can be routed through the NFS client module by the kernel
  Shared cache of recently used blocks at the client
  A kernel-level server can access i-nodes and file blocks directly
  Security of the encryption key used for authentication
NFS Server Operations

• read(fh, offset, count) -> attr, data
• write(fh, offset, count, data) -> attr
• create(dirfh, name, attr) -> newfh, attr
• remove(dirfh, name) -> status
• getattr(fh) -> attr
• setattr(fh, attr) -> attr
• lookup(dirfh, name) -> fh, attr
• rename(dirfh, name, todirfh, toname)
• link(newdirfh, newname, dirfh, name)
• readdir(dirfh, cookie, count) -> entries
• symlink(newdirfh, newname, string) -> status
• readlink(fh) -> string
• mkdir(dirfh, name, attr) -> newfh, attr
• rmdir(dirfh, name) -> status
• statfs(fh) -> fsstats

fh = file handle: filesystem identifier + i-node number + i-node generation number.

These operations implement the model flat file service (Read, Write, Create, Delete, GetAttributes, SetAttributes) and the model directory service (Lookup, AddName, UnName, GetNames) shown earlier.
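The three file-handle components could be sketched in C as below; note that in the real NFS protocol the handle is an opaque byte array interpreted only by the server, so the field widths here are assumptions:

    #include <stdint.h>

    /* The components of an NFS file handle as listed on the slide. */
    struct nfs_file_handle {
        uint32_t filesystem_id;     /* which exported filesystem */
        uint32_t inode_number;      /* which file within it */
        uint32_t inode_generation;  /* detects reuse of the i-node number */
    };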
NFS Access Control and Authentication

 Stateless server, so the user's identity and access rights must be checked by the server on each request.
 Every client request is accompanied by the userID and groupID.
 Kerberos has been integrated with NFS to provide a stronger and more comprehensive security solution.
Architecture Components 
(UNIX / Linux)
 Server:
  nfsd: NFS server daemon that services requests from clients.
  mountd: NFS mount daemon that carries out the mount request passed on by nfsd.
  rpcbind: RPC port mapper used to locate the nfsd daemon.
  /etc/exports: configuration file that defines which portions of the file systems are exported through NFS, and how (see the example below).
 Client:
  mount: standard file system mount command.
  /etc/fstab: file system table file.
  nfsiod: (optional) local asynchronous NFS I/O server.
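For illustration, a minimal /etc/exports might look like the following; the paths, host patterns and options are hypothetical examples, not taken from the slides:

    # /etc/exports: each line names a directory to export and who may mount it
    /export/people   *.cs.example.edu(rw,sync)
    /export/archive  192.168.1.0/24(ro)

After editing the file, running exportfs -ra on a Linux server re-reads it and re-exports the listed directories.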


Mount Service

 Mount operation:
  mount(remotehost, remotedirectory, localdirectory)
 The server maintains a table of the clients that have mounted filesystems at that server.
 Each client maintains a table of mounted file systems holding:
  <IP address, port number, file handle>
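For example, mounting Server 1's exported directory from the next slide at the client (the server's hostname is illustrative):

    mount -t nfs server1:/export/people /usr/students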

Local and Remote File Systems Accessible on an
NFS Client

[Diagram: Server 1 exports the directory /export/people (containing big, jon, bob); Server 2 exports /nfs/users (containing jim, ann, jane, joe). The client remote-mounts people at its /usr/students and users at its /usr/staff, so, for example, jim on Server 2 appears on the client as /usr/staff/jim.]
Automounter

 The NFS client catches attempts to access 'empty' mount points and routes them to the Automounter.
  The Automounter has a table of mount points, with multiple candidate servers for each.
  It sends a probe message to each candidate server and then uses the mount service to mount the filesystem at the first server to respond.
 Keeps the mount table small.
 Provides a simple form of replication for read-only filesystems.
  E.g. if there are several servers with identical copies of /usr/lib, then each server will have a chance of being mounted at some clients.
NFS Optimization - Server Caching

 Similar to UNIX file caching for local files:
  pages (blocks) from disk are held in a main memory buffer cache until the space is required for newer pages; read-ahead and delayed-write optimizations.
  For local files, writes are deferred to the next sync event (30-second intervals).
  This works well in the local context, where files are always accessed through the local cache, but in the remote case it doesn't offer the necessary synchronization guarantees to clients.
 NFS v3 servers offer two strategies for updating the disk (sketched below):
  write-through: altered pages are written to disk as soon as they are received at the server.
  delayed commit: pages are held only in the cache until a commit() call is received for the relevant file. This is the default mode used by NFS v3 clients. A commit() is issued by the client whenever a file is closed.
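A sketch of the two strategies in C, assuming hypothetical helpers cache_store and flush_to_disk; the real NFS v3 protocol additionally lets each write request carry its own stability flag:

    /* Hypothetical helpers (assumed, not part of NFS): */
    void cache_store(int fh, long offset, long count, const void *data);
    void flush_to_disk(int fh);

    enum update_mode { WRITE_THROUGH, DELAYED_COMMIT };

    /* Handle a client write under one of the two strategies. */
    void handle_write(int fh, long offset, long count, const void *data,
                      enum update_mode mode)
    {
        cache_store(fh, offset, count, data);  /* buffer the pages in memory */
        if (mode == WRITE_THROUGH)
            flush_to_disk(fh);                 /* persist before replying */
    }

    /* Issued by the client when it closes the file (delayed commit mode). */
    void handle_commit(int fh)
    {
        flush_to_disk(fh);                     /* persist all cached pages now */
    }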

NFS Optimization - Client Caching

 Server caching does nothing to reduce the traffic between client and server:
  further optimization is essential to reduce server load in large networks
  the NFS client module caches the results of read, write, getattr, lookup and readdir operations
  synchronization of file contents (one-copy semantics) is not guaranteed when two or more clients are sharing the same file
 Timestamp-based validity check:
  reduces inconsistency, but doesn't eliminate it
  validity condition for a cache entry at the client:
   (T - Tc < t) ∨ (Tm_client = Tm_server)
  where T is the current time, Tc the time when the cache entry was last validated, Tm the time when the block was last updated (as recorded at the client and at the server), and t the freshness interval
  t is configurable (per file) but is typically set to 3 seconds for files and 30 seconds for directories
  it remains difficult to write distributed applications that share files with NFS
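The validity check can be sketched in C as follows; obtaining Tm_server requires a getattr call to the server, which is passed in here as a parameter for simplicity:

    #include <stdbool.h>
    #include <time.h>

    struct cache_entry {
        time_t Tc;         /* when this entry was last validated */
        time_t Tm_client;  /* server's last-modification time, as cached */
        double t;          /* freshness interval: ~3 s files, ~30 s dirs */
    };

    bool entry_is_valid(struct cache_entry *e, time_t Tm_server)
    {
        time_t T = time(NULL);                 /* current time */
        if (difftime(T, e->Tc) < e->t)         /* (T - Tc < t): still fresh, */
            return true;                       /* no server round trip needed */
        if (e->Tm_client == Tm_server) {       /* (Tm_client = Tm_server) */
            e->Tc = T;                         /* revalidated: reset Tc */
            return true;
        }
        return false;                          /* stale: refetch the block */
    }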
Summary

 Distributed File Systems provide the illusion of a local file system and hide complexity from end users.
 Sun NFS is an excellent example of a distributed service designed to meet many important design requirements.
 Effective client caching can produce file service performance equal to or better than that of local file systems.
 Consistency versus update semantics versus fault tolerance remains an issue.
 Most client and server failures can be masked.
