
Andrew File System (AFS)

Andrew was a project of Carnegie Mellon University (CMU) to develop a
distributed computing environment on the Carnegie Mellon campus in the
mid-1980s.

Module 4
AFS
• AFS provides transparent access to remote shared files
for UNIX programs running on workstations.
• AFS is most commonly used in academic and research
environments.
• Introduced by researchers at CMU in the 1980s.
• Goals of AFS:
– Create a scalable DFS to support as many clients as possible.
– Design and implement the protocol between clients and
servers in a DFS environment.
– Implement simple cache consistency.
– Overcome the limited scalability of NFS.
(NFS forces clients to check with the server periodically to
determine whether cached contents have changed; each check uses
server resources (CPU, bandwidth). Frequent checks therefore limit
the number of clients a server can handle.)
Andrew File System (AFS)

• The key strategy for achieving scalability is the
caching of whole files in client nodes.
• Design characteristics of AFS:
– Whole-file serving: the entire contents of directories
and files are transferred from server to client.
– Whole-file caching: when a file is transferred to a
client, it is stored on that client’s local disk.
Caching the entire file on the local machine
reduces file-open latency and avoids frequent
read/write requests to the server.
AFS: How it works
• All the files in AFS are distributed among the servers.
• Basic idea is whole-file caching on the local disk of
the client machine.
– When you open() a file, the entire file is fetched from
the server and stored on your local disk.
– Subsequent read() and write() operations are done
locally without any network communication.
– In other words, the cached file is treated like a file
in a local file system (LFS).
– Locally, the file is cached as blocks in client memory.
– Once a close() command is issued, the file (if modified)
is flushed back to the server.
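The open/read/write/close sequence above can be sketched in a few lines. This is a minimal illustration, not the AFS implementation: the `FileServer` and `CachedFile` classes, their method names, and the file name are hypothetical, and a temporary directory stands in for the client's local disk cache.

```python
import os
import tempfile


class FileServer:
    """Hypothetical stand-in for a file server: name -> whole-file contents."""

    def __init__(self):
        self.files = {}
        self.fetches = 0   # number of whole-file fetches served
        self.stores = 0    # number of whole-file write-backs received

    def fetch(self, name):
        self.fetches += 1
        return self.files[name]

    def store(self, name, data):
        self.stores += 1
        self.files[name] = data


class CachedFile:
    """A file opened through the client cache (hypothetical helper)."""

    def __init__(self, server, name, cache_dir):
        self.server = server
        self.name = name
        self.path = os.path.join(cache_dir, name)
        self.dirty = False
        # open(): fetch the entire file and store it on the local disk.
        with open(self.path, "wb") as f:
            f.write(server.fetch(name))

    def read(self):
        # read(): served from the local copy, no server contact.
        with open(self.path, "rb") as f:
            return f.read()

    def write(self, data):
        # write(): changes only the local copy.
        with open(self.path, "wb") as f:
            f.write(data)
        self.dirty = True

    def close(self):
        # close(): flush back to the server only if the file was modified.
        if self.dirty:
            self.server.store(self.name, self.read())
            self.dirty = False


server = FileServer()
server.files["notes.txt"] = b"v1"
with tempfile.TemporaryDirectory() as cache_dir:
    f = CachedFile(server, "notes.txt", cache_dir)
    first = f.read()
    f.write(b"v2")
    second = f.read()
    stores_before_close = server.stores
    f.close()
```

In this sketch the server sees exactly one fetch (at open) and one store (at close), however many reads and writes happen in between.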
AFS Architecture
AFS Implementation
• AFS is implemented as two software
components that exist as UNIX processes
called Vice and Venus
• The key software components in AFS are:
• Vice: The server-side process that runs on
top of the UNIX kernel, providing shared file
services to each client.
• Venus: The client-side cache manager, which
acts as an interface between the application
program and Vice. Venus corresponds to
the client module.
AFS client
• Functionalities of Venus:
– The responsibilities of Venus include:
1. Retrieving files from servers,
2. Maintaining a local file cache,
3. Translating file requests into remote procedure calls,
and
4. Storing callbacks.
AFS client
• The strategy chosen for AFS was to cache files locally
on the clients.
• When a file is opened,
– Venus first checks whether there is a valid copy of the file in
the cache.
– If this is not the case, the file is retrieved from a file server.
– A file that has been modified and closed on the client is
transferred back to the server.
• Over time, the client builds up a "working set" of often-
accessed files in the cache. A typical optimal cache
size is 100 MB, but this depends on what the client is
used for.
• As long as the cached files are not modified by other
users, they do not have to be fetched from the server
when subsequently accessed.
• This reduces the load on the network significantly.
AFS client – Call back mechanism
& Cache consistency
• Volumes: An AFS volume is a logical unit
of disk space that functions like a
container for the files in an AFS directory,
keeping them all together on one partition
of a file server machine.
• To make a volume's contents visible in
the cell's file tree and accessible to users,
you mount the volume at a directory
location in the AFS filespace.
• In AFS, the representation of fids includes the
volume number of the volume containing the
file (compare the file group identifier in UFIDs).

• The Vice servers accept requests only in
terms of fids.
• Venus translates the pathnames supplied by
clients into fids using a step-by-step lookup to
obtain the information from the file
directories held in the Vice servers.
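The step-by-step translation from pathname to fid can be sketched as follows. This is an illustration under stated assumptions: the `Fid` layout (a volume number plus a per-volume file number called `vnode` here), the `DIRECTORIES` table, and the function names are hypothetical and chosen only to show the lookup loop.

```python
from typing import NamedTuple


class Fid(NamedTuple):
    volume: int  # volume number of the volume containing the file
    vnode: int   # hypothetical per-volume file number, for illustration


# Hypothetical directory contents held by the servers: directory fid -> {name: fid}.
DIRECTORIES = {
    Fid(1, 1): {"usr": Fid(1, 2)},
    Fid(1, 2): {"alice": Fid(7, 1)},       # "alice" lives in another volume
    Fid(7, 1): {"notes.txt": Fid(7, 5)},
}
ROOT = Fid(1, 1)


def fetch_directory(fid):
    """Stand-in for asking a server for a directory, addressed only by fid."""
    return DIRECTORIES[fid]


def resolve(path):
    """Translate a pathname into a fid, one path component at a time."""
    fid = ROOT
    for component in path.strip("/").split("/"):
        if component:
            fid = fetch_directory(fid)[component]
    return fid


result = resolve("/usr/alice/notes.txt")
```

Note that the server-side stand-in never sees a pathname: every request names a directory by fid, and only the client walks the path.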
AFS client – Call back mechanism
& Cache consistency
• To ensure that files are kept up-to-date, a callback mechanism is used.

• When Vice supplies a copy of a file to a Venus process, it also provides
a callback promise.
• A callback is a remote procedure call from a server to a Venus process.
• Callback promise: a token issued by the Vice server that is the
custodian of the file, guaranteeing that it will notify the Venus process
when any other client modifies the file.
• Callback promises are stored with the cached files on the workstation
(client) disks and have two states: valid or cancelled.
• When a server performs a request to update a file, it notifies all of the
Venus processes to which it has issued callback promises by sending
a callback to each.
• When the Venus process receives a callback, it sets the callback
promise token for the relevant file to cancelled.
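A small in-memory sketch of the callback promise bookkeeping described above. The class and method names are hypothetical, direct method calls stand in for remote procedure calls, and the choice to leave the writer's own promise valid after an update is an assumption of this sketch.

```python
VALID, CANCELLED = "valid", "cancelled"


class Venus:
    """Client side: cached files plus one promise token per file."""

    def __init__(self, name):
        self.name = name
        self.cache = {}     # file name -> cached contents
        self.promise = {}   # file name -> VALID or CANCELLED

    def callback(self, filename):
        # Invoked by the server: mark the promise for this file cancelled.
        if filename in self.promise:
            self.promise[filename] = CANCELLED


class Vice:
    """Server side: remembers which clients hold a promise for each file."""

    def __init__(self):
        self.files = {}
        self.promises = {}  # file name -> set of Venus objects

    def fetch(self, venus, filename):
        # Supplying a copy of the file also issues a callback promise.
        self.promises.setdefault(filename, set()).add(venus)
        venus.cache[filename] = self.files[filename]
        venus.promise[filename] = VALID

    def store(self, writer, filename, data):
        # An update triggers a callback to every other promise holder.
        self.files[filename] = data
        holders = self.promises.get(filename, set())
        for venus in holders - {writer}:
            venus.callback(filename)
        # Assumption: only the writer's promise remains outstanding.
        self.promises[filename] = holders & {writer}


vice = Vice()
vice.files["F"] = "v1"
c1, c2 = Venus("C1"), Venus("C2")
vice.fetch(c1, "F")
vice.fetch(c2, "F")
c2.cache["F"] = "v2"          # C2 modifies its local copy
vice.store(c2, "F", "v2")     # ... and writes it back on close
```

After the store, C1 still holds the old contents, but its token is cancelled, so it knows not to use them without refetching.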
AFS client – Call back mechanism
& Cache consistency
• Whenever Venus handles an open on behalf of
a client,
– it checks the cache.
– If the required file is found in the cache, then its
token is checked.
– If its value is cancelled, then a fresh copy of the
file must be fetched from the Vice server,
– but if the token is valid, then the cached copy can
be opened and used without reference to Vice.
• Update semantics: the callback promise
mechanism maintains a well-defined
approximation to one-copy semantics.
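The decision Venus makes on each open can be written as a single function. This is a sketch only: the cache layout (a dict of entries with `data` and `token` fields) and the `fetch_from_vice` parameter are hypothetical names introduced for illustration.

```python
def venus_open(cache, fetch_from_vice, name):
    """Return (contents, source), where source is "cache" or "vice"."""
    entry = cache.get(name)
    if entry is not None and entry["token"] == "valid":
        # Valid token: use the cached copy without contacting Vice.
        return entry["data"], "cache"
    # Missing file or cancelled token: fetch a fresh copy, which
    # comes with a new (valid) callback promise.
    data = fetch_from_vice(name)
    cache[name] = {"data": data, "token": "valid"}
    return data, "vice"


server_files = {"F": "v2"}
cache = {"F": {"data": "v1", "token": "cancelled"}}

first_open = venus_open(cache, server_files.__getitem__, "F")
second_open = venus_open(cache, server_files.__getitem__, "F")
```

The first open finds a cancelled token and goes to the server; the second finds the refreshed entry and is served locally.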
AFS server
Crash recovery
Client recovery:
Imagine the following scenario (server S with two clients, C1 and C2):
• C1, C2, and S all hold the same copy of file F.
• C1 reboots.
• Meanwhile, C2 updates file F and flushes the new
version of the file back to S.
• S tries to contact C1 via a callback to tell C1 to
invalidate its copy of F. However, because C1 is rebooting, it
cannot receive the invalidation message from S.
• Once C1 is back online:
– C1 should treat all its cached contents as suspect.
– Thus, C1 sends a TestAuth protocol message to S to check
whether its copy of F is still valid for use;
– if not, C1 fetches the newer version of F from S.
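The revalidation step after a client reboot might look like the sketch below. It assumes, purely for illustration, that each cached entry records a version number the server can compare against; the function name, the cache layout, and the `server_version` and `fetch` parameters are hypothetical and do not describe the actual TestAuth message format.

```python
def revalidate_after_reboot(cache, server_version, fetch):
    """Check every cached entry with the server; refetch the stale ones.

    cache: file name -> {"data": ..., "version": ...}
    Returns the list of file names that had to be refetched.
    """
    refetched = []
    for name, entry in cache.items():
        current = server_version(name)      # hypothetical validity check
        if entry["version"] != current:
            entry["data"] = fetch(name)     # copy was stale: get the newer one
            entry["version"] = current
            refetched.append(name)
    return refetched


# Server state after C2's update: F is now at version 2, G is unchanged.
server = {"F": (2, "new contents"), "G": (1, "same contents")}
# C1's cache as it was before the reboot.
c1_cache = {
    "F": {"data": "old contents", "version": 1},
    "G": {"data": "same contents", "version": 1},
}

stale = revalidate_after_reboot(
    c1_cache,
    server_version=lambda name: server[name][0],
    fetch=lambda name: server[name][1],
)
```

Only F is refetched; G passes the check and continues to be served from the cache.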
• Server recovery
– When a crashed server reboots, it has no idea
which client has which files.
– Each client must realize that the server has
crashed and treat all of its cached
contents as suspect.
• This can be implemented in two ways:
– The server sends a message to all clients (“Don’t trust
your cache contents”) after it recovers from the
crash.
– Clients periodically check that the server is alive.
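The second option, where clients check on the server periodically, could be sketched as below. The `boot_id` counter that changes on every restart is an assumption made for this example (the text does not say how a client tells a restarted server from one that stayed up), and all class and method names are hypothetical.

```python
class Server:
    def __init__(self):
        self.boot_id = 1          # hypothetical: changes on every restart

    def crash_and_reboot(self):
        self.boot_id += 1         # callback state is lost across the restart

    def ping(self):
        return self.boot_id


class Client:
    def __init__(self, server):
        self.server = server
        self.seen_boot_id = server.ping()
        self.cache_trusted = True

    def heartbeat(self):
        """Periodic liveness check; a new boot id means the server restarted."""
        boot_id = self.server.ping()
        if boot_id != self.seen_boot_id:
            self.seen_boot_id = boot_id
            self.cache_trusted = False   # treat all cached contents as suspect


server = Server()
client = Client(server)
client.heartbeat()
trusted_before_crash = client.cache_trusted
server.crash_and_reboot()
client.heartbeat()
trusted_after_crash = client.cache_trusted
```

The trade-off between the two options is who does the work: a broadcast from the server requires it to know its clients, which is exactly the state it lost, while periodic checks cost every client a small recurring request.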
