
Questions on Google File System

Q1: Why was GFS created when there were so many other file systems already? What is the
rationale behind creating such a file system?

Ans1: GFS shares many of the same goals as earlier distributed file systems, such as performance,
scalability, reliability, and availability. However, its design was driven by observations of
Google’s application workloads and technological environment, both current and anticipated. This
reflected a marked departure from some earlier file system assumptions and led Google to
reexamine traditional choices and explore radically different design points.

Following are the main considerations that Google took into account while designing GFS:

1.) Component failures are quite common
a. While designing GFS, Google kept in mind that component failures are the norm
rather than the exception. The file system consists of hundreds or even thousands
of storage machines built from inexpensive commodity parts and is accessed by a
comparable number of client machines. The quantity and quality of the components
virtually guarantee that some are not functional at any given time and some will not
recover from their current failures. Google had seen problems caused by application
bugs, operating system bugs, human errors, and the failures of disks, memory,
connectors, networking, and power supplies. Therefore, constant monitoring, error
detection, fault tolerance, and automatic recovery must be integral to the system.
2.) Files are huge by traditional standards
a. Multi-GB files are common. Each file typically has many application objects such
as web documents. When we continually work with fast growing data sets of many
TBs comprising billions of objects, it is unwieldy to manage billions of approx.
KB-sized files even when the file system could support it. As a result, design
assumptions and parameters such as I/O operation and block sizes must be revisited.
3.) Most files are mutated by appending new data rather than overwriting existing data
a. Random writes within a file are practically non-existent. Once written, the files are
only read, and often only sequentially. A variety of data share these characteristics.
Some may constitute large repositories that data analysis programs scan through.
Some may be data streams continuously generated by running applications. Some
may be archival data. Some may be intermediate results produced on one machine
and processed on another, whether simultaneously or later in time. Given this
access pattern on huge files, appending becomes the focus of performance
optimization and atomicity guarantees, while caching data blocks in the client loses
its appeal.
4.) Co-designing the applications and the file system API benefits the overall system by
increasing our flexibility
a. Google relaxed GFS’s consistency model to vastly simplify the file system without
imposing an onerous burden on the applications. They also introduced an atomic
append operation so that multiple clients can append concurrently to a file without
extra synchronization between them. Multiple GFS clusters are currently deployed
for different purposes. The largest ones have over 1000 storage nodes, over 300 TB
of disk storage, and are heavily accessed by hundreds of clients on distinct
machines on a continuous basis.
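The benefit of atomic record append is easiest to see from the client side. Below is a minimal sketch of how several producers might append to the same file concurrently; the client object and its open()/record_append() calls are assumed names for illustration, not the real GFS client API. GFS chooses the offset and guarantees each record is appended atomically at least once, so the producers need no locking among themselves.

```python
# Hypothetical sketch of concurrent record append; the client object and its
# open()/record_append() methods are assumed names, not the real GFS API.
import threading

def producer(client, worker_id, num_records):
    # Every worker appends to the SAME file; GFS picks the offset and
    # guarantees each record is appended atomically at least once.
    f = client.open("/logs/crawl-results")
    for i in range(num_records):
        record = f"worker={worker_id} result={i}\n".encode()
        offset = f.record_append(record)  # no locks shared between workers
        print(f"worker {worker_id} appended record {i} at offset {offset}")

def run_producers(client, num_workers=8):
    threads = [threading.Thread(target=producer, args=(client, w, 100))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```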

Comparison of the Google File System with other systems:

i) GFS provides a location-independent namespace, which enables data to be moved
transparently for load balancing and fault tolerance, like the Andrew File System (AFS).

ii) GFS spreads a file’s data across multiple storage servers, unlike AFS, which serves a file from a single server

iii) GFS relies on simple replication for redundancy rather than more sophisticated RAID schemes

iv) GFS does not provide caching below the file system interface, while the Sun Network File
System (NFS) does.

v) GFS uses a single, centralized master rather than a fully distributed one.

vi) HDFS (the Hadoop Distributed File System) is an open-source implementation of the GFS
design written in Java. It follows the same overall design, but differs in supported features
and implementation details:
a.) Does not support random writes
b.) Does not support appending to existing files
c.) Does not support multiple concurrent writers

Q2: Why is GFS extremely scalable?

Ans2: Modularity allows GFS to expand easily to accommodate increasing amounts of data and
users. The paper states that the largest clusters at the time stored approximately 300 TB of data;
however, the system is designed so that more chunkservers can be added without significantly
modifying the master server. Further, the decentralized method of data access, in which clients
interact primarily with chunkservers rather than the master, avoids a significant bottleneck at the
master. The combination of extensibility and sustained performance across growing numbers of
users and data means that the system is highly scalable within the Google context.

Q3: Explain the key components of the GFS architecture.

Ans3: GFS architecture and components:

GFS is composed of clusters, where a cluster is a set of networked computers. A GFS cluster contains
three types of interdependent entities: clients, a master, and chunkservers. Clients are computers
or applications that manipulate existing files or create new files on the system. The master server
is the orchestrator or manager of the cluster; it maintains the file system metadata and the
operation log. The operation log records the metadata changes made by the master, which helps keep
service interruptions to a minimum. At startup, the master retrieves information about the chunks
each chunkserver holds, and thereafter it keeps track of the locations of chunks within the cluster.
The GFS architecture keeps the messages that the master sends and receives very small; the master
itself does not handle file data at all, as that is done by the chunkservers. Chunkservers are the
core engine of GFS: they store file data in fixed-size chunks of 64 MB, coordinate with the master,
and send requested chunks to clients directly.
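To make the separation of control and data flow concrete, here is a minimal sketch of the read path described above, assuming hypothetical master and chunkserver stubs (find_chunk() and read_chunk() are illustrative names, not the actual GFS RPC interface): the client turns a byte offset into a chunk index, asks the master only for the chunk handle and replica locations, and then reads the data directly from a chunkserver.

```python
# Illustrative GFS-style read path; the master/chunkserver stubs and their
# methods are assumptions, not the real GFS RPC interface.
CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64 MB chunks

def read(master, filename, offset, length):
    # 1. Translate (filename, byte offset) into a chunk index.
    chunk_index = offset // CHUNK_SIZE

    # 2. Ask the master ONLY for metadata: the chunk handle and the
    #    locations of its replicas. No file data flows through the master.
    handle, replica_locations = master.find_chunk(filename, chunk_index)

    # 3. Fetch the data directly from one of the chunkservers.
    chunkserver = replica_locations[0]
    chunk_offset = offset % CHUNK_SIZE
    return chunkserver.read_chunk(handle, chunk_offset, length)
```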

GFS replicas: Each chunk has a primary replica and secondary replicas. The primary replica is
the one to which the master grants a lease for the chunk; secondary replicas serve as additional
copies on other chunkservers. The master decides which replica acts as the primary and which act as
secondaries. When a client mutates the data in a chunk, the mutation is applied at the primary and
then propagated to the chunkservers holding the secondary replicas, so that every replica reflects
the chunk’s current state.
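A rough sketch of how a mutation reaches every replica, continuing the same illustrative style (the replica objects and the apply_mutation() method are invented for this example, not the actual GFS write protocol): the primary, which holds the lease, applies the write in a chosen order, and the same ordered write is then applied at each secondary, so all replicas of the chunk converge to the same state.

```python
# Simplified illustration of mutation propagation to replicas; the objects
# and method names are assumptions, not the actual GFS write protocol.
def mutate_chunk(primary, secondaries, handle, offset, data):
    # The primary (lease holder) assigns a serial order and applies the write.
    serial_no = primary.apply_mutation(handle, offset, data)

    # The same mutation, in the same serial order, is applied at every
    # secondary so that all replicas of the chunk end up identical.
    for secondary in secondaries:
        secondary.apply_mutation(handle, offset, data, serial_no=serial_no)
```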

Q4: How does GFS handle a node failure?

Ans4: In GFS, each chunk is replicated on 3 different chunkservers by default, and the replication
factor can be increased later. So, in case of a node failure, there are still at least 2 nodes
holding the same data as the failed node. A node failure is detected when a chunkserver (the
datanode in HDFS terms) fails to send its heartbeat to the GFS master; when that happens, the
master (the namenode in HDFS terms) decreases the replica count for each of that server’s chunks
and re-replicates them on other chunkservers.
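This failure-handling logic can be sketched roughly as follows; it is a simplified model, and the class, method names, and timeout value are assumptions rather than GFS internals. The master marks a chunkserver dead when its heartbeat is overdue, decrements the replica count of every chunk that server held, and schedules re-replication for any chunk that has fallen below the default target of 3 replicas.

```python
import time

REPLICATION_TARGET = 3    # default replication factor
HEARTBEAT_TIMEOUT = 60    # seconds without a heartbeat before a node is
                          # considered dead (an assumed value)

class MasterFailureDetector:
    """Simplified model of master-side failure handling; not GFS internals."""

    def __init__(self):
        self.last_heartbeat = {}   # chunkserver id -> last heartbeat time
        self.chunks_on = {}        # chunkserver id -> set of chunk handles
        self.replica_count = {}    # chunk handle -> live replica count

    def on_heartbeat(self, server_id, held_chunks):
        self.last_heartbeat[server_id] = time.time()
        self.chunks_on[server_id] = set(held_chunks)

    def detect_failures(self):
        now = time.time()
        for server_id, last_seen in list(self.last_heartbeat.items()):
            if now - last_seen > HEARTBEAT_TIMEOUT:
                self.handle_failure(server_id)

    def handle_failure(self, dead_server):
        # The failed server's replicas no longer count; any chunk that drops
        # below the target is queued for re-replication elsewhere.
        for handle in self.chunks_on.pop(dead_server, set()):
            count = self.replica_count.get(handle, REPLICATION_TARGET) - 1
            self.replica_count[handle] = count
            if count < REPLICATION_TARGET:
                self.schedule_re_replication(handle)
        self.last_heartbeat.pop(dead_server, None)

    def schedule_re_replication(self, handle):
        # Placeholder: a real master would copy the chunk from a surviving
        # replica onto a newly chosen chunkserver.
        print(f"re-replicating chunk {handle}")
```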

Master state is also replicated for reliability on multiple machines, using the operation log and
checkpoints.

i) If the master fails, GFS can start a new master process on any of these replicas and update
the DNS alias accordingly

ii) “Shadow” masters also provide read-only access to the file system, even when the primary
master is down

a. They read a replica of the operation log and apply the same sequence of changes
b. They are not mirrors of the master; they lag the primary master by fractions of a second
c. This means clients can still read (possibly slightly stale) file contents while the master is in recovery
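A shadow master essentially replays the primary’s operation log against its own copy of the metadata and serves only reads, as in the rough sketch below (the data structures, operation names, and methods are illustrative assumptions, not GFS internals):

```python
# Rough sketch of a read-only "shadow" master; names are illustrative only.
class ShadowMaster:
    def __init__(self):
        self.namespace = {}     # file path -> list of chunk handles
        self.applied_upto = 0   # number of log records applied so far

    def apply_log(self, operation_log):
        # Replay the primary master's operation log in order; the shadow may
        # lag the primary by the few records it has not yet seen.
        for record in operation_log[self.applied_upto:]:
            if record["op"] == "create":
                self.namespace[record["path"]] = []
            elif record["op"] == "add_chunk":
                self.namespace[record["path"]].append(record["handle"])
            self.applied_upto += 1

    def lookup(self, path):
        # Read-only access: clients can still resolve a file to its chunk
        # handles while the primary master is recovering.
        return self.namespace.get(path)
```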
