
Parallel and Distributed Computing

Quiz #3
ID: 203333
M.Raees Khan

Question No. 1:

Compare and contrast Shortest Job First (SJF) and Shortest Remaining Time First (SRTF) scheduling algorithms. When would you prefer one over the other?

Shortest Remaining Time First (SRTF) is the preemptive version of Shortest Job First (SJF) scheduling. Under SRTF, the process with the smallest amount of time remaining until completion is selected to execute next.

In other words, SRTF schedules processes according to their shortest remaining time.

However, SRTF involves more overhead than SJF scheduling, because the operating system must frequently monitor the remaining CPU time of the jobs in the ready queue and perform a context switch whenever a job with a shorter remaining time arrives.

In the SRTF scheduling algorithm, the execution of a running process can be interrupted at any time. On the arrival of each new process, the short-term scheduler selects, from the running process and the processes in the ready queue, the one with the least remaining burst time.

Once all processes have arrived in the ready queue, no further preemption occurs and the algorithm behaves exactly like SJF scheduling. When a process is preempted and the next process is scheduled, the preempted process's context is saved in its Process Control Block (PCB); the PCB is read back the next time that process executes.

Advantages of SRTF

The main advantage of the SRTF algorithm is that it completes jobs faster than the SJF algorithm, i.e., it yields a lower average waiting time, provided its scheduling overhead is not counted.
Disadvantages of SRTF

In SRTF, context switching occurs far more often than in SJF, and each switch consumes valuable CPU time. This overhead adds to the total processing time and can erode the algorithm's fast-processing advantage.

When to prefer one over the other: SJF is preferable in batch environments where burst times are known in advance and context-switch overhead matters, since it completes each job without interruption. SRTF is preferable in interactive or time-sharing environments where short jobs keep arriving and responsiveness to them is the priority.
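The comparison above can be illustrated with a small simulation. This is a minimal sketch (the function names and the sample workload are invented for this example, not part of the quiz) that computes the average waiting time of the same job set under non-preemptive SJF and preemptive SRTF:

```python
# Minimal sketch: compare average waiting time under non-preemptive SJF
# and preemptive SRTF. Jobs are (arrival_time, burst_time) pairs.

def sjf(jobs):
    """Non-preemptive SJF: once started, a job runs to completion."""
    time, waiting, pending = 0, 0, sorted(jobs)
    while pending:
        ready = [j for j in pending if j[0] <= time]
        if not ready:                          # CPU idle until next arrival
            time = min(j[0] for j in pending)
            continue
        job = min(ready, key=lambda j: j[1])   # shortest burst among arrived
        pending.remove(job)
        waiting += time - job[0]               # waited from arrival to start
        time += job[1]
    return waiting / len(jobs)

def srtf(jobs):
    """Preemptive SRTF, simulated one time unit at a time."""
    remaining = {i: burst for i, (_, burst) in enumerate(jobs)}
    time, waiting = 0, 0
    while remaining:
        ready = [i for i in remaining if jobs[i][0] <= time]
        if not ready:                          # CPU idle until next arrival
            time += 1
            continue
        i = min(ready, key=lambda j: remaining[j])  # least remaining time
        remaining[i] -= 1
        if remaining[i] == 0:
            del remaining[i]
        waiting += len(ready) - 1              # all other ready jobs wait
        time += 1
    return waiting / len(jobs)

workload = [(0, 8), (1, 4), (2, 9), (3, 5)]
print(sjf(workload), srtf(workload))           # 7.75 6.5
```

On this workload SRTF achieves the lower average waiting time (6.5 vs. 7.75 time units), but the simulation deliberately ignores context-switch cost, which is exactly the overhead that narrows SRTF's advantage in practice.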

Question No. 2:

Explain the concept of data replication in distributed storage systems. What are the benefits and challenges of data replication?

Data replication in distributed storage systems involves storing multiple copies of the same data across
different nodes or storage devices within the distributed system. Each copy of the data is referred to as a
replica, and the replication process ensures redundancy and fault tolerance by maintaining multiple
synchronized copies of data.
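As a concrete illustration, here is a hypothetical in-memory sketch (the `ReplicatedStore` class and the node names are invented for this example, not from any particular system) in which every write is applied synchronously to all replicas, so a read can be served by any surviving node:

```python
# Illustrative sketch of synchronous replication across nodes.
# Each node holds a full replica of the key-value data.

class ReplicatedStore:
    def __init__(self, node_names):
        self.replicas = {name: {} for name in node_names}  # one dict per node

    def put(self, key, value):
        # Synchronous replication: the write is applied to every replica
        # before it is considered acknowledged.
        for store in self.replicas.values():
            store[key] = value

    def get(self, key, failed=()):
        # Serve the read from the first reachable replica -- this is what
        # gives replication its fault tolerance.
        for name, store in self.replicas.items():
            if name not in failed and key in store:
                return store[key]
        raise KeyError(key)

kv = ReplicatedStore(["node-a", "node-b", "node-c"])
kv.put("user:42", "Alice")
print(kv.get("user:42", failed={"node-a"}))  # still readable: Alice
```

Even with `node-a` down, the read succeeds from another replica, which is the fault-tolerance benefit described below; the cost is that every write touches all three nodes.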

Benefits of Data Replication:

Fault Tolerance: Data replication enhances system reliability and fault tolerance by providing redundancy.
If one replica becomes unavailable due to node failure or data corruption, the system can continue to
operate using other available replicas.

Improved Availability: Replicating data across multiple nodes ensures high availability of data, even in the
presence of node failures or network partitions. Clients can access data from nearby replicas, reducing
access latency and improving responsiveness.

Load Balancing: Distributing data replicas across multiple nodes helps distribute read and write
operations, reducing the load on individual nodes and improving overall system performance.

Enhanced Scalability: Data replication facilitates horizontal scalability by allowing the distributed system
to scale out by adding more nodes. As the system grows, additional replicas can be created to
accommodate increased data storage and access demands.

Geographic Redundancy: Replicating data across geographically distributed locations provides
geographic redundancy, enabling disaster recovery, data localization, and compliance with regulatory
requirements.

Challenges of Data Replication:

Consistency Management: Maintaining consistency among replicas poses a significant challenge in distributed storage systems. Ensuring that all replicas are synchronized and reflect the latest updates requires coordination mechanisms, such as distributed consensus protocols or replication protocols.

Storage Overhead: Replicating data incurs storage overhead, as multiple copies of the same data need to
be stored across different nodes. This can lead to increased storage costs and resource utilization,
especially for large-scale distributed systems.

Data Access Latency: Accessing data from remote replicas may introduce additional latency, especially in
geographically distributed systems. Strategies such as data locality optimization and caching can help
mitigate latency issues.

Network Bandwidth Usage: Data replication requires transferring data between nodes over the network,
leading to increased network bandwidth usage. Network congestion and latency can impact replication
performance and system throughput.

Consistency-Performance Trade-offs: Achieving strong consistency guarantees across replicas may incur
performance penalties, such as increased latency or reduced throughput. Balancing consistency
requirements with performance considerations is essential in designing replication protocols.
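One common way to navigate this trade-off is quorum replication: with N replicas, a write must be acknowledged by W nodes and a read must contact R nodes, and choosing R + W > N guarantees that every read set overlaps the latest write set. The sketch below (invented helper names, with versioning simplified to an integer counter) illustrates the rule:

```python
# Sketch of the quorum rule used to balance consistency and performance.
# With N replicas, read quorum R and write quorum W, the condition
# R + W > N guarantees every read overlaps the most recent write.

def quorum_is_consistent(n, r, w):
    """True when any R-node read set must intersect any W-node write set."""
    return r + w > n

def quorum_read(responses):
    """Resolve a quorum read by keeping the record with the highest version."""
    return max(responses, key=lambda rec: rec["version"])["value"]

print(quorum_is_consistent(3, 2, 2))   # True:  strong consistency, slower
print(quorum_is_consistent(3, 1, 1))   # False: fast, but reads may be stale

# An R=2 read that happens to include one lagging replica still returns
# the newest value, because at least one responder saw the latest write.
responses = [{"version": 2, "value": "new"},
             {"version": 1, "value": "old"}]
print(quorum_read(responses))          # new
```

Lowering R or W improves latency and throughput at the price of possibly stale reads, which is precisely the consistency-performance trade-off described above.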
