
Distributed Databases and

Client Server Architectures


Distributed Database Concept
A distributed database stores data at multiple computers or sites that are
linked to one another across a network. Put another way, it is a collection of
logically related databases held on separate computers and connected by data
communication links. Distributed database systems can provide greater
availability and reliability than centralized database systems, because the
system can still work even if one or more sites are unavailable. By splitting
the workload and the data over several sites, a distributed database system
may also operate more efficiently.

Accessing Distributed Databases


Different applications can be used by users to access a DDBMS, depending
on their data needs. Applications can be divided into two categories: local
and global.

Local applications: These applications run entirely on the data present at the
local site and do not require data from other sites. For instance, a local
application may produce reports from sales information kept on-site.

Global applications: To complete their tasks, global applications need data from
many sites. These programs are made to access and work with data that is
spread across several sites. For instance, a global application might compile
data from several regional databases to analyze user behavior.
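
The difference between the two categories can be sketched in a few lines of Python. This is only an illustration: the site names, products, and figures in SITES are hypothetical, and a real DDBMS would reach remote sites over the network rather than through a dictionary.

```python
# Hypothetical per-site sales data; names and figures are illustrative only.
SITES = {
    "delhi": [("pen", 120), ("book", 300)],
    "mumbai": [("pen", 80), ("bag", 450)],
    "chennai": [("book", 150)],
}


def local_sales_report(site):
    """Local application: reads only the data stored at one site."""
    return sum(amount for _, amount in SITES[site])


def global_sales_report():
    """Global application: combines data held at every site."""
    totals = {}
    for rows in SITES.values():
        for product, amount in rows:
            totals[product] = totals.get(product, 0) + amount
    return totals


print(local_sales_report("delhi"))  # 420
print(global_sales_report())        # {'pen': 200, 'book': 450, 'bag': 450}
```

The local report touches a single site, while the global report has to visit all of them, which is where network cost and site availability start to matter.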

Operations on Distributed Database


Distributed databases enable a wide range of operations to alter and manage
data, just like any other database. Create, Retrieve, Update, and Delete
(CRUD) are some of these operations. Let's examine each procedure in
further detail:

MAM COLLEGE Page 1


Create: It involves defining the database structure and loading data into it.
Defining data structures, data types, and constraints for the data to be
stored is part of this.

Retrieve: It is used to read information from the database. Conditions are
provided when only specific data is to be retrieved.

Update: It allows us to modify existing data in the database. Users can
change values in selected rows and columns.

Delete: It is used to remove data from the database. Conditions are provided
to delete specific data.
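
As a minimal sketch, the four operations can be tried against a single in-memory SQLite database using Python's standard sqlite3 module. The student table and its rows are assumptions made for illustration, and a single local database stands in for what a DDBMS would spread over several sites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define the structure (types and constraints), then load data.
cur.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER, salary INTEGER)"
)
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "aman", 21, 20000), (2, "naman", 22, 25000), (3, "raman", 23, 35000)],
)

# Retrieve: a condition restricts which rows are read.
young = cur.execute(
    "SELECT name FROM student WHERE age < 23 ORDER BY id"
).fetchall()

# Update: change a value in the rows that match the condition.
cur.execute("UPDATE student SET salary = 27000 WHERE id = 2")

# Delete: remove only the rows that match the condition.
cur.execute("DELETE FROM student WHERE id = 3")
conn.commit()

remaining = cur.execute(
    "SELECT id, name, salary FROM student ORDER BY id"
).fetchall()
print(young)      # [('aman',), ('naman',)]
print(remaining)  # [(1, 'aman', 20000), (2, 'naman', 27000)]
```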

Distributed databases also support advanced operations such as indexing,
data replication, and backup and recovery. These operations ensure data
integrity, performance, and availability across the distributed environment.

Distributed Processing
A centralized database that is accessible from several locations over a
computer network is referred to as distributed processing. In this case, the
data is still centralized and accessible to other users via the network. The
data is not physically distributed across many sites, hence this is not a
distributed database.

Exploring Parallel DBMS Architectures


A parallel database management system (DBMS) is a specialized type of
DBMS that leverages multiple processors to execute operations in parallel.
This parallel processing capability enables faster data processing and
improved performance. The three main architectures for parallel DBMSs are:

Shared Memory Architecture: In this tightly coupled architecture, multiple
processors within a single system share system memory. It is also known as
symmetric multiprocessing (SMP). Shared memory architecture is commonly
found in personal workstations that support a few microprocessors in
parallel.

Shared Disk Architecture: This loosely coupled design is used for centralized
applications that demand good availability and performance.
Although each CPU has its own private memory, it can access all disks directly.
When each CPU is linked to the common disks, the configuration is also known as
a cluster.

Shared Nothing Architecture: In this multiple-processor architecture, each
processor is a fully functional system with its own memory and disk storage.
Each CPU functions independently and has its own resources. This architecture,
commonly referred to as massively parallel processing (MPP), offers fault
tolerance and scalability.
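
The shared-nothing idea can be sketched as follows. This is a simulation under stated assumptions, not a real MPP system: the node count, the key-modulo partitioning rule, and the sample rows are hypothetical, and Python threads in one process still share memory, so they only stand in for independent machines.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 3  # hypothetical number of shared-nothing nodes


def partition(rows, num_nodes):
    """Assign each (key, value) row to exactly one node by key modulo."""
    nodes = [[] for _ in range(num_nodes)]
    for key, value in rows:
        nodes[key % num_nodes].append((key, value))
    return nodes


def local_sum(node_rows):
    """Work done by one node, using only the rows it owns."""
    return sum(value for _, value in node_rows)


rows = [(i, i * 10) for i in range(1, 11)]  # keys 1..10, values 10..100
nodes = partition(rows, NUM_NODES)

# Each node computes a partial result; a coordinator combines them.
with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    partial_sums = list(pool.map(local_sum, nodes))

total = sum(partial_sums)
print(partial_sums)  # [180, 220, 150]
print(total)         # 550
```

Because no node needs another node's rows to compute its partial sum, adding nodes spreads both the data and the work.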

Characteristics of Distributed Databases


Logically Related Shared Data: The data in a distributed database are logically
connected. This enables complicated data analysis and processing since
many database pieces include connected information.

Data Fragmentation: The process of dividing the database into multiple smaller
parts or sub-tables is called fragmentation. The smaller parts or sub-tables
are called fragments and are stored at different locations. Data
fragmentation should be done in such a way that the original
parent database can be reconstructed from the fragments.

Duplicate Fragments: In specific circumstances, certain fragments of the database
may be replicated at other sites. This redundancy enables
fault tolerance and data availability: the data may still be accessible from
another location if one site goes down.

Site Allocation: Within the distributed system, portions of the database are
assigned to certain sites. This allocation is dependent on variables including
network topology, performance requirements, and data proximity.

Centralized DBMS Control: Although each site in a distributed database can
independently handle user requests, the distributed DBMS has overall control
and administration of the data. This guarantees uniformity and coordination
among the dispersed locations.



Data Fragmentation, Replication, and Allocation Techniques for
Distributed Database

Data Fragmentation
The process of dividing the database into multiple smaller parts or
sub-tables is called fragmentation. The smaller parts or sub-tables are
called fragments and are stored at different locations. Data fragmentation
should be done in such a way that the original parent database can be
reconstructed from the fragments. The reconstruction can be done using
UNION or JOIN operations.

Database fragmentation is of three types: horizontal fragmentation, vertical
fragmentation, and mixed or hybrid fragmentation.

Horizontal Fragmentation

It divides a table horizontally into groups of rows to create multiple
fragments or subsets of the table. These fragments can then be assigned to
different sites in the database. Reconstruction is done using the UNION
operation. In relational algebra, a fragment is represented as σ_p(T) for a
given table T and selection condition p.

Example

In this example, we are going to see how the horizontal fragmentation looks
in a table.

Input:

STUDENT

id  name   age  salary
1   aman   21   20000
2   naman  22   25000
3   raman  23   35000
4   sonam  24   36000

Example
SELECT * FROM student WHERE salary < 35000;
SELECT * FROM student WHERE salary >= 35000;

Output
id  name   age  salary
1   aman   21   20000
2   naman  22   25000

id  name   age  salary
3   raman  23   35000
4   sonam  24   36000
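
The fragmentation and its reconstruction can be checked with a short sketch using Python's standard sqlite3 module. The fragment table names student_low and student_high are hypothetical, both fragments live in one in-memory database rather than at separate sites, and the second fragment is assumed to use salary >= 35000 so that the row with salary 35000 is not lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "aman", 21, 20000),
        (2, "naman", 22, 25000),
        (3, "raman", 23, 35000),
        (4, "sonam", 24, 36000),
    ],
)

# Each fragment is a selection over the rows of the parent table.
conn.execute(
    "CREATE TABLE student_low AS SELECT * FROM student WHERE salary < 35000"
)
conn.execute(
    "CREATE TABLE student_high AS SELECT * FROM student WHERE salary >= 35000"
)

# Reconstruction: the UNION of the fragments gives back the parent table.
rebuilt = conn.execute(
    "SELECT * FROM student_low UNION SELECT * FROM student_high ORDER BY 1"
).fetchall()
original = conn.execute("SELECT * FROM student ORDER BY 1").fetchall()
print(rebuilt == original)  # True
```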

There are three types of horizontal fragmentation: primary, derived, and
complete horizontal fragmentation.

A: Primary Horizontal Fragmentation: It is the process of segmenting a single table
in a row-wise manner using a set of conditions.

Example

This example shows how the Select statement is used with a condition to
provide output.

SELECT * FROM student WHERE salary < 30000;

Output
id name age salary
1 aman 21 20000
2 naman 22 25000

B: Derived Horizontal Fragmentation: Fragmentation that is derived from the
fragmentation of a primary relation.

Example

This example shows how the Select statement is used with the where clause
to provide output.

SELECT * FROM student WHERE age=21 AND salary<30000;

Output
id  name  age  salary
1   aman  21   20000

C: Complete horizontal fragmentation: It produces a set of horizontal fragments such
that every row of the table belongs to at least one fragment.
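
Whether a set of fragment conditions is complete can be checked mechanically. The sketch below is an illustration with hypothetical helper and variable names. It also reports disjointness (no row in more than one fragment), a property the notes do not discuss but which is commonly checked alongside completeness.

```python
def check_fragmentation(rows, predicates):
    """Return (complete, disjoint) for row dicts and fragment predicates."""
    # For every row, count how many fragment conditions it satisfies.
    counts = [sum(1 for p in predicates if p(row)) for row in rows]
    complete = all(c >= 1 for c in counts)  # every row is in some fragment
    disjoint = all(c <= 1 for c in counts)  # no row is in two fragments
    return complete, disjoint


students = [
    {"id": 1, "name": "aman", "age": 21, "salary": 20000},
    {"id": 2, "name": "naman", "age": 22, "salary": 25000},
    {"id": 3, "name": "raman", "age": 23, "salary": 35000},
    {"id": 4, "name": "sonam", "age": 24, "salary": 36000},
]

# salary < 35000 and salary > 35000 leave out the row with exactly 35000.
gapped = [lambda r: r["salary"] < 35000, lambda r: r["salary"] > 35000]
covering = [lambda r: r["salary"] < 35000, lambda r: r["salary"] >= 35000]

print(check_fragmentation(students, gapped))    # (False, True)
print(check_fragmentation(students, covering))  # (True, True)
```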

Vertical Fragmentation

It divides a table vertically into groups of columns to create multiple
fragments or subsets of the table. These fragments can then be assigned to
different sites in the database. Each fragment should keep the key column so
that reconstruction can be done using a full outer join on that key.

Example

This example shows how the SELECT statement is used to do the
fragmentation and to provide the output.

Input table:

STUDENT

id  name   age  salary
1   aman   21   20000
2   naman  22   25000
3   raman  23   35000
4   sonam  24   36000

Example
SELECT id, name FROM student; -- fragment 1
SELECT id, age FROM student;  -- fragment 2

Output
id  name
1   aman
2   naman
3   raman
4   sonam

id  age
1   21
2   22
3   23
4   24

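
A minimal sketch of vertical fragmentation and its reconstruction, using Python's standard sqlite3 module, is given here. The fragment names student_name and student_age are hypothetical, and both fragments are assumed to carry the id column. Because every id then appears in both fragments, a plain inner join returns the same rows a full outer join would (SQLite only supports FULL OUTER JOIN in newer versions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "aman", 21, 20000),
        (2, "naman", 22, 25000),
        (3, "raman", 23, 35000),
        (4, "sonam", 24, 36000),
    ],
)

# Each fragment is a projection that keeps the key column id.
conn.execute("CREATE TABLE student_name AS SELECT id, name FROM student")
conn.execute("CREATE TABLE student_age AS SELECT id, age FROM student")

# Reconstruction: join the fragments on the shared key.
rebuilt = conn.execute(
    "SELECT n.id, n.name, a.age "
    "FROM student_name AS n JOIN student_age AS a ON a.id = n.id "
    "ORDER BY n.id"
).fetchall()
original = conn.execute(
    "SELECT id, name, age FROM student ORDER BY id"
).fetchall()
print(rebuilt == original)  # True
```

Without the shared key, there would be no reliable way to tell which name belongs to which age.
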
Mixed or Hybrid Fragmentation

It is done by performing both horizontal and vertical partitioning together. Each
fragment is a subset of both the rows and the columns of a relation.

Example

This example shows how the Select statement is used with the where clause
to provide the output.

SELECT name, age FROM student WHERE age = 22;

Output
name age
naman 22

Data Replication
Data replication means that replicas are made, i.e. data is copied to multiple
locations to improve its availability. Replication mechanisms are also used to
remove inconsistency between copies of the same data, so that users can do
their tasks without interrupting the work of other users.

Types of data replication :

Transactional Replication

It first makes a full copy of the database and then sends changes as the data
changes. Transactional consistency is guaranteed because changes are applied
at the subscriber in the same order as at the publisher. It is used in
server-to-server environments to replicate changes in the database
consistently and accurately.

Snapshot Replication

It is the simplest type, distributing data exactly as it appears at a
particular moment, regardless of any later updates. It copies a 'snapshot'
of the data. It is useful when the database changes infrequently. It is slower
than transactional replication because the data is sent in bulk from one end to
the other. It is generally used in cases where subscribers do not need the
latest data and are operating in read-only mode.

Merge Replication

It combines data from several databases into a single database. It is the
most complex type of replication because both the publisher and the subscriber
can make database changes. It is used in a server-to-client environment and
has changes sent from one publisher to multiple subscribers.
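
The difference between snapshot and transactional replication can be sketched with a toy publisher and subscriber. The class and method names below are hypothetical and do not correspond to any product's API. The publisher keeps an ordered change log, a snapshot copies the whole data set at one moment, and transactional replication replays the log entries made since then in their original order.

```python
class Publisher:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Subscriber:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already reflected here

    def apply_snapshot(self, publisher):
        """Snapshot: copy everything in bulk, as it is right now."""
        self.data = dict(publisher.data)
        self.applied = len(publisher.log)

    def apply_changes(self, publisher):
        """Transactional: replay only the new changes, in publisher order."""
        for key, value in publisher.log[self.applied:]:
            self.data[key] = value
        self.applied = len(publisher.log)


pub = Publisher()
pub.write("a", 1)
pub.write("b", 2)

snapshot_sub = Subscriber()
snapshot_sub.apply_snapshot(pub)

transactional_sub = Subscriber()
transactional_sub.apply_snapshot(pub)  # initial full copy

pub.write("a", 3)                      # a later change at the publisher
transactional_sub.apply_changes(pub)

print(snapshot_sub.data)       # {'a': 1, 'b': 2}
print(transactional_sub.data)  # {'a': 3, 'b': 2}
```

The snapshot subscriber keeps the old value until the next full copy, which is acceptable only when slightly stale, read-only data is good enough.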

Data Allocation
It is the process of deciding where exactly the data is to be stored in the
distributed database. It also involves deciding which type of data has to be
stored at which location. The three main types of data allocation are
centralized, partitioned, and replicated.

Centralized: The entire database is stored at a single site. No data distribution
occurs.

Partitioned: The database is divided into fragments, which are
stored at several sites.

Replicated: Copies of the database are stored at different locations from which
the data can be accessed.

Types of Distributed Database Systems

There are two kinds of distributed database, viz. homogeneous and
heterogeneous. Databases in which every site has the same underlying hardware
and runs the same operating system and application procedures are known
as homogeneous DDBs. In a heterogeneous DDB, by contrast, the operating
systems, underlying hardware, and application procedures can differ from
site to site.



Types of Distributed DBMS
There are six types of DDBMS, discussed below:
• Homogeneous: In this type of DDBMS, all the participating sites have
exactly the same DBMS software and architecture, which makes the
underlying systems consistent across all sites. It provides simplified data
sharing and integration.
• Heterogeneous: In this type of DDBMS, the participating sites can
use different DBMS software, data models, or architectures. This model
faces greater integration problems, since the data representation and
query language can differ from site to site.
• Federated: Here, the local databases are maintained by individual sites
or federations. These local databases are connected via a middleware
system that allows users to access and query data from multiple
distributed databases. The federation combines different local databases
but maintains autonomy at the local level.
• Replicated: In this type, the DDBMS maintains multiple copies of the
same data fragment across different sites. It is used to ensure data
availability, fault tolerance, and seamless performance. Users can read
data from the nearest replica even if the primary copy is down for some
reason. However, data changes must be carefully synchronized across the
replicas.
• Partitioned: In a partitioned DDBMS, the overall database is divided into
distinct partitions, and each partition is assigned to a specific site.
Partitioning can be done on specific conditions such as date range,
geographic location, or functional module. Each site controls its own
partition, and data from other partitions is accessed through
communication and coordination between sites.
• Hybrid: It is a combination of two or more of the other five types
of DDBMS discussed above. The combination is done to
address the specific requirements and challenges of complex distributed
environments. A hybrid DDBMS can provide more optimized performance and
high scalability.



Note: Heterogeneous DDBMSs have local users, while homogeneous DDBMSs do not have local
users.

Distributed Data Storage


In this section we will talk about how data is stored at different sites in a distributed database
management system.

• There are two ways in which data can be stored at different sites:
1. Replication
2. Fragmentation

Replication

• As the name suggests, the system stores copies of data at different
sites. If an entire database is available at multiple sites, it is a fully
redundant database.
• The advantage of data replication is that it increases the availability of
data at different sites. As the data is available at different sites, queries
can be processed in parallel.
• However, data replication has some disadvantages as well. Data needs
to be constantly updated and synchronized with other sites; if any site
fails to do so, the database becomes inconsistent.
• Constant updating also complicates concurrency control and adds
overhead for the servers.
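
The trade-off described above can be sketched with a toy replicated store. The Site and ReplicatedStore classes are hypothetical, and the write rule shown (update every replica or refuse the write) is only one simple way of keeping copies consistent, chosen here for illustration.

```python
class Site:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True  # whether the site is currently reachable


class ReplicatedStore:
    def __init__(self, sites):
        self.sites = sites

    def write(self, key, value):
        """Refuse the write unless every replica can be updated."""
        if not all(site.up for site in self.sites):
            raise RuntimeError("write rejected: a replica is unavailable")
        for site in self.sites:
            site.data[key] = value

    def read(self, key):
        """Read from the first replica that is available."""
        for site in self.sites:
            if site.up:
                return site.data[key]
        raise RuntimeError("no replica available")


sites = [Site("A"), Site("B"), Site("C")]
store = ReplicatedStore(sites)
store.write("x", 1)

sites[0].up = False      # one site goes down
value = store.read("x")  # reads are still served by another replica

try:
    store.write("x", 2)
    write_rejected = False
except RuntimeError:
    write_rejected = True  # keeping all copies in step blocks the write

print(value, write_rejected)  # 1 True
```

Reads stay available when a site fails, but every write now costs one update per replica and depends on all of them, which is the synchronization overhead noted in the list.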



Three-Tier Client Server Architecture in
Distributed System
The most common type of multi-tier architecture in distributed systems is the
three-tier client-server architecture. In this architecture, the entire application
is organized into three computing tiers:
• Presentation tier
• Application tier
• Data tier
The major benefit of the three tiers in client-server architecture is that they
are developed and maintained independently, so a modification to one tier does
not impact the others. This also allows better performance and scalability,
since more servers can be added as demand increases.
What is Three-Tier Architecture?
Three-tier architecture is a well-established software application design
pattern that organizes the application into three logical and physical
computing tiers, as follows:
• Presentation Tier
• Application Tier
• Data Tier
The Three Tiers In Detail
Presentation Tier
It is the user interface and the topmost tier in the architecture. Its purpose is to
take requests from the client and display information to the client. It
communicates with the other tiers and typically presents its output in a web
browser. Web-based presentation tiers are developed using languages such as
HTML, CSS, and JavaScript.
Application Tier
It is the middle tier of the architecture, also known as the logic tier, because the
information or request gathered through the presentation tier is processed in detail here.
It also interacts with the server that stores the data. It processes the client's request,
formats it, and sends it back to the client. It is developed using languages such as
Python, Java, and PHP.
Data Tier
It is the last tier of the architecture, also known as the database tier. It is used to store
the processed information so that it can be retrieved later when required. It consists
of database servers such as Oracle, MySQL, and DB2. Communication between the
presentation tier and the data tier goes through the middle tier, i.e. the application tier.
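
The division of responsibilities can be sketched in one small Python program. This is an illustration only: the class names, the product table, and its rows are hypothetical, and in a real deployment each tier would typically run as a separate process or on a separate machine rather than in one script.

```python
import sqlite3


class DataTier:
    """Data tier: stores and retrieves data, nothing else."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE product (name TEXT PRIMARY KEY, stock INTEGER)"
        )
        self.conn.executemany(
            "INSERT INTO product VALUES (?, ?)", [("pen", 5), ("book", 0)]
        )

    def stock_of(self, name):
        row = self.conn.execute(
            "SELECT stock FROM product WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]


class ApplicationTier:
    """Application tier: validates input and applies business logic."""

    def __init__(self, data_tier):
        self.data_tier = data_tier

    def availability(self, name):
        # Normalize the input before it reaches the data tier.
        stock = self.data_tier.stock_of(name.strip().lower())
        if stock is None:
            return "unknown product"
        return "in stock" if stock > 0 else "out of stock"


def render(app, name):
    """Presentation tier: formats the answer; it never touches the database."""
    label = name.strip()
    return f"{label}: {app.availability(label)}"


app = ApplicationTier(DataTier())
print(render(app, "Pen"))   # Pen: in stock
print(render(app, "book"))  # book: out of stock
print(render(app, "lamp"))  # lamp: unknown product
```

The render function only knows about the application tier, so the database behind DataTier could be swapped for another one without changing the presentation code.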

Tier vs. layer


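Tier                                     | Layer
-----------------------------------------+-----------------------------------------
Tiers refer to the physical separation   | Layers refer to the logical separation
of components.                           | of an application.
-----------------------------------------+-----------------------------------------
Tiers are physically separated and run   | Layers are logically separated but run
on different machines or servers.        | on the same server or machine.
-----------------------------------------+-----------------------------------------
Scalability of an application is very    | Scalability of an application is
high.                                    | medium.
-----------------------------------------+-----------------------------------------
Common tiers in a multi-tier             | Each layer focuses on specific
architecture include the presentation    | responsibilities, such as
tier (user interface), application tier  | presentation, business logic, and
(business logic), and data tier          | data access, within a single tier.
(database).                              |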

Three-Tier Application In Web Development


A web application has the same three tiers, only under different names.
1. Web server: The web server is the presentation tier, or UI (user interface)
tier, of a three-tier architecture. Its content can be static or dynamic
depending on the requirement, such as an e-commerce site where users can add
products to their shopping cart, enter payment details, or create an account.
2. Application server: The application server is the middle tier, which holds the
business logic that processes user inputs. For example, it queries the inventory
database to return product availability, or adds details to a customer's
profile.
3. Database server: This is the back-end tier of a web application. It holds
all the data, usually in a database such as MySQL, Oracle, DB2, or
PostgreSQL.
Other Multi-Tier Architectures
Three-tier architecture is widely used for application development,
but there are other architectures, as described below.
Two-Tier Architecture
This is a fundamental software architecture consisting of two layers or
tiers, a client and a server. Each tier has its own responsibility,
and the two depend on each other.
Client Tier
The client tier is the topmost layer: the user interface and interaction
part of the application. It takes the form of a web browser, desktop application, or
mobile app through which the user interacts with the application. It is
responsible for presenting the data and processing the user's input.
Server Tier
The server tier is the bottom layer, which contains the logic required to
handle data processing and data management.
N-Tier Architecture
N-tier architecture, also known as multi-tier architecture, divides the
application into a variable number of tiers based on its complexity and
requirements. The following are some of the tiers included in the architecture:
• Presentation Tier (Client Tier)
• Application Tier (Middle Tier or Business Logic Tier)
• Data Tier (Data Storage Tier or Database Tier)
• Services Tier (Business Services or Application Services)
• Integration Tier (Integration Services)
Benefits of Three-Tier Architecture
• Logical separation is maintained between the presentation tier, application
tier, and database tier.
• Performance is enhanced, as the work is divided across multiple machines
and each tier is independent of the other tiers.
• Increasing demand can be handled by adding more servers, as tiers can be
scaled independently.
• Developers are free to update the technology of one tier, as it
would not impact the other tiers.
• Reliability is improved by the independence of the tiers, as issues in one
tier would not affect the others.
• Programmers can easily maintain the database, presentation code, and
business/application logic separately. If a change is required in the
business/application logic, it does not impact the presentation code
or the rest of the codebase.
• Load is balanced, as the presentation tier's work is separated from the
server of the data tier.
• Security is improved, as the client cannot communicate directly with the
database tier. Moreover, the data is validated at the application tier before
being passed to the database tier.
• The integrity of data is maintained.
• The application can be deployed against a variety of databases rather than
being restricted to one particular technology.
Disadvantages of Three-Tier Architecture
• The presentation tier cannot communicate directly with the database tier.
• Complexity increases as the number of tiers in the architecture grows.
• More resources are needed, as the codebase, presentation code, and
application code must be maintained separately.
