
What is a Centralized Database?

In a centralized database, all the data of an organization is stored in a single place such as a
mainframe computer or a server. Users in remote locations access the data over a
Wide Area Network (WAN) using the application programs provided for that purpose. The
centralized database (the mainframe or the server) must satisfy every request
coming to the system, so it can easily become a bottleneck. But since all the data
resides in a single place, it is easier to maintain and back up. Furthermore, it is easier to
maintain data integrity, because once data is stored in a centralized database, outdated
copies are no longer available in other places.

Advantages:

Centralized databases have a number of advantages over other types of
databases. Some of them are listed below:

• Data integrity is maximized and data redundancy is minimized, as storing all the data in a
single place means that a given set of data has only one primary record.
This helps keep the data as accurate and consistent as possible and
enhances data reliability.
• Generally stronger data security, as the single storage location means there is only one
place from which the database can be attacked and data can be stolen
or tampered with.
• Better data preservation than other types of databases, because a fault-tolerant
setup is often included.
• Easier for the end-user to use, owing to the simplicity of a single database design.
• Generally easier data portability and database administration.
• More cost-effective than other types of database systems, as labor, power supply and
maintenance costs are all minimized.
• Data kept in the same location is easier to change, reorganize, mirror, or
analyze.
• All the information can be accessed at the same time from the same location.
• Updates to any given set of data are immediately visible to every end-user.

Disadvantages:
Centralized databases also have a number of limitations, such as those described
below:

• Centralized databases are highly dependent on network connectivity. The slower the
network connection, the longer database access takes.
• Bottlenecks can occur as a result of high traffic.
• Concurrent access to the same set of data is limited, as there is only one
copy of it, maintained in a single location. This can lead to major decreases in
the general efficiency of the system.
• If there is no fault-tolerant setup and a hardware failure occurs, all the data within the
database can be lost.
• Since there is minimal to no data redundancy, a set of data that is unexpectedly lost is
very hard to recover; in most cases this would have to be done manually.

What is a Distributed Database?

In a distributed database, the data is stored in storage devices that are located in different
physical locations. They are not attached to a common CPU, but the database is controlled
by a central DBMS. Users access the data in a distributed database over the WAN.
A distributed database is kept up to date by two processes: replication and duplication.
The replication process identifies changes in the distributed database and applies those
changes so that all the distributed databases look the same. Depending on the
number of distributed databases, this process can become very complex and time
consuming. The duplication process identifies one database as a master database and
duplicates that database. This process is not as complicated as the replication process but
still ensures that all the distributed databases have the same data.
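The difference between the two processes can be illustrated with a minimal Python sketch. It assumes each site is modelled as an in-memory dictionary and that changes are recorded as simple `(operation, key, value)` tuples; the function names `duplicate` and `replicate`, the site names, and the account keys are all hypothetical and chosen only for illustration.

```python
from copy import deepcopy


def duplicate(master, sites):
    """Duplication: overwrite every site with a full copy of the master."""
    for name in list(sites):
        sites[name] = deepcopy(master)


def replicate(change_log, sites):
    """Replication: apply each identified change to every site."""
    for op, key, value in change_log:
        for data in sites.values():
            if op == "put":
                data[key] = value
            elif op == "delete":
                data.pop(key, None)


# Hypothetical master database and two remote sites.
master = {"acct-1": 100, "acct-2": 250}
sites = {"london": {}, "tokyo": {}}

# Duplication copies the whole master to each site.
duplicate(master, sites)

# Replication only ships the changes that were identified.
change_log = [
    ("put", "acct-1", 80),
    ("put", "acct-3", 40),
    ("delete", "acct-2", None),
]
replicate(change_log, sites)
```

Duplication is simple because it ignores what each site already holds, while replication has to track and apply individual changes, which is where the extra complexity described above comes from.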

A distributed database is a database that is under the control of a central database
management system (DBMS) in which storage devices are not all attached to a common
CPU. It may be stored in multiple computers located in the same physical location, or may
be dispersed over a network of interconnected computers.

Collections of data (e.g. in a database) can be distributed across multiple physical locations.
A distributed database can reside on network servers on the Internet, on corporate
intranets or extranets, or on other company networks. The replication and distribution of
databases improve database performance at end-user worksites.

To ensure that the distributed databases are up to date and current, there are two
processes: replication and duplication. Replication uses specialized software that
looks for changes in the distributed database. Once the changes have been identified, the
replication process makes all the databases look the same. Depending on the size and number
of the distributed databases, replication can be very complex and can require a lot of time
and computer resources. Duplication, on the other hand, is not as complicated. It identifies
one database as a master and then duplicates that database. Duplication is normally done at
a set time after hours, to ensure that each distributed location has the same data. In the
duplication process, changes are allowed to the master database only, so that local data
is not overwritten. Both processes can keep the data current in all distributed
locations.

Besides replication and fragmentation, there are many other
distributed database design technologies, for example local autonomy and synchronous and
asynchronous distributed database technologies. Which of these are implemented
depends on the needs of the business and the sensitivity/confidentiality of the data
to be stored in the database, and hence on the price the business is willing to pay to
ensure data security, consistency and integrity.

Basic architecture:

A database user accesses the distributed database through:

Local applications: applications which do not require data from other sites.

Global applications: applications which do require data from other sites.

The sites of a distributed database do not share main memory or disks.
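As a rough illustration of the local/global distinction, the following Python sketch classifies an application by the sites that hold the data it needs. The `placement` mapping, the data item names, and the `classify` function are hypothetical; a real distributed DBMS would consult its own catalog.

```python
def classify(home_site, needed_items, placement):
    """Return 'local' if every needed item is stored at home_site, else 'global'."""
    sites_needed = {placement[item] for item in needed_items}
    return "local" if sites_needed <= {home_site} else "global"


# Hypothetical catalog: which site stores which data item.
placement = {
    "accounts_paris": "paris",
    "accounts_rome": "rome",
}

# Needs only data stored at its own site.
local_kind = classify("paris", ["accounts_paris"], placement)

# Needs data from another site as well.
global_kind = classify("paris", ["accounts_paris", "accounts_rome"], placement)
```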

Advantages of Data Distribution

The primary advantage of distributed database systems is the ability to share and access
data in a reliable and efficient manner.

Data sharing and Distributed Control:

If a number of different sites are connected to each other, then a user at one site may be
able to access data that is available at another site. For example, in the distributed banking
system, it is possible for a user in one branch to access data in another branch. Without this
capability, a user wishing to transfer funds from one branch to another would have to resort
to some external mechanism for such a transfer. This external mechanism would, in effect,
be a single centralized database.

The primary advantage of accomplishing data sharing by means of data distribution is that
each site is able to retain a degree of control over data stored locally. In a centralized
system, the database administrator of the central site controls the database. In a distributed
system, there is a global database administrator responsible for the entire system. A part of
these responsibilities is delegated to the local database administrator for each site.
Depending upon the design of the distributed database system, each local administrator
may have a different degree of autonomy, which is often a major advantage of distributed
databases.

Reliability and Availability:

If one site fails in a distributed system, the remaining sites may be able to continue
operating. In particular, if data is replicated at several sites, a transaction needing a
particular data item may find it at any of those sites. Thus, the failure of a site does not
necessarily imply the shutdown of the system.

The failure of one site must be detected by the system, and appropriate action may be
needed to recover from the failure. The system must no longer use the service of the failed
site. Finally, when the failed site recovers or is repaired, mechanisms must be available to
integrate it smoothly back into the system.
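The three steps described above (detect the failure, stop using the failed site, reintegrate it after recovery) can be sketched in Python as follows. This is a simplified model under stated assumptions: every site holds a full replica, failure detection is reduced to an explicit `fail` call, and recovery copies the state of one live site. The `Site` and `Cluster` classes and their methods are hypothetical names.

```python
class Site:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


class Cluster:
    def __init__(self, names):
        self.sites = {name: Site(name) for name in names}

    def live(self):
        """Sites the system is still allowed to use."""
        return [s for s in self.sites.values() if s.up]

    def write(self, key, value):
        live = self.live()
        if not live:
            raise RuntimeError("no site available")
        for site in live:
            site.data[key] = value

    def read(self, key):
        # Any live replica that holds the item can serve the request.
        for site in self.live():
            if key in site.data:
                return site.data[key]
        raise KeyError(key)

    def fail(self, name):
        # Stand-in for failure detection: the site is no longer used.
        self.sites[name].up = False

    def recover(self, name):
        # Reintegration: catch up from a live replica before rejoining.
        site = self.sites[name]
        donors = self.live()
        if donors:
            site.data = dict(donors[0].data)
        site.up = True


cluster = Cluster(["a", "b", "c"])
cluster.write("x", 1)
cluster.fail("a")            # site a goes down
cluster.write("x", 2)        # the system keeps operating without it
value_during_failure = cluster.read("x")
cluster.recover("a")         # site a rejoins with current data
```

Copying a whole replica on recovery is the simplest possible choice; a real system would more likely ship only the updates the failed site missed.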

Although recovery from failure is more complex in distributed systems than in a centralized
system, the ability of most of the system to continue operating despite the failure of one
site results in increased availability. Availability is crucial for database systems used for
real-time applications. Loss of access to data, for example, in an airline may result in the
loss of potential ticket buyers to competitors.

Speedup Query Processing:

If a query involves data at several sites, it may be possible to
split the query into subqueries that can be executed in parallel by several sites. Such parallel
computation allows for faster processing of a user’s query. In those cases in which data is
replicated, queries may be directed by the system to the least heavily loaded sites.
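Both ideas can be sketched in a few lines of Python. The example assumes a query that sums a value over rows stored at three sites, with each site simulated by a list in the hypothetical `SITE_ROWS` mapping and the subqueries run on threads from the standard library's `ThreadPoolExecutor`. The `least_loaded` helper and the load figures are likewise illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: the rows each site stores locally.
SITE_ROWS = {
    "north": [120, 80, 45],
    "south": [300, 10],
    "west": [75, 75, 75],
}


def subquery_total(site):
    """Subquery executed at one site: sum only the rows stored there."""
    return sum(SITE_ROWS[site])


def parallel_total(sites):
    """Split the query into one subquery per site and combine the results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        return sum(pool.map(subquery_total, sites))


def least_loaded(replica_load):
    """Pick the replica site with the fewest active requests."""
    return min(replica_load, key=replica_load.get)


total = parallel_total(list(SITE_ROWS))
target = least_loaded({"north": 7, "south": 2, "west": 5})
```

In a real deployment the subqueries would run on separate machines, so the speedup comes from each site scanning only its own data rather than from local threads.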

What is the difference between a Distributed Database and a Centralized Database?

While a centralized database keeps its data in storage devices that are in a single location
connected to a single CPU, a distributed database system keeps its data in storage devices
that may be located in different geographical locations and are managed using a central
DBMS. A centralized database is easier to maintain and keep updated since all the data is
stored in a single location. Furthermore, it is easier to maintain data integrity and to avoid
the need for data duplication. However, all the requests to access data are processed
by a single entity such as a single mainframe, which can therefore easily become a
bottleneck. With distributed databases, this bottleneck can be avoided, since the
databases work in parallel and the load is balanced between several servers. On the other
hand, keeping the data up to date in a distributed database system requires additional work
and additional software, which increases complexity and the cost of maintenance.
Furthermore, designing a distributed database is more complex than designing a
centralized one.

The underlying idea of centralised databases is that they should be able, by themselves, to
receive, maintain, and complete every single request that the main system must perform.
There is only one database file, kept at a single location on a given network.

A distributed database, however, is a database in which the information is stored across
multiple physical locations. Distributed databases are divided into two
groups: homogeneous and heterogeneous. A distributed database relies on replication and
duplication within its multiple sub-databases in order to keep its records up to date. It is
composed of multiple database files, all controlled by a central DBMS.
The main differences between centralised and distributed databases arise from their
respective basic characteristics. Differences include but are not limited to:

• Centralized databases store data on devices attached to a single CPU, bound to a single
physical/geographical location. Distributed databases, however, rely on a central DBMS
which manages all its different storage devices remotely, as it is not necessary for them
to be kept in the same physical and/or geographical location.
• As outlined above, centralised databases are easier to keep up to date than
distributed databases. This is because distributed databases require additional (often
manual) work to keep the stored data current and to avoid data redundancy, as well as
to improve the overall performance.
• If data is lost in a centralised system, recovering it is much harder. If, however,
data is lost in a distributed system that replicates its data, recovering it is much easier,
because a copy of the data exists at a different location of the database.
• Designing a centralised database is generally much less complex than designing a
distributed database, as distributed database systems are based on a hierarchical
structure.

Centralized DBMS | Distributed DBMS
---------------- | ----------------
The database is stored at only one site. | The database is stored at different sites and is accessed with the help of a network.
The data is stored at a single computer site, which can be used by multiple users. | The database and DBMS software are distributed over many sites, connected by a computer network.
The database is maintained at one site. | The database is maintained at a number of different sites.
If the centralized system fails, the entire system is halted. | If one site fails, the system continues to work with the other sites.
It is less reliable. | It is more reliable.
