Distributed Databases
Section Content
• 2.1 Concepts
• 2.2 Advantages
• 2.3 Classification of Distributed Systems
• 2.4 Database Design
• 2.5 Distributed Query Processing
[Figure: several client interfaces connected over a network to multiple SQL Server sites, each reached through a DBMS interface.]
• Each site may hold copies of data which ‘belong’ to other sites; updates
to these copies must be synchronised.
[Figure: Oracle sites and a VMS mainframe DBMS connected over a network, each reached through a DBMS interface.]
2.2 Advantages
• Improved Performance
+ A distributed DBMS fragments the database so that data is stored at the
site where it is needed most.
+ Fragmentation also means that each local database is smaller: instead of a
single CPU processing one large database, multiple CPUs process many smaller
databases.
+ Inter-query parallelism can be achieved as multiple queries can be run in
parallel at separate sites; intra-query parallelism can be achieved as a single
query can be split into subqueries that run in parallel on different fragments.
• Easier Expansion
+ Expansion is easier as it may involve no more than adding a new site.
+ Expansion can be planned to suit the current distribution scheme.
2.3 Classification of Distributed Systems
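[Figure: classification of distributed systems along the axes of distribution, autonomy and heterogeneity. Labelled classes: logically integrated single-site homogeneous DBMSs, distributed heterogeneous DBMSs, heterogeneous integrated DBMSs, federated DBMSs, and multiple DBMSs.]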
2.4 Database Design
• The first property looks at how data is shared between users; the
second looks at issues such as static and dynamic access patterns;
and the third looks at how much information is available regarding
access patterns.
• Why fragment?
• How should fragmentation be performed? (horizontally or vertically)
• How much should be fragmented? An important issue as it affects
the performance of query execution; aim to find a balance
between large and small units.
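The difference between the two fragmentation styles can be sketched in a few lines of Python. This is a minimal illustration, not a DBMS implementation: the rows, the attribute names (`emp_id`, `name`, `dept_no`, `salary`) and the helper functions are all hypothetical. Horizontal fragmentation keeps whole tuples selected by a predicate; vertical fragmentation keeps a subset of attributes and repeats the key so the tuples can be rebuilt.

```python
# Hypothetical EMPLOYEE rows; attribute names and values are illustrative only.
employees = [
    {"emp_id": 1, "name": "Ann", "dept_no": 10, "salary": 30000},
    {"emp_id": 2, "name": "Bob", "dept_no": 20, "salary": 28000},
    {"emp_id": 3, "name": "Cat", "dept_no": 10, "salary": 35000},
]


def horizontal_fragment(rows, predicate):
    """Keep whole tuples that satisfy the predicate (a selection)."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, key, attributes):
    """Keep a subset of attributes (a projection). The key is repeated in
    every fragment so the original tuples can be rebuilt by a join."""
    return [{a: r[a] for a in (key, *attributes)} for r in rows]


# Horizontal: one fragment per group of departments.
dept10 = horizontal_fragment(employees, lambda r: r["dept_no"] == 10)
others = horizontal_fragment(employees, lambda r: r["dept_no"] != 10)

# Vertical: personal details in one fragment, pay details in another.
names = vertical_fragment(employees, "emp_id", ["name", "dept_no"])
pay = vertical_fragment(employees, "emp_id", ["salary"])

# Reconstruction: union of the horizontal fragments,
# join of the vertical fragments on the shared key.
rebuilt_h = sorted(dept10 + others, key=lambda r: r["emp_id"])
pay_by_id = {r["emp_id"]: r for r in pay}
rebuilt_v = [{**n, **pay_by_id[n["emp_id"]]} for n in names]
```

Both reconstructions give back the original relation, which is the basic correctness requirement for any fragmentation scheme.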
2.5 Distributed Query Processing
• Properties
+ Size of EMPLOYEE is 10,000 tuples * 100 bytes = 1,000,000 bytes
+ Size of DEPARTMENT is 100 tuples * 35 bytes = 3,500 bytes
+ Join condition: EMPLOYEE.DeptNo = DEPARTMENT.Dnumber
• Query: for each employee, retrieve the employee name and the department
in which that employee works.
• The result of the query will include 10,000 tuples (assuming that
every employee has a valid department). Each tuple in the result
requires 40 bytes, so the result occupies 10,000 * 40 = 400,000 bytes.
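The sizes above are enough to compare data-transfer costs for this query. The sketch below assumes EMPLOYEE is stored at site 1, DEPARTMENT at site 2, and the result is needed at site 3. The three candidate strategies are common textbook options and are an assumption here, not something stated in the text; only bytes shipped across the network are counted.

```python
# Sizes taken from the figures above.
EMPLOYEE_BYTES = 10_000 * 100    # 1,000,000 bytes, assumed stored at site 1
DEPARTMENT_BYTES = 100 * 35      # 3,500 bytes, assumed stored at site 2
RESULT_BYTES = 10_000 * 40       # 400,000 bytes, needed at site 3

# Bytes shipped across the network under each (assumed) strategy.
strategies = {
    "ship both relations to site 3": EMPLOYEE_BYTES + DEPARTMENT_BYTES,
    "ship EMPLOYEE to site 2, then result to site 3": EMPLOYEE_BYTES + RESULT_BYTES,
    "ship DEPARTMENT to site 1, then result to site 3": DEPARTMENT_BYTES + RESULT_BYTES,
}

for name, cost in strategies.items():
    print(f"{name}: {cost:,} bytes")

# The cheapest strategy is the one that moves the fewest bytes.
best = min(strategies, key=strategies.get)
print("cheapest:", best)
```

Under these assumptions, moving the small DEPARTMENT relation to the site holding EMPLOYEE costs 403,500 bytes, against 1,003,500 and 1,400,000 bytes for the other two options.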
[Figure: Employee is stored at Site 1 and Dept, D = {Dname, Dnumber, MgrRSI, MRGStartdate}, at Site 2; Site 3 holds neither relation.]
• For each department, retrieve the department name, and the name
of the department manager.
• Assume the query is again submitted at site 3, and that the result
contains 100 tuples of 40 bytes each (4,000 bytes in total).
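Because the result here is much smaller, the relative cost of the transfer options changes. The hypothetical helper below takes the number of result tuples and returns the cost of each option. It assumes EMPLOYEE (10,000 tuples of 100 bytes) is at site 1, DEPARTMENT (100 tuples of 35 bytes) is at site 2, and the result is needed at site 3. The three strategies are assumed candidates, not taken from the text.

```python
def transfer_costs(result_tuples, result_tuple_bytes=40):
    """Return bytes shipped per strategy for a join whose result goes to site 3.

    Hypothetical helper: relation sizes come from the example, while the
    strategy list is an assumption made for illustration.
    """
    employee_bytes = 10_000 * 100    # EMPLOYEE, assumed at site 1
    department_bytes = 100 * 35      # DEPARTMENT, assumed at site 2
    result_bytes = result_tuples * result_tuple_bytes
    return {
        "both relations to site 3": employee_bytes + department_bytes,
        "EMPLOYEE to site 2, result to site 3": employee_bytes + result_bytes,
        "DEPARTMENT to site 1, result to site 3": department_bytes + result_bytes,
    }


costs = transfer_costs(result_tuples=100)
best = min(costs, key=costs.get)
for name, cost in costs.items():
    print(f"{name}: {cost:,} bytes")
print("cheapest:", best)
```

With a 100-tuple result, shipping DEPARTMENT to site 1 and returning the result moves only 7,500 bytes, while either option that moves EMPLOYEE costs just over 1,000,000 bytes.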