
Database Management Systems

Chapter 10
Distributed Databases

Jerry Post
Copyright © 2003
1
Distributed Databases
- Definition
- Advantages / Uses
- Problems / Complications
- Client-Server / SQL Server
- Microsoft Access

Example query combining data from three sites:

    SELECT Sales
    FROM Britain.Sales
    UNION
    SELECT Sales
    FROM France.Sales
    UNION
    SELECT Sales
    FROM Italy.Sales

[Figure: sites in Germany, Britain, France, and Italy]
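The UNION query on this slide can be tried on one machine. The sketch below is only a stand-in: three in-memory SQLite databases attached to one connection play the role of the Britain, France, and Italy sites, and the sales values are made up.

```python
import sqlite3

# Assumption: attached in-memory databases stand in for the remote sites.
conn = sqlite3.connect(":memory:")
for site in ("Britain", "France", "Italy"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {site}")
    conn.execute(f"CREATE TABLE {site}.Sales (Sales REAL)")

# Made-up sample values for each site.
conn.executemany("INSERT INTO Britain.Sales VALUES (?)", [(100.0,), (250.0,)])
conn.executemany("INSERT INTO France.Sales VALUES (?)", [(75.0,)])
conn.executemany("INSERT INTO Italy.Sales VALUES (?)", [(250.0,), (40.0,)])

query = """
    SELECT Sales FROM Britain.Sales
    UNION
    SELECT Sales FROM France.Sales
    UNION
    SELECT Sales FROM Italy.Sales
"""
all_sales = sorted(row[0] for row in conn.execute(query))
print(all_sales)
```

UNION removes duplicate rows, so the value 250.0 that appears at two sites is returned once. UNION ALL would keep both copies.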

2
Distributed Database Definition
- Multiple independent databases
  - Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
- Usually on different machines.
- Usually in different locations.
- Connected by a network.
- Might be different environments
  - Hardware
  - Operating System
  - DBMS Software

[Figure: three databases (Apollo, Zeus, Athena) at sites in England, France, and the United States]
3
Distributed Database Rules
- C.J. Date
- Rule 0: Transparency: the user should not know or care that the database is distributed.
  - Local autonomy.
  - No reliance on a central site.
  - Continuous operation.
  - Location independence.
  - Fragmentation independence (physical storage).
  - Replication independence.
  - Distributed query processing.
  - Distributed transaction management.
  - Hardware independence.
  - Operating system independence.
  - Network independence.
  - DBMS independence.
4
Distributed Features
- Each database can continue to run even if a portion of the system fails.
- Data and hardware can be moved without affecting operations or users.
- Expanding operations.
- Performance issues.
- System expansion and upgrades.
  - Add new section without affecting others.
  - Upgrade hardware, network and DBMS.
5
Advantages and Applications
- Business operations are often distributed
  - Work and data are segmented by department.
  - Work and data are segmented by geographical location.
- Improved performance
  - Most updates and queries are performed locally.
  - Maintain local control and responsibility over data.
  - Can still combine data across the system.
- Scalability and expansion
  - Add on, not replacement.

[Figure labels: local transactions; future expansion]

6
Creating a Distributed Database
- Design administration plan.
- Choose hardware and DBMS vendor, and network.
- Set up network and DBMS connections.
- Choose locations for data.
- Choose replication strategy.
- Create backup plan and strategy.
- Create local views and synonyms.
- Perform stress test: loads and failures.
7
Distributed Query Processing
- Networks are slow
  - Drives: 20 - 60 MB per sec.
  - LANs: 1 - 10 MB per sec (10 - 100 Mbps).
  - WANs: 0.01 - 5 MB per sec.
  - Faster is possible but expensive!
  - SANs: 10 - 100 MB per sec.
- Goal is to minimize transmissions.
  - Each system must be capable of evaluating queries, preferably SQL.
  - Results depend heavily on how the system joins tables.

[Figure labels: WAN 0.1 - 5 MB; disk drive 10 - 20 MB; LAN 10 - 100 MB]
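A rough sense of why transmissions dominate comes from dividing the data size by the transfer rate. The sketch below uses the rate ranges from the bullets above; the 500 MB table size is an assumption chosen only for illustration, and latency and protocol overhead are ignored.

```python
# Transfer rates in MB per second (slowest, fastest), from the slide's bullets.
RATES_MB_PER_SEC = {
    "disk drive": (20, 60),
    "LAN": (1, 10),
    "WAN": (0.01, 5),
    "SAN": (10, 100),
}

def transfer_seconds(size_mb, rate_mb_per_sec):
    """Time to move size_mb megabytes at a sustained rate, ignoring latency."""
    return size_mb / rate_mb_per_sec

# Assumption for illustration: a 500 MB table must be moved.
TABLE_MB = 500
estimates = {
    medium: (transfer_seconds(TABLE_MB, fast), transfer_seconds(TABLE_MB, slow))
    for medium, (slow, fast) in RATES_MB_PER_SEC.items()
}
for medium, (best, worst) in estimates.items():
    print(f"{medium}: {best:.0f} to {worst:.0f} seconds")
```

Under these assumptions the same table that a LAN moves in under ten minutes could occupy a slow WAN link for many hours.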
8
Distributed Query Processing
- Example
  - NY: Customers: 1 M rows
  - LA: Production: 10 M rows
  - Chicago: Sales: 20 M rows
  - Query: List customers who bought blue products on March 1
- Bad idea #1
  - Transfer all rows to Chicago
  - Then JOIN and select.
- Better idea #2 (probably)
  - Transfer blue products from LA to Chicago
- Better idea #3
  - Get sale items on March 1
  - Get blue products from LA
  - Send C# to NY

[Figure: three sites and the data passed between them]
  NY:      Customers(C#, …)        1,000,000 rows
  Chicago: Sales(S#, C#, Sdate)    20,000,000 rows
           SaleItem(S#, P#, …)     50,000,000 rows
  LA:      Products(P#, Color, …)  10,000,000 rows
  Chicago to LA: P# sold on March 1
  LA to Chicago: Blue P# sold on March 1
  Chicago to NY: C# list from desired P#
  NY to Chicago: Matching Customer data
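Idea #3 can be sketched with toy data. Everything below is hypothetical: the rows, the dictionary layout, and the `shipped` counter that tallies rows crossing the network. It only illustrates the order of the steps, not how a real optimizer works.

```python
# Hypothetical toy data standing in for the three sites.
ny_customers = {1: "Adams", 2: "Baker", 3: "Chen", 4: "Diaz"}           # C# -> name
chicago_sales = [(10, 1, "03-01"), (11, 2, "03-01"), (12, 3, "03-02")]  # (S#, C#, Sdate)
chicago_items = [(10, 500), (10, 501), (11, 501), (12, 500)]            # (S#, P#)
la_products = {500: "blue", 501: "red", 502: "blue"}                    # P# -> color

shipped = 0  # rows sent across the network

# Step 1 (Chicago, local): P# of items sold on March 1.
march1_sales = {s: c for (s, c, d) in chicago_sales if d == "03-01"}
march1_pnums = {p for (s, p) in chicago_items if s in march1_sales}

# Step 2: ship that P# list to LA; LA returns only the blue ones.
shipped += len(march1_pnums)
blue_pnums = {p for p in march1_pnums if la_products.get(p) == "blue"}
shipped += len(blue_pnums)

# Step 3 (Chicago, local): C# of customers who bought those products.
cnums = {march1_sales[s] for (s, p) in chicago_items
         if s in march1_sales and p in blue_pnums}

# Step 4: ship the C# list to NY; NY returns the matching customer rows.
shipped += len(cnums)
result = sorted(ny_customers[c] for c in cnums)
shipped += len(result)

# Bad idea #1 for comparison: every NY and LA row travels to Chicago.
ship_everything = len(ny_customers) + len(la_products)
print(result, shipped, ship_everything)
```

With four customers and three products the saving is small. With the table sizes on the slide, shipping only the key lists avoids moving millions of rows.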

9
Data Replication
- Goals
  - Minimize transmissions
  - Improve performance
  - Support heavy multiuser access.
- Problems
  - Updating copies
  - Bulk transmissions
  - Site unavailable
  - Concurrency
    - Easier for two people to change the same data at the same time.
- Decision support systems.
- Data warehouse.

[Figure: the Britain site and the Spain site each hold copies of Britain, France, and Spain Customers & Sales data. Labels: Market research & data corrections; Periodic updates; Update data.]

10
Concurrency and Locks
- Each DBMS must maintain lock facility.
- To update, each DBMS must utilize and recognize other lock mechanisms and return codes.
- Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases.
  - Random wait.
  - Optimistic updates.
  - Two-phase commit.

[Figure: DBMS #1 holds Accounts row Jones 8898 and DBMS #2 holds Accounts row Jones 3561. Transaction A and Transaction B each hold a lock at one site (Locked) and wait for the lock at the other (Waiting).]
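One common way to implement the "optimistic updates" item is a version check: a writer succeeds only if the row is unchanged since it was read. The sketch below is an assumption about how that might look, not the method of any particular DBMS; the `Version` column and the `optimistic_update` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the Version column supports the optimistic check.
conn.execute("CREATE TABLE Accounts (Name TEXT, Balance REAL, Version INTEGER)")
conn.execute("INSERT INTO Accounts VALUES ('Jones', 8898, 1)")
conn.commit()

def optimistic_update(conn, name, new_balance, version_read):
    """Write only if the row still has the version we read; report success."""
    cur = conn.execute(
        "UPDATE Accounts SET Balance = ?, Version = Version + 1 "
        "WHERE Name = ? AND Version = ?",
        (new_balance, name, version_read),
    )
    conn.commit()
    return cur.rowcount == 1

# Transactions A and B both read the row at version 1.
version_seen_by_a = version_seen_by_b = 1
a_ok = optimistic_update(conn, "Jones", 8000, version_seen_by_a)  # first writer wins
b_ok = optimistic_update(conn, "Jones", 9500, version_seen_by_b)  # stale read, rejected
balance = conn.execute("SELECT Balance FROM Accounts").fetchone()[0]
print(a_ok, b_ok, balance)
```

No lock is held between the read and the write, so neither transaction waits. The loser must re-read the row and retry.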

11
Transactions & Two-Phase Commit
- Two (or more) separate lock managers.
- DBMS initiating update serves as the coordinator.
- Two phases
  - Coordinator sends message and data to all machines to "get ready."
  - Local machines save data in logs, verify update status and return message.
  - If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.

[Figure: Database 1 initiates the transaction: 1. Prepare to commit. All agree? 2. Commit. Database 2 and Database 3: Lock tables. Save log. Update all tables.]
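The two phases can be sketched as a small simulation. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration; a real implementation also needs durable logs, timeouts, and recovery after a crash, all of which are left out here.

```python
class Participant:
    """A local DBMS taking part in the transaction (hypothetical class)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []          # stand-in for the local log
        self.state = "idle"

    def prepare(self, data):
        # Phase 1: save the data in the log and vote.
        self.log.append(("prepare", data))
        self.state = "ready" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.log.append(("commit",))
        self.state = "committed"

    def rollback(self):
        self.log.append(("rollback",))
        self.state = "rolled back"


def two_phase_commit(participants, data):
    """Coordinator logic: commit only if every participant votes OK."""
    votes = [p.prepare(data) for p in participants]   # phase 1: get ready
    if all(votes):
        for p in participants:                        # phase 2: commit
            p.commit()
        return "committed"
    for p in participants:                            # phase 2: rollback
        p.rollback()
    return "rolled back"


ok = two_phase_commit([Participant("Database 2"), Participant("Database 3")],
                      {"Jones": 8898})
failed_group = [Participant("Database 2"),
                Participant("Database 3", can_commit=False)]
bad = two_phase_commit(failed_group, {"Jones": 8898})
print(ok, bad)
```

A single "no" vote in phase 1 is enough to roll back every participant, including those that voted OK.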

12
Distributed Transaction Managers

[Figure: a Transaction Processing Monitor connected to three sites, each with a Transaction Manager, a Resource Manager, and a DBMS]

The distributed transaction coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.

13
Distributed Design Questions

Question                                            Concurrent        Replication
What level of data consistency is needed?           High              Low – Medium
How expensive is storage?                           Medium – High     Low
What are the shared access requirements?            Global            Local
How often are the tables updated?                   Often             Seldom
Required speed of updates (transactions)?           Fast              Slow
How important are predictable transaction times?    High              Low
DBMS support for concurrency and locking?           Good – Excellent  Poor
Can shared access be avoided?                       No                Yes
14
Distributed Databases In Oracle
- Database Links
  - Full database names.
  - CONNECT command.
- Linking through synonyms.
  - CREATE SYNONYM …
  - Central control over permissions.
- Linking through Views/queries.
  - CREATE VIEW AS …
  - Can assign local permissions.
- Linking through stored procedures.
  - DELETE … View
  - Strong control over actions.

[Figure: the server database is addressed as Schema.Table@Location, for example Scott.Emp@hq.acme.com. Synonym: Employee. Procedure: DELETE FROM Employee WHERE ... User permissions: user can only run procedure. No other access.]
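The idea of hiding a remote table behind a local name can be imitated in SQLite, with caveats. SQLite has no CREATE SYNONYM and no database links, so in this sketch an attached in-memory database named `hq` stands in for the remote Oracle database and a TEMP view stands in for the synonym. The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the attached database hq plays the role of the remote
# database behind Scott.Emp@hq.acme.com.
conn.execute("ATTACH DATABASE ':memory:' AS hq")
conn.execute("CREATE TABLE hq.Emp (EmpID INTEGER, Name TEXT)")
conn.executemany("INSERT INTO hq.Emp VALUES (?, ?)", [(1, "Smith"), (2, "Lee")])

# A TEMP view gives the table a local name, similar in spirit to a synonym.
conn.execute("CREATE TEMP VIEW Employee AS SELECT EmpID, Name FROM hq.Emp")

# Application code uses the local name only.
names = [row[0] for row in
         conn.execute("SELECT Name FROM Employee ORDER BY EmpID")]
print(names)
```

If the table moves, only the view definition changes. Queries written against `Employee` stay the same.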

15
Client-Server

[Figure: servers hold the shared database; clients run the front-end user interface]
16
LAN File Server
- Not a distributed database.
  - Data file stored on server.
  - Server is passive, appears as giant disk drive to PC.
- PC processes all data.
  - Retrieves all needed data across the network.
- Performance improvements.
  - Indexes are crucial.
  - Store some data on each PC (replication).
  - Store applications on PC (graphics & forms).
  - Convert to SQL-Server

[Figure: the file server holds the DBMS data file (shared data); the application runs on the PC]

All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.

    SELECT Name, SaleDate
    FROM Customer INNER JOIN Sales
    ON Customer.C# = Sales.C#
    WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;

17
LAN File Server: Slow

[Figure: the file server holds MyFile.mdb, which contains the data tables (CustID, Name, …: 115 Jenkins …, 125 Juarez ..., Order ...) and the forms]
- Application and query transferred.
- DBMS software transferred.
- One row at a time transferred, until all rows are examined.

    SELECT *
    FROM Customer
    WHERE City = 'Sandy'

18
Client-Server Databases
- One machine is dominant (server) and handles data for many clients.
- Client machines handle front-end tasks and small data tables that are not shared.

[Figure: the client application sends an SQL statement (SELECT ...) to the server, which runs the DBMS (SQL Server) on the shared data and returns the matching data]
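The difference between the file-server and client-server approaches shows up in how many rows cross the network. The sketch below counts rows for both styles against one SQLite table; the rows and cities are made up, and a local database obviously has no network, so the counts are only a proxy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustID INTEGER, Name TEXT, City TEXT)")
# Made-up rows; only two customers live in Sandy.
rows = [(115, "Jenkins", "Sandy"), (125, "Juarez", "Provo"),
        (135, "Kim", "Sandy"), (145, "Lopez", "Ogden")]
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", rows)

# File-server style: every row is read by the PC, which applies the WHERE test.
all_rows = conn.execute("SELECT * FROM Customer").fetchall()
file_server_rows_sent = len(all_rows)
file_server_result = [r for r in all_rows if r[2] == "Sandy"]

# Client-server style: the server applies WHERE and returns matches only.
client_server_result = conn.execute(
    "SELECT * FROM Customer WHERE City = ?", ("Sandy",)).fetchall()
client_server_rows_sent = len(client_server_result)

print(file_server_rows_sent, client_server_rows_sent)
```

Both styles produce the same answer. The client-server version sends only the matching rows.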
19
ADO and Direct Connections

The database vendor provides its own data transport (e.g., Oracle or SQL Server), installed on the server and the client.

ADO provides a driver that connects your application to the transport services.

ODBC can serve as the data transport if nothing else is available.

[Figure: on the client computer, the Visual Basic application calls ADO, which uses the DBMS transport. The client sends a SELECT to the DBMS transport on the server computer, where the database server returns the results.]

20
Three-Tier Client-Server
- Server Databases
- Client front-end
- Middle
  - Locate databases
  - Business rules
  - Program code

[Figure: three tiers]
  Database Servers: Databases. Transactions. Legacy applications.
  Middleware: Database links. Business rules. Program code.
  Client: Application. Front-end. User Interface.
21
Database Independence on the Client

[Figure: the application connects through ADO to the original DBMS, and through ADO to the new DBMS]
22
Database Independence with Queries

Independent Application Query: works with any DBMS

    SELECT SaleID, SaleDate, CustomerID, CustomerName
    FROM SaleCustomer

Saved Oracle Query

    SELECT SaleID, SaleDate, Sale.CustomerID,
           LastName || ', ' || FirstName AS CustomerName
    FROM Sale, Customer
    WHERE Sale.CustomerID = Customer.CustomerID

Saved SQL Server Query

    SELECT SaleID, SaleDate, Sale.CustomerID,
           LastName + ', ' + FirstName AS CustomerName
    FROM Sale INNER JOIN Customer
    ON Sale.CustomerID = Customer.CustomerID
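The same pattern can be tried with a saved view. In this sketch SQLite plays the DBMS; like Oracle it concatenates strings with `||`. The table contents are made up, and `APP_QUERY` is a hypothetical name for the application's query text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER, LastName TEXT, FirstName TEXT);
    CREATE TABLE Sale (SaleID INTEGER, SaleDate TEXT, CustomerID INTEGER);
    INSERT INTO Customer VALUES (7, 'Jones', 'Martha');
    INSERT INTO Sale VALUES (1015, '2004-08-12', 7);

    -- DBMS-specific part, saved on the server as a view.
    CREATE VIEW SaleCustomer AS
    SELECT Sale.SaleID AS SaleID,
           Sale.SaleDate AS SaleDate,
           Sale.CustomerID AS CustomerID,
           LastName || ', ' || FirstName AS CustomerName
    FROM Sale INNER JOIN Customer
         ON Sale.CustomerID = Customer.CustomerID;
""")

# Independent application query: unchanged whichever DBMS holds the view.
APP_QUERY = "SELECT SaleID, SaleDate, CustomerID, CustomerName FROM SaleCustomer"
sale_rows = conn.execute(APP_QUERY).fetchall()
print(sale_rows)
```

Moving to a DBMS with a different concatenation operator would mean rewriting the view, not the application.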

23
The Internet as Client-Server

[Figure: the client (browser) sends a request, http://server.location/page, through routers across the Internet to the server (Web server). The server returns information: HTML pages, forms, graphics.]
24
HTML Limited Clients

<HTML>
<HEAD>
<TITLE>My main page</TITLE></HEAD>
<BODY BACKGROUND="graphics/back0.jpg">
<P>My text goes in paragraphs.</P>
<P>Additional tags set <B>boldface</B> and <I>Italic</I>.</P>
<P>Tables are more complicated and use a set of tags for rows and
columns.</P>
<TABLE BORDER=1>
<TR><TD>First cell</TD><TD>Second cell</TD></TR>
<TR><TD>Next row</TD><TD>Second column</TD></TR>
</TABLE>
<P>There are form tags to create input forms for collecting data.
But you need CGI program code to convert and use the input data.</P>
</BODY>
</HTML>

25
HTML Output

My text goes in paragraphs.
Additional tags set boldface and Italic.
Tables are more complicated and use a set of tags for rows and columns.

  First cell    Second cell
  Next row      Second column

There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.
26
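Web Server Database Fundamentals

[Figure: client/browser, Web server, and database (DBMS)]
  0. Browser requests Server/Form.html.
  1. Web server returns the HTML form (Form.html).
  2. Browser sends the form data to the Web server as a CGI string. Program code sends a query to the DBMS and receives the result.
  3. Web server returns the result page: Page = Template + Result.
  On the Web server: Form.html, Template + Code (program code), Query.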
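Steps 2 and 3 can be sketched in a few lines. This is an illustration under several assumptions: the `Animal` table and its rows are made up, `handle_request` and `PAGE` are hypothetical names, and no real Web server is involved. The function simply turns a CGI string into a page.

```python
import html
import sqlite3
from string import Template
from urllib.parse import parse_qs

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animal (AnimalID INTEGER, Name TEXT, Category TEXT)")
conn.executemany("INSERT INTO Animal VALUES (?, ?, ?)",
                 [(1, "Rex", "Dog"), (2, "Tom", "Cat"), (3, "Fido", "Dog")])

# Hypothetical page template; $category and $rows are filled in per request.
PAGE = Template("<HTML><BODY><P>Category: $category</P>"
                "<TABLE>$rows</TABLE></BODY></HTML>")

def handle_request(cgi_string):
    """Decode the form data, run the query, and merge the result into the page."""
    form = parse_qs(cgi_string)
    category = form["Category"][0]
    result = conn.execute(
        "SELECT Name FROM Animal WHERE Category = ? ORDER BY AnimalID",
        (category,)).fetchall()
    rows = "".join(f"<TR><TD>{html.escape(name)}</TD></TR>" for (name,) in result)
    return PAGE.substitute(category=html.escape(category), rows=rows)

page = handle_request("Category=Dog")
print(page)
```

The query uses a parameter (`?`) instead of pasting the form value into the SQL text, and the output is escaped before it goes into the HTML.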

27
Database Example: Client Side

[Figure: exchange between the client and the server]
  0. Request Server/Form.html
  1. Initial form
  2. Call ASP page
  3. Results
28
Client-Server Data Transfer

Order Form
  Order ID:    1015
  Customer:    Jones, Martha
  Order Date:  12-Aug

- What if there are 10,000 customers?
- How much time to load the combo box?
- How do you refresh/reload the combo box?
- Alternatives?
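One possible alternative is to fetch only the names that match what the user has typed so far. The sketch below is an assumption about how that might be done; the `lookup` helper, the generated customer names, and the limit of 20 are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
# Made-up data: 10,000 customers named Customer00000 .. Customer09999.
conn.executemany("INSERT INTO Customer (Name) VALUES (?)",
                 [(f"Customer{i:05d}",) for i in range(10_000)])
conn.execute("CREATE INDEX idxCustomerName ON Customer (Name)")

def lookup(prefix, limit=20):
    """Return at most `limit` names that start with what the user typed."""
    cur = conn.execute(
        "SELECT Name FROM Customer WHERE Name LIKE ? ORDER BY Name LIMIT ?",
        (prefix + "%", limit))
    return [name for (name,) in cur]

matches = lookup("Customer0999")
print(len(matches), matches[0])
```

Each keystroke transfers at most 20 rows instead of 10,000. A production version would also escape `%` and `_` in the typed text.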

29
Latency

[Figure: timeline between server and client. Server: Generate form; Receive form data. Client: Form received; User delay. Between them: Transmission delay (in each direction). Axis: time.]
30
XML: Transferring Data

Order: OrderID, OrderDate, ShippingCost, Comment
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost

Many XML files contain hierarchical data.
31
XML: Schema Definition (xsd)

Partial file, generated by .NET xsd.exe

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="OrderList" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="OrderList" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Order">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="OrderID" type="xs:string" minOccurs="0" />
              <xs:element name="OrderDate" type="xs:date" minOccurs="0" />
              <xs:element name="ShippingCost" type="xs:string" minOccurs="0" />
              <xs:element name="Comment" type="xs:string" minOccurs="0" />
              <xs:element name="Items" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="ItemID" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="ItemID_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="Description" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="Description_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
32
XML Data Example

XML: extensible markup language

<?xml version="1.0"?>
<!DOCTYPE OrderList SYSTEM "orderlist.dtd">
<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-Dog-Medium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>
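A receiving program might read this file as follows. The sketch uses Python's standard `xml.etree.ElementTree` module and embeds the order document without its DOCTYPE line. Pairing the item fields by position is an assumption that works only because each item has all four elements.

```python
import xml.etree.ElementTree as ET

# The order document from the slide, without the DOCTYPE line.
XML_TEXT = """<?xml version="1.0"?>
<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-Dog-Medium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>"""

root = ET.fromstring(XML_TEXT)
order = root.find("Order")
items = order.find("Items")

# Items holds a flat run of ItemID, Description, Quantity, Cost elements,
# so the fields are paired up by position.
fields = ("ItemID", "Description", "Quantity", "Cost")
columns = [[el.text for el in items.findall(f)] for f in fields]
item_rows = [dict(zip(fields, values)) for values in zip(*columns)]
total_quantity = sum(int(row["Quantity"]) for row in item_rows)
print(order.findtext("OrderID"), item_rows, total_quantity)
```

The parser converts `&amp;` back to `&`, so the second description reads "Aquarium Filter & Pump".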

33
XML Example in Explorer

[Figure: screenshot of the XML example displayed in Explorer]
34
Java and JDBC

    import java.sql.*;

    // Open a connection; the URL, login, and password are placeholders.
    Connection con = DriverManager.getConnection(
        "jdbc:myDriver:myDBName",
        "myLogin",
        "myPassword");
    Statement smt = con.createStatement();
    // Run the query and step through the rows one at a time.
    ResultSet rst = smt.executeQuery(
        "SELECT AnimalID, Name, Category, Breed FROM Animal");
    while (rst.next()) {
        int iAnimal = rst.getInt("AnimalID");
        String sName = rst.getString("Name");
        String sCategory = rst.getString("Category");
        String sBreed = rst.getString("Breed");
        // Now do something with these four variables
    }

35
