14 Administering Databases
The DA and the DBA - I
• There is often a distinction between the roles of the Database
Administrator and the Data Administrator, both of which have
specific responsibilities that complement each other.
– The DA has a more strategic and managerial role.
• The DA will be responsible for the Information Strategy of the
organisation as a whole (i.e. transcending the various subsystems)
• and will be concerned with large-scale developments (e.g. moving
subsystems between platforms, developing new subsystems, ensuring
resilience, developing & maintaining standards, policies & procedures
etc.)

The DA and the DBA - II
• The DBA role is concerned more with managing a particular
(sub)system on a particular platform
• The DBA will have roles both during the setting-up of a system, and,
later, while it is running.
– with the DA, participate in deciding the information content
of the database
– write the conceptual specification (“conceptual schema”)
– decide how the data should be stored: use the facilities of the
DBMS to provide the mapping from the logical to the
physical
The DA and the DBA - III
• The DBA will also
– write “sub-schemas” for user views
– document the views
– allocate ownership rights and duties
– with the DA, adjudicate between different interests
– use the DBMS management tools to monitor, tune,
reorganise, protect, backup, and reload

Management tools used by the DBA
• Any mature DBMS will provide tools to:
– bulk load data from files in other formats
– restructure data: for example to distribute data across several
sites
– provide differential access to data in ways that can be
configured and restructured
– maintain dynamic backup and restore facilities to enable rapid
recovery of a DBMS whose platform crashes
– (these facilities may be automatic but configurable by the
DBA)
– give access to data about the DBMS, its contents, its users and
its performance (via the Data Dictionary)
– retune parameters to improve performance (e.g., Hints in
ORACLE)
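
As a rough illustration of two of the DBA tasks listed above (writing a sub-schema for a user view, and reading the data dictionary), here is a minimal sketch using Python's built-in sqlite3 module. The staff table, its columns and the staff_directory view are hypothetical names invented for this example, and SQLite's sqlite_master catalog stands in for the much richer data dictionary of a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base table: a small piece of the conceptual schema (hypothetical example).
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Sales", 30000.0), (2, "Bob", "IT", 35000.0)],
)

# Sub-schema: a user view that exposes names and departments but hides salary.
conn.execute("CREATE VIEW staff_directory AS SELECT id, name, dept FROM staff")

# What a user of the view sees.
directory_rows = conn.execute(
    "SELECT * FROM staff_directory ORDER BY id"
).fetchall()

# Data dictionary lookup: SQLite records every table and view in sqlite_master.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

print(directory_rows)
print(catalog)
```

The view gives differential access in the simplest sense: users who query staff_directory never see the salary column, while the catalog query shows the DBA which objects exist.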
Distributed Databases
• Distributed Database
– A logically interrelated collection of shared data (and a
description of this data), physically distributed over a
computer network.
– Could be across two computers in the same room or lots of
computers across the world
– Under the control of more than one CPU
• Distributed DBMS
– Software system that permits the management of the
distributed database and makes the distribution transparent to
users.
Key Concepts
• Collection of logically-related shared data.
• Data split into fragments.
• Fragments may be replicated.
• Fragments/replicas allocated to sites.
• Sites linked by a communications network.
• Data at each site is under control of a DBMS.
• DBMSs handle local applications autonomously.
• Each DBMS participates in at least one global application.

Advantages of DDBMS
• Reflects organizational structure: database fragments are located in
the departments they relate to.
• Local autonomy: a department can control the data about them (as
they are the ones familiar with it).
• Improved availability: a fault in one database system will only affect
one fragment, instead of the entire database.
• Improved performance: data is located near the site of greatest
demand, and the database systems themselves are parallelized,
allowing load on the databases to be balanced among servers. (A high
load on one module of the database won't affect other modules of the
database in a distributed database.)
• Economics: it costs less to create a network of smaller computers
with the power of a single large computer.
• Modularity: systems can be modified, added and removed from the
distributed database without affecting other modules (systems).
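
The ideas of fragments, replicas and allocation to sites, and the availability advantage that follows from them, can be sketched in a few lines of Python. This is only an illustrative model, not how any particular DDBMS stores its allocation: the fragment names, site names and the available_fragments helper are all hypothetical.

```python
# Hypothetical allocation: fragment name -> set of sites holding a replica.
allocation = {
    "customers_north": {"site_a", "site_b"},  # replicated on two sites
    "customers_south": {"site_b"},            # single copy
    "orders": {"site_c"},                     # single copy
}


def available_fragments(allocation, failed_sites):
    """Return the fragments that still have at least one replica on a live site."""
    failed = set(failed_sites)
    return sorted(
        fragment
        for fragment, sites in allocation.items()
        if sites - failed  # non-empty set: some replica survives
    )


# If site_b fails, only the fragment stored solely on site_b is lost.
still_up = available_fragments(allocation, {"site_b"})
print(still_up)
```

In this model a site failure removes only the fragments with no surviving replica, which is the sense in which a fault affects one fragment rather than the entire database.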
Disadvantages of DDBMS
• Complexity: extra work must be done by the DBAs to ensure that the
distributed nature of the system is transparent. Extra work must also be done
to maintain multiple disparate systems, instead of one big one. Extra database
design work must also be done to account for the disconnected nature of the
database: for example, joins become prohibitively expensive when
performed across multiple systems.
• Economics: increased complexity and a more extensive infrastructure
means extra labour costs.
• Security: remote database fragments must be secured, and they are not
centralized so the remote sites must be secured as well. The infrastructure
must also be secured (e.g., by encrypting the network links between remote
sites).
• Difficult to maintain integrity: in a distributed database, enforcing integrity
over a network may require too much of the network's resources to be feasible.
• Inexperience: distributed databases are difficult to work with, and as a
young field there is not much readily available experience on proper practice.

DDBMS
• DDBMS is a Distributed Database Management System with
– Extended communication services.
– Extended Data Dictionary.
– Distributed query processing.
– Extended concurrency control.
– Extended recovery services.
• Plus all the functionality you would expect from a
centralized DBMS
Which Type and Why?
• Heterogeneous DDBMS
– The different databases probably already exist and are in
use by different users
– It becomes desirable to join them into one apparently
single database, without simply throwing them all away
and starting again
– They could each have a different DBMS
– There will be more than one DD, DA and DBA
– There may not be an easy way of matching fields from
one to equivalent fields in another
– Joining such databases is known as a Bottom Up join

Example
• A bank has a database of current account users, another with
mortgage holders and a third with customer relations data.
• If the marketing department want to see what services a customer
has bought, and when they last wrote to them, they have to look
the customer up in three different databases.
• The databases can become out of sync, for example if a person
moves house and tells the bank once. The mortgage account DB
might get updated, but the current account DB might not, and the
customer relations DB almost certainly will not.
• By joining the databases, the bank can have a ‘single customer
view’, which is something they like.
• A much more difficult problem is joining the data from more than one
DB, for example, the current account DB has William Smith, but
marketing communicate with Bill Smith.
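
To make the William Smith / Bill Smith problem concrete, here is a minimal Python sketch of one possible way to build a single customer view: normalise each name through a nickname table and match on name plus postcode. Everything here is an assumption made for illustration (the NICKNAMES table, the match_key and single_customer_view helpers, the record layout and the sample data); real record matching is considerably harder and usually needs fuzzy or probabilistic techniques. The sketch also assumes every record has a non-empty name.

```python
# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {"bill": "william", "bob": "robert"}


def match_key(full_name, postcode):
    """Build a comparison key: canonical first name + surname, plus postcode."""
    first, *rest = full_name.lower().split()
    first = NICKNAMES.get(first, first)  # map nickname to canonical form
    name = " ".join([first, *rest])
    return (name, postcode.replace(" ", "").upper())


def single_customer_view(*sources):
    """Merge records from several databases that share a match key."""
    view = {}
    for source in sources:
        for record in source:
            key = match_key(record["name"], record["postcode"])
            merged = view.setdefault(key, {})
            for field, value in record.items():
                if field not in ("name", "postcode"):
                    merged[field] = value
    return view


# Hypothetical rows from two of the bank's databases.
current_accounts = [
    {"name": "William Smith", "postcode": "G1 1AA", "account": "CA-001"},
]
marketing = [
    {"name": "Bill Smith", "postcode": "g11aa", "last_contact": "2020-01-15"},
]

view = single_customer_view(current_accounts, marketing)
print(view)
```

Matching on a derived key is a design choice that keeps the example short; it will wrongly merge two different William Smiths at the same postcode, which is one reason this step is described as the difficult part.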
• Horizontal partitioning is where different full records from a table are
stored in different locations.
• Example:
Imagine the car dealership chain, Arnold Clark. They will use a dealer
management system such as Kerridge. The tables in each dealership
will have the same structure (homogeneous) but different customers
will appear in each. If head office want to write to every customer,
regardless of where they bought the car, they need the DDBMS to be
able to respond to a query that generates every customer across all the
databases in each dealership.

• Vertical partitioning is where different attributes are stored in different
locations.
• Example:
I probably appear on a few databases in the university: payroll, car
parking, timetabling, sports centre …
If the university wanted to be able to make a single query and see
everything they have about me, it would be from a number of different
databases. Chances are, they couldn’t do it because the DBs are not
under a DDBMS.
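
The two kinds of partitioning can be imitated on one machine with Python's sqlite3 module, using ATTACH DATABASE to open several separate databases behind one connection. This is only a stand-in for a DDBMS, not a distributed system: the site names (glasgow, edinburgh, payroll, parking), table layouts and rows are all hypothetical. The horizontal case is reassembled with UNION ALL over identically structured tables; the vertical case is reassembled with a join on a shared key.

```python
import sqlite3

# Autocommit mode, so ATTACH is never issued inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Four separate in-memory databases standing in for four sites.
for site in ("glasgow", "edinburgh", "payroll", "parking"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {site}")

# Horizontal partitioning: same table structure, different rows per site.
for site in ("glasgow", "edinburgh"):
    conn.execute(f"CREATE TABLE {site}.customer (id INTEGER, name TEXT)")
conn.execute("INSERT INTO glasgow.customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO edinburgh.customer VALUES (2, 'Bob')")

# Head office query: every customer, regardless of dealership.
all_customers = conn.execute(
    "SELECT name FROM glasgow.customer "
    "UNION ALL "
    "SELECT name FROM edinburgh.customer "
    "ORDER BY name"
).fetchall()

# Vertical partitioning: different attributes of one person per site.
conn.execute("CREATE TABLE payroll.staff (staff_id INTEGER, salary REAL)")
conn.execute("CREATE TABLE parking.permit (staff_id INTEGER, bay TEXT)")
conn.execute("INSERT INTO payroll.staff VALUES (7, 42000.0)")
conn.execute("INSERT INTO parking.permit VALUES (7, 'B12')")

# Single query that gathers everything held about staff member 7.
person = conn.execute(
    "SELECT p.staff_id, p.salary, k.bay "
    "FROM payroll.staff AS p "
    "JOIN parking.permit AS k ON k.staff_id = p.staff_id "
    "WHERE p.staff_id = 7"
).fetchall()

print(all_customers)
print(person)
```

The join in the vertical case only works because both tables share staff_id; without a common key across the university's databases, the single query described above would not be possible.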