Distributed DBMS Architecture

Ali Samad

Architecture
• Defines the structure of the system
– components identified
– functions of each component defined
– interrelationships and interactions between components defined

Ali Samad Tahir Rashid

DDBMS Architecture


Architecture
• Goal:
– present the issues that need to be addressed at design
– present a framework within which the design and implementation issues can be discussed

• Example of a reference model: the ISO/OSI 7-layer reference model for computer networks


Standardization
Reference Model
– A conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.
A reference model can be described according to three different approaches:
• Based on components
• Based on functions
• Based on data

DBMS STANDARDIZATION
• Based on components. The components of the system are defined together with the interrelationships between components. A DBMS consists of a number of components, each of which provides some functionality.
• Based on functions. The different classes of users are identified and the functions that the system will perform for each class are defined. The system specifications within this category typically specify a hierarchical structure for the user classes. The ISO/OSI architecture falls in this category.

DBMS STANDARDIZATION
• Based on data. The different types of data are identified, and an architectural framework is specified which defines the functional units that will realize or use data according to these different views. This approach (also referred to as the datalogical approach) is claimed to be the preferable choice for standardization activities.

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
The ANSI / SPARC architecture is claimed to be based on the data organization. It recognizes three views of data: the external view, which is that of the user, who might be a programmer; the internal view, that of the system or machine; and the conceptual view, that of the enterprise. For each of these views, an appropriate schema definition is required.

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
[Figure: ANSI / SPARC architecture]

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• At the lowest level of the architecture is the internal view, which deals with the physical definition and organization of data.
• At the other extreme is the external view, which is concerned with how users view the database.
• Between these two ends is the conceptual schema, which is an abstract definition of the database. It is the "real world" view of the enterprise being modeled in the database.

Conceptual Schema Definition
    RELATION EMP [
        KEY = {ENO}
        ATTRIBUTES = {
            ENO   : CHARACTER(9)
            ENAME : CHARACTER(15)
            TITLE : CHARACTER(10)
        }
    ]
    RELATION PAY [
        KEY = {TITLE}
        ATTRIBUTES = {
            TITLE : CHARACTER(10)
            SAL   : NUMERIC(6)
        }
    ]

Conceptual Schema Definition
    RELATION PROJ [
        KEY = {PNO}
        ATTRIBUTES = {
            PNO    : CHARACTER(7)
            PNAME  : CHARACTER(20)
            BUDGET : NUMERIC(7)
        }
    ]
    RELATION ASG [
        KEY = {ENO, PNO}
        ATTRIBUTES = {
            ENO  : CHARACTER(9)
            PNO  : CHARACTER(7)
            RESP : CHARACTER(10)
            DUR  : NUMERIC(3)
        }
    ]

Internal Schema Definition
    RELATION EMP [
        KEY = {ENO}
        ATTRIBUTES = {
            ENO   : CHARACTER(9)
            ENAME : CHARACTER(15)
            TITLE : CHARACTER(10)
        }
    ]
    ⇓
    INTERNAL_REL EMPL [
        INDEX ON E# CALL EMINX
        FIELD = {
            HEADER : BYTE(1)
            E#     : BYTE(9)
            ENAME  : BYTE(15)
            TIT    : BYTE(10)
        }
    ]

External View Definition – Example 1
Create a BUDGET view from the PROJ relation
    CREATE VIEW BUDGET(PNAME, BUD)
    AS SELECT PNAME, BUDGET
       FROM   PROJ

External View Definition – Example 2
Create a PAYROLL view from relations EMP and PAY
    CREATE VIEW PAYROLL (ENO, ENAME, SAL)
    AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
       FROM   EMP, PAY
       WHERE  EMP.TITLE = PAY.TITLE
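The PAYROLL view can be tried out in a small script. The following is only a sketch: it uses Python's standard sqlite3 module with an in-memory database, and the sample rows, the SQLite column types and the function name build_payroll_demo are assumptions made for illustration, not part of the schema definitions on the slides.

```python
import sqlite3


def build_payroll_demo():
    """Create EMP, PAY and the PAYROLL view in memory; return the view's rows."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Conceptual schema (column types only approximate the slide definitions).
    cur.execute("CREATE TABLE EMP (ENO CHAR(9) PRIMARY KEY, ENAME CHAR(15), TITLE CHAR(10))")
    cur.execute("CREATE TABLE PAY (TITLE CHAR(10) PRIMARY KEY, SAL NUMERIC(6))")
    # Hypothetical sample data, invented for this sketch.
    cur.executemany("INSERT INTO EMP VALUES (?, ?, ?)",
                    [("E1", "J. Doe", "Engineer"), ("E2", "M. Smith", "Analyst")])
    cur.executemany("INSERT INTO PAY VALUES (?, ?)",
                    [("Engineer", 40000), ("Analyst", 34000)])
    # External view: users of PAYROLL never see the TITLE join column.
    cur.execute(
        "CREATE VIEW PAYROLL (ENO, ENAME, SAL) AS "
        "SELECT EMP.ENO, EMP.ENAME, PAY.SAL FROM EMP, PAY "
        "WHERE EMP.TITLE = PAY.TITLE"
    )
    rows = cur.execute("SELECT ENO, ENAME, SAL FROM PAYROLL ORDER BY ENO").fetchall()
    conn.close()
    return rows
```

Calling build_payroll_demo() returns one row per employee with the salary looked up through the title, which is the user-level picture the external schema is meant to provide.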

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
[Figure: ANSI / SPARC architecture diagram]

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• The square boxes represent processing functions, whereas the hexagons are administrative roles.
• The arrows indicate data, command, program, and description flow, whereas the "I"-shaped bars on them represent interfaces.
• The major component that permits mapping between different data organizational views is the data dictionary / directory (depicted as a triangle), which is a meta-database.
• The database administrator is responsible for defining the internal schema definition.
• The enterprise administrator's role is to prepare the conceptual schema definition.
• The application administrator is responsible for preparing the external schema for applications.

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• Two more users:
– Application programmer
– System programmer
• Two user classes:
– Casual user
  • Retrieves data from the database and possibly updates it
  • Added in the external schema
– Novice user
  • Typically has no knowledge of the database
  • Example: a banking machine

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs
The systems are characterized with respect to:
(1) the autonomy of the local systems,
(2) their distribution,
(3) their heterogeneity.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs
[Figure: architectural models for distributed DBMSs]

Autonomy
• Distribution of control (and not data) - the degree of independence
– The local operations of the individual DBMSs are not affected by their participation in the multidatabase system
– The manner in which individual DBMSs process queries and optimize them should not be affected by the execution of global queries
– System consistency should not be compromised when individual DBMSs join or leave the multidatabase system

Autonomy
• Alternatively, the dimensions of autonomy can be specified as:
– Design autonomy: ability of a component DBMS to decide on issues related to its own design.
– Communication autonomy: ability of a component DBMS to decide whether and how to communicate with other DBMSs.
– Execution autonomy: ability of a component DBMS to execute local operations in any manner it wants to.

Autonomy
• Possibilities:
– Tight integration: a single image of the entire database is available to any user who wants to share the information, which may reside in multiple databases.
– Semiautonomous system: consists of DBMSs that can operate independently, but have decided to participate in a federation to make their local data sharable.
– Total isolation: the individual systems are standalone DBMSs, which know neither of the existence of other DBMSs nor how to communicate with them.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Distribution refers to the distribution of data. We are considering the physical distribution of data over multiple sites; the user sees the data as one logical pool.
Two alternatives:
– client / server distribution
– peer-to-peer distribution (full distribution)


ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Client / server distribution.
The client / server distribution concentrates data management duties at servers while the clients focus on providing the application environment including the user interface. The communication duties are shared between the client machines and servers. Client / server DBMSs represent the first attempt at distributing functionality.

Peer-to-peer distribution.
There is no distinction of client machines versus servers. Each machine has full DBMS functionality and can communicate with other machines to execute queries and transactions.


ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
Heterogeneity may occur in various forms in distributed systems, ranging from hardware heterogeneity and differences in networking protocols to variations in data managers. Representing data with different modeling tools creates heterogeneity because of the inherent expressive powers and limitations of individual data models. Heterogeneity in query languages not only involves the use of completely different data access paradigms in different data models, but also covers differences in languages even when the individual systems use the same data model.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
• Various levels (hardware, communications, operating system)
• The DBMS level is the important one: data model, query language, transaction management algorithms
• Representing data with different modeling tools creates heterogeneity because of the inherent expressive power and limitations of individual data models.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
The dimensions are identified as: A (autonomy), D (distribution) and H (heterogeneity). The alternatives along each dimension are identified by numbers as: 0, 1 or 2.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
A0 - tight integration
A1 - semiautonomous systems
A2 - total isolation
D0 - no distribution
D1 - client / server systems
D2 - peer-to-peer systems
H0 - homogeneous systems
H1 - heterogeneous systems
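The (A, D, H) labelling lends itself to a small enumeration. The sketch below is illustrative only; the dictionary names and the helper functions describe and design_space are assumptions introduced here. It lists every point of the design space that the three dimensions span (3 × 3 × 2 = 18 combinations); as the slides note, not all of these points are meaningful architectures.

```python
from itertools import product

# Alternatives along each dimension, as listed on the slide.
AUTONOMY = {0: "tight integration", 1: "semiautonomous systems", 2: "total isolation"}
DISTRIBUTION = {0: "no distribution", 1: "client / server systems", 2: "peer-to-peer systems"}
HETEROGENEITY = {0: "homogeneous systems", 1: "heterogeneous systems"}


def describe(a, d, h):
    """Spell out one (Aa, Dd, Hh) point of the design space."""
    return f"(A{a}, D{d}, H{h}): {AUTONOMY[a]}, {DISTRIBUTION[d]}, {HETEROGENEITY[h]}"


def design_space():
    """Return every (a, d, h) combination; not all of them are meaningful systems."""
    return list(product(AUTONOMY, DISTRIBUTION, HETEROGENEITY))


if __name__ == "__main__":
    for point in design_space():
        print(describe(*point))
```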

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs
[Figure: DBMS implementation alternatives]

Alternatives in Distributed Database Systems
[Figure: design space with three axes (Distribution, Autonomy, Heterogeneity). Points on the Distribution axis: client/server, peer-to-peer. Labeled systems: Distributed DBMS, Distributed multi-DBMS, Multi-DBMS, Federated DBMS.]

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
• Figure 4.3 identifies the two alternative architectures that are the focus of this book:
– (A0, D2, H0)
– (A2, D2, H1)
• Not all the architectures that are identified by this design space are meaningful.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A0, D0, H0) If there is no distribution or heterogeneity, the system is a set of multiple DBMSs that are logically integrated. Such systems can be given the generic name composite systems. There are not many examples of such systems, but they may be suitable for shared-everything multiprocessor systems.
(A0, D0, H1) If heterogeneity is introduced, one has multiple data managers that are heterogeneous but provide an integrated view to the user.
(A0, D1, H0) The more interesting case is where the database is distributed even though an integrated view of the data is provided to users (client / server distribution). This case was mentioned earlier and is discussed further later.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A0, D2, H0) The same type of transparency is provided to the user in a fully distributed environment. There is no distinction among clients and servers, each site providing identical functionality.
(A1, D0, H0) These are semiautonomous systems, which are commonly termed federated DBMS. The component systems in a federated environment have significant autonomy in their execution, but their participation in the federation indicates that they are willing to cooperate with others in executing user requests that access multiple databases. An example may be multiple installations of a DBMS.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A1, D0, H1) These are systems that introduce heterogeneity as well as autonomy, what we might call a heterogeneous federated DBMS.
(A1, D1, H1) Systems of this type introduce distribution by placing component systems on different machines. They may be referred to as distributed, heterogeneous federated DBMS.
(A2, D0, H0) Now we have full autonomy. These are multidatabase systems (MDBS). The components have no concept of cooperation. Without heterogeneity and distribution, an MDBS is an interconnected collection of autonomous databases.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A2, D0, H1) This case is realistic, maybe even more so than (A1, D0, H1), in that we always want to build applications which access data from multiple storage systems with different characteristics.
(A2, D1, H1) and (A2, D2, H1) These two cases are considered together because of the similarity of the problem. They both represent the case where the component databases that make up the MDBS are distributed over a number of sites; we call this the distributed MDBS.

Datalogical Distributed DBMS Architecture
    ES1    ES2    ...   ESn
              GCS
    LCS1   LCS2   ...   LCSn
    LIS1   LIS2   ...   LISn
ES: External Schema
GCS: Global Conceptual Schema
LCS: Local Conceptual Schema
LIS: Local Internal Schema

Datalogical Multi-DBMS Architecture
    GES1   GES2   ...   GESn
    LES11 … LES1n      GCS      LESn1 … LESnm
    LCS1   LCS2   …   LCSn
    LIS1   LIS2   …   LISn
• GES: Global External Schema
• LES: Local External Schema
• LCS: Local Conceptual Schema
• LIS: Local Internal Schema

Distributed DBMS
• Distributed database requires distributed DBMS
• Functions of a distributed DBMS:
– Locate data with a distributed data dictionary
– Determine location from which to retrieve data and process query components
– DBMS translation between nodes with different local DBMSs (using middleware)
– Data consistency (via multiphase commit protocols)
– Global primary key control
– Scalability
– Security, concurrency, query optimization, failure recovery
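To make the "multiphase commit" item concrete, here is a minimal sketch of two-phase commit, one common multiphase commit protocol. The slides do not say which protocol a given product uses, so this is an illustration rather than a description of any particular system; the class Participant and the function two_phase_commit are hypothetical names, and real protocols also handle timeouts, logging and recovery, which are omitted here.

```python
class Participant:
    """A site taking part in a distributed transaction (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: the vote is fixed in advance
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (ready) or no (abort).
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return "committed"
    for p in participants:                       # phase 2: global abort
        p.abort()
    return "aborted"
```

The point of the second phase is that all sites end in the same state, so the distributed database stays consistent whether the transaction commits or aborts.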

Figure 13-10 – Distributed DBMS architecture

Local Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is local
3. Distributed DBMS sends request to local DBMS
4. Local DBMS processes request
5. Local DBMS sends results to application

Figure 13-10: Distributed DBMS architecture (cont.) (showing local transaction steps)
Local transaction – all data stored locally

Global Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is remote
3. Distributed DBMS routes request to remote site
4. Distributed DBMS at remote site translates request for its local DBMS if necessary, and sends request to local DBMS
5. Local DBMS at remote site processes request
6. Local DBMS at remote site sends results to distributed DBMS at remote site
7. Remote distributed DBMS sends results back to originating site
8. Distributed DBMS at originating site sends results to application
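The difference between the local and the global transaction paths can be sketched in a few lines. This is a simplified model, not a real DBMS interface: the classes Site and DistributedDBMS, the repository dictionary and the demo data are all assumptions made for illustration, and the translation step between different local DBMSs is left out.

```python
class Site:
    """A local DBMS holding some tables (hypothetical in-memory model)."""

    def __init__(self, name, tables):
        self.name = name
        self.tables = tables  # table name -> list of rows

    def local_query(self, table):
        return list(self.tables[table])


class DistributedDBMS:
    """Distributed DBMS layer at one site."""

    def __init__(self, site, repository, peers):
        self.site = site
        self.repository = repository  # distributed data repository: table -> site name
        self.peers = peers            # site name -> DistributedDBMS at that site

    def request(self, table):
        location = self.repository[table]          # check repository for location
        if location == self.site.name:
            return "local", self.site.local_query(table)
        remote = self.peers[location]              # route request to remote site
        rows = remote.site.local_query(table)      # remote local DBMS processes it
        return "global", rows                      # results return via originating site


def demo():
    repository = {"EMP": "site_a", "PROJ": "site_b"}
    peers = {}
    a = DistributedDBMS(Site("site_a", {"EMP": [("E1",)]}), repository, peers)
    b = DistributedDBMS(Site("site_b", {"PROJ": [("P1",)]}), repository, peers)
    peers["site_a"], peers["site_b"] = a, b
    return a.request("EMP"), a.request("PROJ")
```

In demo(), the request for EMP issued at site_a is served locally, while the request for PROJ is routed to site_b; the application sees the same kind of result in both cases.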

Figure 13-10: Distributed DBMS architecture (cont.) (showing global transaction steps)
Global transaction – some data is at remote site(s)

DISTRIBUTED DBMS ARCHITECTURE
• Client / server systems - (Ax, D1, Hy)
• Distributed databases - (A0, D2, H0)
• Multidatabase systems - (A2, Dx, Hy)

The Client/Server Database Environment

Client/Server Systems
• Networked computing model
• Processes distributed between clients and servers
• Client – Workstation (usually a PC) that requests and uses a service
• Server – Computer (PC/mini/mainframe) that provides a service
• For DBMS, server is a database server

Application Logic in C/S Systems
• Presentation Logic (GUI interface)
– Input – keyboard/mouse
– Output – monitor/printer
• Processing Logic (procedures, functions, programs)
– I/O processing
– Business rules
– Data management
• Storage Logic (DBMS activities)
– Data storage/retrieval

Client/Server Architectures
• File Server Architecture
• Database Server Architecture
• Three-tier Architecture

File Server Architecture
• All processing is done at the PC that requested the data
• Entire files are transferred from the server to the client for processing.
• Problems:
– Huge amount of data transfer on the network
– Each client must contain full DBMS
  • Heavy resource demand on clients
  • Client DBMSs must recognize shared locks, integrity checks, etc.

File Server Architecture
[Figure: file server architecture]

Database Server Architectures
• 2-tiered approach
• Client is responsible for
– I/O processing logic
– Some business rules logic
• Server performs all data storage and access processing → DBMS is only on server
• Advantages
– Clients do not have to be as powerful
– Greatly reduces data traffic on the network
– Improved data integrity since it is all processed centrally
– Stored procedures → some business rules done on server
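The network-traffic argument can be illustrated with a toy comparison. In this sketch the "network" is simply a count of rows that leave the server; the ROWS data set and the function names file_server_fetch and database_server_fetch are made up for the example and do not model any real product.

```python
# Hypothetical server-side data: 100 employee rows, 10 of them engineers.
ROWS = [{"eno": f"E{i}", "title": "Engineer" if i % 10 == 0 else "Analyst"}
        for i in range(100)]


def file_server_fetch(predicate):
    """File server: the whole file is shipped, the client does the filtering."""
    transferred = list(ROWS)                      # entire file crosses the network
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def database_server_fetch(predicate):
    """Database server: the server filters, only matching rows are shipped."""
    result = [row for row in ROWS if predicate(row)]
    return result, len(result)


def is_engineer(row):
    return row["title"] == "Engineer"
```

Both calls return the same ten engineers, but file_server_fetch(is_engineer) reports 100 rows transferred while database_server_fetch(is_engineer) reports 10, which is the reduction in data traffic claimed for the database server approach.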

Advantages of Stored Procedures
• Compiled SQL statements
• Reduced network traffic
• Improved security
• Improved data integrity

Database server architecture
[Figure: database server architecture; DBMS only on server]

Three-Tier Architectures
• Three layers:
– Client: GUI interface (I/O processing), e.g. browser
– Application server: business rules, e.g. Web server
– Database server: data storage, DBMS
• Thin client: PC just for user interface and a little application processing. Limited or no data storage (sometimes no hard drive)

Three-tier architecture
[Figure: three-tier architecture; thinnest clients, business rules on separate server, DBMS only on DB server]

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
• This provides a two-level architecture which makes it easier to manage the complexity of modern DBMSs and the complexity of distribution.
• The server does most of the data management work (query processing and optimization, transaction management, storage management).
• The client is the application and the user interface (managing the data that is cached to the client, managing the transaction locks).

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
• This architecture is quite common in relational systems where the communication between the clients and the server(s) is at the level of SQL statements.

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
[Figure: client / server reference architecture]

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
Multiple client - single server
From a data management perspective, this is not much different from centralized databases since the database is stored on only one machine (the server) which also hosts the software to manage it. However, there are some differences from centralized systems in the way transactions are executed and caches are managed.
Multiple client - multiple server
In this case, two alternative management strategies are possible: either each client manages its own connection to the appropriate server or each client knows of only its "home server" which then communicates with other servers as required.
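The two multiple-server strategies differ in who knows where the data lives. The following sketch contrasts them under simplifying assumptions: the Server class, the directory dictionary and the function names direct_strategy and home_server_strategy are hypothetical, and each call returns the data together with the list of servers that were involved.

```python
class Server:
    """A database server that can forward requests to peer servers."""

    def __init__(self, name, data):
        self.name = name
        self.data = data   # relation name -> contents held on this server
        self.peers = []    # other servers this server can contact

    def handle(self, key):
        if key in self.data:
            return self.data[key], [self.name]
        for peer in self.peers:  # server-to-server communication
            if key in peer.data:
                return peer.data[key], [self.name, peer.name]
        raise KeyError(key)


def direct_strategy(servers, directory, key):
    """Strategy 1: the client keeps a directory and contacts the right server itself."""
    return servers[directory[key]].handle(key)


def home_server_strategy(home, key):
    """Strategy 2: the client only knows its home server, which forwards as required."""
    return home.handle(key)


def demo():
    s1 = Server("s1", {"EMP": "emp rows"})
    s2 = Server("s2", {"PROJ": "proj rows"})
    s1.peers.append(s2)
    s2.peers.append(s1)
    servers = {"s1": s1, "s2": s2}
    directory = {"EMP": "s1", "PROJ": "s2"}
    return (direct_strategy(servers, directory, "PROJ"),
            home_server_strategy(s1, "PROJ"))
```

With the first strategy the client carries the directory and the request touches only one server; with the second the client stays simpler, at the cost of an extra server-to-server hop when the data is not on the home server.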

Multiple Clients/Single Server
[Figure: several clients, each with Applications, Client Services and Communications layers, connected over a LAN to one server with Communications and DBMS Services layers on top of the Database. High-level requests flow from clients to the server; filtered data only flows back.]

Task Distribution
[Figure: client side with Application, QL Interface … Programmatic Interface, and Communications Manager; server side with Communications Manager, Query Optimizer, Lock Manager, Storage Manager, Page & Cache Manager, and Database. SQL queries flow from client to server; result tables flow back.]

Advantages of Client-Server Architectures
• More efficient division of labor
• Better price/performance on client machines
• Ability to use familiar tools on client machines
• Client access to remote data (via standards)
• Full DBMS functionality provided to client workstations
• Overall better system price/performance

Problems With Multiple Client/Single Server
• Server forms bottleneck
• Server forms single point of failure
• Database scaling difficult

Multiple Clients/Multiple Servers
• directory
• caching
• query decomposition
• commit protocols
[Figure: client with Applications, Client Services and Communications layers, connected over a LAN to two servers, each with Communications and DBMS Services layers and its own Database]

Server-to-Server
• SQL interface
• programmatic interface
• other application support environments
[Figure: client with Applications, Client Services and Communications layers, connected over a LAN to two servers, each with Communications and DBMS Services layers and its own Database]

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
• The physical data organization on each machine may be different.
• Local internal schema (LIS) - is an individual internal schema definition at each site.
• Global conceptual schema (GCS) - describes the enterprise view of the data.
• Local conceptual schema (LCS) - describes the logical organization of data at each site.
• External schemas (ESs) - support user applications and user access to the database.

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
[Figure: peer-to-peer distributed DBMS architecture]

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
In this case, the ANSI/SPARC model is extended by the addition of a global directory / dictionary (GD/D) to permit the required global mappings. The local mappings are still performed by the local directory / dictionary (LD/D). The local database management components are integrated by means of global DBMS functions. Local conceptual schemas are mappings of the global schema onto each site.

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
[Figure: peer-to-peer distributed DBMS architecture (cont.)]

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
The detailed components of a distributed DBMS.
Two major components:
– user processor
– data processor

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
User processor
• user interface handler - is responsible for interpreting user commands as they come in, and formatting the result data as it is sent to the user.
• semantic data controller - uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check if the user query can be processed.
• global query optimizer and decomposer - determines an execution strategy to minimize a cost function, and translates the global queries into local ones using the global and local conceptual schemas as well as the global directory.
• distributed execution monitor - coordinates the distributed execution of the user request.

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
Data processor
• local query optimizer - is responsible for choosing the best access path to access any data item.
• local recovery manager - is responsible for making sure that the local database remains consistent even when failures occur.
• run-time support processor - physically accesses the database according to the physical commands in the schedule generated by the query optimizer. This is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.

Peer-to-Peer Component Architecture
[Figure: USER PROCESSOR containing User Interface Handler, Semantic Data Controller, Global Query Optimizer and Global Execution Monitor, drawing on the External Schema, Global Conceptual Schema and GD/D; DATA PROCESSOR containing Local Query Processor, Local Recovery Manager and Runtime Support Processor, drawing on the Local Conceptual Schema, Local Internal Schema and System Log, on top of the Database. User requests enter and system responses leave through the user processor.]
