This section introduces the architecture of different distributed systems, such as client/server systems and peer-to-peer distributed systems. Owing to the diversity of distributed systems, it is very difficult to generalize the architecture of distributed DBMSs. Different alternative architectures of distributed database systems, and the advantages and disadvantages of each, are discussed in detail.

The architecture of a system reflects the structure of the underlying system. It defines the different components of the system, the functions of these components, and the overall interactions and relationships between them.

Client/Server System

The client/server system is a versatile, message-based and modular infrastructure that is intended to improve usability, flexibility, interoperability and scalability as compared to centralized, mainframe, time-sharing computing.

A client is an individual user's computer or a user application that does a certain amount of processing on its own and sends requests to, and receives responses from, one or more servers for other processing and/or data.

A server consists of one or more computers or an application program that receives and processes requests from one or more client machines. A server is typically designed with some redundancy, for example in power.

Advantages of Client/Server System

A client/server system provides a number of advantages. The major advantage is that it improves usability, flexibility, interoperability and scalability as compared to centralized, time-sharing, mainframe computing. A client/server system can distribute the computing workload between client workstations and shared servers. It also allows the end user to use a microcomputer's graphical user interface, thereby improving functionality and simplicity.
It provides better performance at a reduced cost.

Disadvantages of Client/Server System

The client/server architecture creates a more complex environment in which it is often difficult to manage different platforms (LANs, operating systems, DBMSs, etc.). In a client/server system, the operating system software is distributed over many machines rather than residing on a single system, which increases complexity. A client/server system may suffer from security problems as the number of users and processing sites increases. The workstations are geographically distributed, and each is administered and controlled by an individual department, which adds further complexity. Furthermore, a communication cost is incurred with each processing request.

Architecture of Client/Server Distributed Systems

The client/server architecture is based on hardware and software components that interact to form a distributed system. In a client/server distributed database system, the entire data can be viewed as a single logical database, while at the physical level the data may be distributed. From the data organizational view, the architecture of a client/server distributed database system concentrates mainly on the software components of the system. It includes three main components: clients, servers and communications middleware.

Figure: Client/Server Distributed Systems Architecture

A client is an individual computer, process or user application that requests services from the server. A client is also known as a front-end application, as the end user usually interacts with the client process. The software components required in the client machine are the client operating system, the client DBMS and the client graphical user interface. The client process runs on an operating system that has at least some multi-tasking capabilities. The end users interact with the client process via a graphical user interface.
In addition, a client DBMS is required at the client side to manage the data that is cached in the client. In some client/server architectures, communication software is embedded into the client machine, as a substitute for communication middleware, to interact efficiently with other machines in the network.

A server consists of one or more computers, or is a computer process or application, that provides services to clients. A server is also known as a back-end application, as the server process provides the background services for the client processes. A server provides most of the data management services to the clients, such as query processing and optimization, transaction management, recovery management, storage management and integrity maintenance. Sometimes communication software is embedded into the server machine, instead of communication middleware.

Communication middleware is any process or set of processes through which clients and servers communicate with each other. It is usually associated with a network that controls data and information transmission between clients and servers. Communication middleware software consists of three main components: the application program interface (API), the database translator and the network translator. The API is public to client applications, which use it to communicate with the communication middleware.

Architectural Alternatives for Client/Server Systems

A client/server system can have several architectural alternatives, known as two-tier, three-tier and multi-tier (or n-tier).

Three-tier architecture. Some client/server networks consist of three different kinds of nodes: clients; application servers, which process data for the clients; and database servers, which store data for the application servers.
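The three-tier arrangement described above can be illustrated with a minimal, in-process sketch. It is not a networked implementation: the three tiers are plain Python objects, and the class names (`DatabaseServer`, `ApplicationServer`, `Client`), the `employee` table and its rows are all hypothetical, chosen only to show which tier does which job.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseServer:
    """Back-end tier: stores data for the application server."""
    rows: dict[str, list[dict]] = field(default_factory=dict)

    def select(self, table: str) -> list[dict]:
        # Return copies so callers cannot mutate the stored rows.
        return [dict(row) for row in self.rows.get(table, [])]


@dataclass
class ApplicationServer:
    """Middle tier: processes data on behalf of clients."""
    db: DatabaseServer

    def total_salary(self, dept: str) -> int:
        employees = self.db.select("employee")
        return sum(e["salary"] for e in employees if e["dept"] == dept)


@dataclass
class Client:
    """Front-end tier: issues requests and presents the results."""
    app: ApplicationServer

    def show_total(self, dept: str) -> str:
        return f"{dept}: {self.app.total_salary(dept)}"


# Hypothetical sample data held by the database server.
db = DatabaseServer({"employee": [
    {"name": "Ann", "dept": "sales", "salary": 100},
    {"name": "Bob", "dept": "sales", "salary": 150},
    {"name": "Cy", "dept": "hr", "salary": 90},
]})
client = Client(ApplicationServer(db))
print(client.show_total("sales"))  # prints "sales: 250"
```

In this sketch the client never touches the stored rows directly; it only sees what the application server computes, which mirrors the division of labour between the three kinds of nodes.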
Peer-to-Peer Distributed System

The peer-to-peer architecture structures a distributed system so that it consists of many identical software processes or modules, each running on a different computer or node. The software modules stored at different sites communicate with each other to complete the processing required for the execution of distributed applications. Peer-to-peer architecture provides both client and server functionality on each computer. Therefore, each node can access services from other nodes as well as provide services to other nodes.

Reference Architecture of Distributed DBMSs

The reference architecture of a distributed DBMS consists of a set of global external schemas; a global conceptual schema (GCS); a fragmentation schema and an allocation schema; and a set of schemas for each local DBMS, conforming to the ANSI–SPARC three-level architecture.

Global external schema. In a distributed system, user applications and user accesses to the distributed database are represented by a number of global external schemas. This is the topmost level in the reference architecture of a distributed DBMS.
Global conceptual schema. The GCS represents the logical description of the entire database as if it were not distributed. This level corresponds to the conceptual level of the ANSI–SPARC architecture of a centralized DBMS. It contains the definitions of all entities, the relationships among entities, and the security and integrity information for the whole database stored at all sites in the distributed system.
Fragmentation schema and allocation schema. In a distributed database, the data can be split into a number of non-overlapping portions, called fragments. There are several different ways to perform this fragmentation. The fragmentation schema describes how the data is to be logically partitioned in a distributed database.
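As a rough illustration of these two schemas, the sketch below partitions a small global relation into non-overlapping horizontal fragments and then places each fragment at a site. Several things here are assumptions made for the example: the relation `EMPLOYEE`, the fragment and site names, and the use of Python predicates to express the fragmentation schema. The allocation schema is taken in its usual sense of a mapping from fragments to the sites that store them.

```python
# Hypothetical global relation, as described by the global conceptual schema.
EMPLOYEE = [
    {"id": 1, "name": "Ann", "city": "Delhi"},
    {"id": 2, "name": "Bob", "city": "Mumbai"},
    {"id": 3, "name": "Cy", "city": "Delhi"},
    {"id": 4, "name": "Dee", "city": "Pune"},
]

# Fragmentation schema (assumed form): fragment name -> selection predicate.
# The predicates are complementary, so the fragments cannot overlap.
FRAGMENTATION = {
    "emp_delhi": lambda row: row["city"] == "Delhi",
    "emp_other": lambda row: row["city"] != "Delhi",
}

# Allocation schema (assumed form): fragment name -> site that stores it.
ALLOCATION = {"emp_delhi": "site_1", "emp_other": "site_2"}


def fragment(relation, schema):
    """Split a relation into named horizontal fragments."""
    return {
        name: [row for row in relation if predicate(row)]
        for name, predicate in schema.items()
    }


def place(fragments, allocation):
    """Group the fragments by the site they are allocated to."""
    sites = {}
    for name, rows in fragments.items():
        sites.setdefault(allocation[name], {})[name] = rows
    return sites


def reconstruct(sites):
    """Rebuild the global relation as the union of all stored fragments."""
    rows = [
        row
        for stored in sites.values()
        for fragment_rows in stored.values()
        for row in fragment_rows
    ]
    return sorted(rows, key=lambda row: row["id"])


fragments = fragment(EMPLOYEE, FRAGMENTATION)
sites = place(fragments, ALLOCATION)
print(sorted(sites))  # prints ['site_1', 'site_2']
```

Because every row satisfies exactly one predicate, the union of the fragments gives back the original relation, which is what `reconstruct` checks in practice.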
Multi-Database System (MDBS)

An MDBS is a software system that can be manipulated and accessed through a single manipulation language with a single common data model (i.e., through a single application) in a heterogeneous environment, without interfering with the normal execution of the individual database systems. The MDBS developed from the requirement to manage and retrieve data from multiple databases within a single application while providing complete autonomy to the individual database systems.

Taxonomy of Multi-database Systems

Federated MDBS (FMDBS). An FMDBS is a collection of cooperating database management systems that are autonomous but participate in a federation to allow partial and controlled sharing of their data. In a federated MDBS, all component DBMSs cooperate to allow different degrees of integration. There is no centralized control in a federated architecture, because the component databases control access to their own data. An FMDBS is a cross between a DDBMS and a centralized DBMS: it appears as a distributed system to global users and as a centralized DBMS to local users. Put simply, an MDBS is said to be an FMDBS if users interface to the MDBS through some integrated views, and there is no connection between any two integrated views.

Features of an FMDBS

Integrated schema exists. The FMDBS administrator (MDBA) is responsible for the creation of integrated schemas in a heterogeneous environment.
Component databases are transparent to users. Users are not aware of the multiple component DBMSs in an FMDBS; they only need to understand the integrated schemas to perform operations on the FMDBS. They cannot change the integrated components while operating on the FMDBS.
A common data model (CDM) is required to implement the federation. The CDM must be powerful enough to represent all the data models of the different components. The integration of the export schemas of the component databases is carried out in the CDM.
Update transactions are a difficult issue in an FMDBS. The component databases are completely independent and join the federation through the integrated schema. It is difficult to decide whether the FMDBS or the local component database systems will control the transactions.

Five-Level Schema Architecture of Federated MDBS

Figure: Reference architecture of tightly coupled federated MDBS
Figure: Reference architecture of loosely coupled federated MDBS
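The roles of the export schemas, the common data model and the integrated schema can be sketched with two small in-memory SQLite databases standing in for autonomous component DBMSs. This is only an illustration under stated assumptions: the table names (`staff`, `emp`), their columns, the CDM record shape (`name`, `annual_salary`) and the helper functions are all hypothetical, and the sketch covers read access only, since the text notes that update transactions are the hard part.

```python
import sqlite3


def make_component(ddl, insert_sql, rows):
    """Create an independent in-memory database acting as one component DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two component databases with different local schemas (hypothetical).
db_a = make_component(
    "CREATE TABLE staff (full_name TEXT, monthly_pay INTEGER)",
    "INSERT INTO staff VALUES (?, ?)",
    [("Ann", 1000), ("Bob", 1500)],
)
db_b = make_component(
    "CREATE TABLE emp (ename TEXT, annual_pay INTEGER)",
    "INSERT INTO emp VALUES (?, ?)",
    [("Cy", 24000)],
)


def export_a():
    """Export schema of component A, translated into the assumed CDM."""
    query = "SELECT full_name, monthly_pay * 12 FROM staff"
    return [{"name": n, "annual_salary": s} for n, s in db_a.execute(query)]


def export_b():
    """Export schema of component B, translated into the assumed CDM."""
    query = "SELECT ename, annual_pay FROM emp"
    return [{"name": n, "annual_salary": s} for n, s in db_b.execute(query)]


def integrated_employee_view():
    """Integrated view: global users see one relation, not the components."""
    rows = export_a() + export_b()
    return tuple(sorted(rows, key=lambda row: row["name"]))


for row in integrated_employee_view():
    print(row["name"], row["annual_salary"])
```

A global user calls only `integrated_employee_view` and never learns that the rows come from two databases with different column names and pay units, while each component database remains usable on its own through its local schema.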