following advantages:
• Transparency
• Higher reliability
• Improved performance
• Easier system expansion

Transparency
• A distributed system should be perceived by users and application programmers as a single entity rather than as a collection of cooperating autonomous systems. Users should not need to know where services are located, and the transfer from a local machine to a remote one should also be transparent.

Types of Transparency
• Access Transparency
• Location Transparency
• Concurrency Transparency
• Replication Transparency
• Failure Transparency
• Migration (Mobility) Transparency
• Performance Transparency
• Scaling Transparency

Access Transparency
• Clients should be unaware of how the files are distributed. The files may reside on an entirely different set of servers that are physically far apart, and a single set of operations should be provided for accessing remote and local files alike.
• Examples: Network File System (NFS), SQL queries, and navigation of the web.
• As an example of a lack of access transparency, consider a distributed system that does not allow you to access files on a remote computer unless you use the ftp program to do so.

Location Transparency
• Enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).
• Files or groups of files can be relocated without changing their path names.

Concurrency Transparency
• Users and applications should be able to access shared data or objects simultaneously without interfering with each other. This requires very complex mechanisms in a distributed system.

Replication Transparency
• This kind of transparency matters mainly for distributed file systems, which replicate data at two or more sites for greater reliability. The client generally should not be aware that replicated copies of the data exist.

Failure Transparency
• Enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
• Distributed systems are more liable to failures because any component may fail, which can lead to degraded service or the total absence of that service.

Migration (Mobility) Transparency
• Allows the movement of resources (i.e. information or processes) and clients within a system without affecting the operation of users or programs.

Performance Transparency
• Allows the system to be reconfigured to improve performance as loads vary.

Scaling Transparency
• Allows the system and applications to expand in scale without change to the system structure or the application algorithms.
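The access, location, replication and failure transparencies described above can be illustrated together with a small in-memory sketch. Everything here is hypothetical and chosen only for illustration (the `Replica` and `TransparentFiles` classes, the site names, the file path); it is not how NFS or any real system is implemented. The client calls one `read()` operation on a logical path and never learns which site served the request or that one copy was unavailable.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot serve a request."""


class Replica:
    """One site holding a copy of some files (hypothetical, in-memory)."""

    def __init__(self, site, files, up=True):
        self.site = site
        self.files = dict(files)
        self.up = up

    def read(self, path):
        if not self.up:
            raise ReplicaUnavailable(self.site)
        return self.files[path]


class TransparentFiles:
    """Client-facing facade: a single read() for every file, local or remote."""

    def __init__(self):
        self._catalog = {}  # logical path -> list of replicas holding it

    def place(self, path, replicas):
        # Relocating or re-replicating a file only changes this mapping;
        # the logical path seen by clients stays the same.
        self._catalog[path] = list(replicas)

    def read(self, path):
        last_error = None
        for replica in self._catalog[path]:
            try:
                return replica.read(path)  # first reachable copy wins
            except ReplicaUnavailable as err:
                last_error = err  # hide the fault and try the next copy
        raise RuntimeError(f"no replica available for {path}") from last_error


if __name__ == "__main__":
    site_a = Replica("site-a", {"/docs/report.txt": "Q3 figures"}, up=False)
    site_b = Replica("site-b", {"/docs/report.txt": "Q3 figures"})
    fs = TransparentFiles()
    fs.place("/docs/report.txt", [site_a, site_b])
    print(fs.read("/docs/report.txt"))  # prints "Q3 figures"
```

In this sketch the failure of `site-a` is concealed because a second copy exists; when every replica is down, the error does surface to the caller, which is the degraded or absent service mentioned under failure transparency.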
Higher reliability
• Replication of components
• No single point of failure: for example, a broken communication link or processing element does not bring down the entire system
• Distributed transaction processing guarantees the consistency of the database under concurrent access
• Distributed systems provide a high degree of availability in the face of hardware faults. The availability of a system is a measure of the proportion of time that it is available for use. When one of the components in a distributed system fails, only the work that was using the failed component is affected. A user may move to another computer if the one they were using fails; a server process can be started on another computer.

Improved performance
• Database replication (i.e. storing database objects in multiple databases) increases availability and performance
• Reduced remote access delays

Easier system expansion
• The issue is database scaling
• Emergence of microprocessor and workstation technologies
– A network of workstations is much cheaper than a single mainframe computer
• Data communication cost versus telecommunication cost
• Increasing database size

DDBMS Issues
• Complexity
• Cost
• Security
• Integrity control more difficult
• Lack of standards
• Lack of experience
• Database design more complex

Technical Problems
Distributed database design
– How to fragment the data?
– Partitioned data vs. replicated data?
Distributed query processing
– Design algorithms that analyze queries and convert them into a series of data manipulation operations
– Distribution of data, communication costs, etc. have to be considered
– Find optimal query plans
Distributed concurrency control
– Synchronization of concurrent accesses such that the integrity of the DB is maintained
– Integrity of multiple copies of (parts of) the DB has to be considered (mutual consistency)
Reliability
– How to make the system resilient to failures
– Atomicity and durability
Heterogeneous databases
– If there is no homogeneity among the DBs at various sites, either in terms of the way data is logically structured (data model) or in terms of the access mechanisms (language), it becomes necessary to provide translation mechanisms
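Returning to the design question of how to fragment the data, the following is a minimal sketch of horizontal fragmentation for the partitioned (non-replicated) case. The relation, attribute names and helper functions are all hypothetical. Each row is assigned to exactly one site according to the value of a fragmentation attribute, the global relation can be rebuilt as the union of the fragments, and a selection on the fragmentation attribute needs to contact only one site, which is the kind of communication saving a distributed query processor looks for.

```python
# Hypothetical global relation: one dict per row.
EMPLOYEES = [
    {"id": 1, "name": "Ana", "city": "Lisbon"},
    {"id": 2, "name": "Bo", "city": "Oslo"},
    {"id": 3, "name": "Cy", "city": "Lisbon"},
]


def fragment_by(rows, attribute):
    """Horizontally fragment rows: one fragment (site) per attribute value."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the global relation as the union of all fragments."""
    rows = [row for frag in fragments.values() for row in frag]
    return sorted(rows, key=lambda r: r["id"])


def select_equal(fragments, attribute, fragment_attribute, value):
    """Answer `attribute = value` and report which sites were contacted."""
    if attribute == fragment_attribute:
        # The predicate matches the fragmentation rule: one site suffices.
        sites = [value] if value in fragments else []
    else:
        # Otherwise every fragment has to be scanned.
        sites = list(fragments)
    result = [r for s in sites for r in fragments[s] if r[attribute] == value]
    return result, sites


if __name__ == "__main__":
    frags = fragment_by(EMPLOYEES, "city")
    print(reconstruct(frags) == EMPLOYEES)               # prints True
    print(select_equal(frags, "city", "city", "Oslo")[1])  # prints ['Oslo']
    print(select_equal(frags, "name", "city", "Cy")[1])    # prints ['Lisbon', 'Oslo']
```

Replicated data would instead place the same rows at several sites, trading the single-site lookup shown here for the mutual consistency problem listed under distributed concurrency control.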