INTRODUCTION: In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system. A distributed system consists of a collection of autonomous computers connected through a network and distributed operating system software, which enables the computers to coordinate their activities and to share the resources of the system (hardware, software and data), so that users perceive the system as a single, integrated computing facility.
(Figure 1: Distributed Computing)
These networked computers may be in the same room, on the same campus, in the same country, or on different continents. A distributed system may have a common goal, such as solving a large computational problem. Alternatively, each computer may have its own user with individual needs, and the purpose of the distributed system is then to coordinate the use of shared resources or to provide communication services to the users.
Rise of Distributed Computing: Several trends have driven the rise of distributed computing. Computer hardware prices are falling while computing power is increasing. Network connectivity is increasing and high-bandwidth links ("fat pipes") are common, so it is easy to connect hardware together. A combination of cheap processors is often more cost-effective than one expensive fast system. Distributed systems also offer the flexibility to add capacity according to need, a potential increase in reliability, and the sharing of resources.
Characteristics of Distributed Computing: Six key characteristics are primarily responsible for the usefulness of distributed systems. They are resource sharing, openness, concurrency, scalability, fault tolerance and transparency. It should be emphasized that they are not automatic consequences of distribution; a system must be carefully designed in order to ensure that they are achieved.
Resource Sharing: Resource sharing is the ability to use any hardware, software or data anywhere in the system. Resources in a distributed system, unlike those in a centralized one, are physically encapsulated within one of the computers and can only be accessed from the others by communication. A resource manager offers a communication interface that enables the resource to be accessed, manipulated and updated reliably and consistently. There are two main models for resource managers: the client/server model and the object-based model. The Object Management Group uses the latter in CORBA, in which any resource is treated as an object that encapsulates the resource by means of operations that users can invoke.
Openness: Openness is concerned with extensions and improvements of distributed systems.
New components have to be integrated with existing components so that the added functionality becomes accessible from the distributed system as a whole. Hence, the static and dynamic properties of services provided by components have to be published in detailed interfaces.
Concurrency: Concurrency arises naturally in distributed systems from the separate activities of users, the independence of resources and the location of server processes in separate computers. Components in distributed systems are executed in concurrent processes, and these processes may access the same resource concurrently. The server process that manages a resource must therefore coordinate their actions to ensure system integrity and data integrity.
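The coordination described above can be sketched in Python. This is a minimal illustration, not a prescribed design: the `Account` class and the `run_clients` helper are hypothetical names, and threads stand in for client processes on separate computers.

```python
import threading


class Account:
    """A shared resource whose (hypothetical) server guards it with a lock."""

    def __init__(self, balance: int = 0) -> None:
        self._balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # The lock serialises the read-modify-write so that no update is lost.
        with self._lock:
            current = self._balance
            self._balance = current + amount

    @property
    def balance(self) -> int:
        with self._lock:
            return self._balance


def run_clients(account: Account, clients: int, deposits: int) -> int:
    """Start `clients` threads that each deposit 1 unit `deposits` times."""

    def work() -> None:
        for _ in range(deposits):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance
```

Without the lock, two clients could read the same balance and overwrite each other's update; with it, the final balance always equals the number of deposits made.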
Scalability: Scalability concerns the ease of increasing the scale of the system (e.g. the number of processors) so as to accommodate more users and/or to improve the responsiveness of the system. Ideally, components should not need to be changed when the scale of a system increases.
Fault tolerance: Fault tolerance concerns the reliability of the system: in case of a failure of hardware, software or the network, the system continues to operate properly without significant degradation of performance. It may be achieved by recovery (software) and redundancy (both software and hardware).
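Redundancy can be illustrated with a small failover sketch in Python. All names here (`ReplicaDown`, `call_with_failover`, `failed_replica`, `healthy_replica`) are hypothetical, and plain functions stand in for replicated servers; a real system would also need timeouts and state synchronisation between replicas.

```python
from typing import Callable, Optional, Sequence


class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant replica in turn and return the first successful answer."""
    last_error: Optional[Exception] = None
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as exc:
            # This replica failed; remember why and move on to the next one.
            last_error = exc
    raise RuntimeError("all replicas failed") from last_error


def failed_replica(request: str) -> str:
    raise ReplicaDown("primary is unreachable")


def healthy_replica(request: str) -> str:
    return f"handled {request}"
```

The caller sees a normal reply as long as at least one replica is healthy, which is the sense in which the system "continues to operate properly".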
Transparency: Transparency hides the complexity of a distributed system from users and application programmers. They can perceive the system as a whole rather than as a collection of cooperating components, which reduces the difficulties in design and in operation. This characteristic is orthogonal to the others. There are many aspects of transparency, including access transparency, location transparency, concurrency transparency, replication transparency, failure transparency, migration transparency, performance transparency and scaling transparency.
Distributed Computing Architecture: Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of loosely coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system. Distributed programming typically falls into one of several basic architectures or categories: client-server, 3-tier architecture, N-tier architecture, distributed objects, loose coupling, or tight coupling.
Client-server: Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
3-tier architecture: Three-tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-tier.
N-tier architecture: N-tier typically refers to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
Tightly coupled (clustered): Tightly coupled architecture typically refers to a cluster of machines that work closely together, running a shared process in parallel. The task is subdivided into parts that are processed individually by each machine and then put back together to produce the final result.
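The subdivide-and-recombine pattern can be sketched in Python as follows. This is only an illustration under stated assumptions: worker threads stand in for the machines of a cluster, and `split` and `parallel_sum` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split(data: List[int], parts: int) -> List[List[int]]:
    """Subdivide the task into at most `parts` contiguous chunks."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data: List[int], workers: int = 4) -> int:
    """Sum each chunk on its own worker, then combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)
```

For example, `parallel_sum(list(range(1, 101)))` splits the hundred numbers into four chunks of 25, sums each chunk separately, and adds the four partial sums.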
Peer-to-peer: Peer-to-peer is an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead, all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and as servers.
Space based: Space based refers to an infrastructure that creates the illusion (virtualization) of one single address space. Data are transparently replicated according to application needs. Decoupling in time, space and reference is achieved.
Another basic aspect of distributed computing architecture is the method of communicating and coordinating work among concurrent processes. Through various message passing protocols, processes may communicate directly with one another, typically in a master/slave relationship. Alternatively, a "database-centric" architecture can enable distributed computing to be done without any form of direct interprocess communication, by utilizing a shared database.
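A database-centric arrangement might look like the following Python sketch, in which a producer and a worker never call each other and coordinate only through a shared table. The table layout and the function names are assumptions made for illustration, and an in-memory SQLite database stands in for a shared database server that separate processes would connect to.

```python
import sqlite3
from typing import Optional


def create_store() -> sqlite3.Connection:
    """Create the shared task table (in memory here, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT, result TEXT)"
    )
    return conn


def submit(conn: sqlite3.Connection, payload: str) -> int:
    """Producer side: record a task and return its id."""
    cur = conn.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def work_once(conn: sqlite3.Connection) -> bool:
    """Worker side: process the oldest unfinished task, if there is one."""
    row = conn.execute(
        "SELECT id, payload FROM tasks WHERE result IS NULL ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    task_id, payload = row
    conn.execute(
        "UPDATE tasks SET result = ? WHERE id = ?", (payload.upper(), task_id)
    )
    conn.commit()
    return True


def fetch_result(conn: sqlite3.Connection, task_id: int) -> Optional[str]:
    """Producer side: read the result back; None means not finished yet."""
    row = conn.execute(
        "SELECT result FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else None
```

With several workers, claiming a task would also need to be made atomic (for example inside a transaction) so that two workers do not process the same row.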
Distributed Computing Paradigms:
The Message Passing Paradigm: Message passing is the most fundamental paradigm for distributed applications. A process sends a message representing a request. The message is delivered to a receiver, which processes the request and sends a message in response. In turn, the reply may trigger a further request, which leads to a subsequent reply, and so forth.
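The request and reply cycle can be sketched in Python with two queues acting as message channels. This is a minimal sketch: a thread stands in for the receiving process, and `receiver` and `exchange` are hypothetical names.

```python
import queue
import threading
from typing import List


def receiver(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Process each request message and send a message in response."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more requests will arrive
            break
        outbox.put(f"reply to {message}")


def exchange(requests: List[str]) -> List[str]:
    """Send each request as a message and wait for the matching reply."""
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    worker = threading.Thread(target=receiver, args=(inbox, outbox))
    worker.start()
    replies = []
    for request in requests:
        inbox.put(request)            # the sender's message is delivered
        replies.append(outbox.get())  # block until the reply arrives
    inbox.put(None)
    worker.join()
    return replies
```

The two sides share no data other than the messages themselves, which is what distinguishes this paradigm from coordination through shared memory.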
The Client-Server Paradigm: Perhaps the best known paradigm for network applications, the client-server model assigns asymmetric roles to two collaborating processes. One process, the server, plays the role of a service provider which waits passively for the arrival of requests. The other, the client, issues specific requests to the server and awaits its response. Simple in concept, the client-server model provides an efficient abstraction for the delivery of network services. The operations required include those for a server process to listen for and accept requests, and for a client process to issue requests and accept responses. Assigning asymmetric roles to the two sides simplifies event synchronization: the server process waits for requests, and the client in turn waits for responses. Many Internet services are client-server applications. These services are often known by the protocol that the application implements. Well-known Internet services include HTTP, FTP and DNS.
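The listen, accept, request and response operations can be sketched with Python's standard `socket` module. This is a minimal sketch under stated assumptions: the server handles a single request on the loopback interface, the "echo" protocol is made up for illustration, and `start_server` and `request` are hypothetical names.

```python
import socket
import threading
from typing import Tuple


def start_server() -> Tuple[int, threading.Thread]:
    """Listen on an ephemeral local port and serve one request in the background."""
    listener = socket.create_server(("127.0.0.1", 0))  # binds and listens
    port = listener.getsockname()[1]

    def serve_one() -> None:
        with listener:
            conn, _ = listener.accept()   # wait passively for a client
            with conn:
                data = conn.recv(1024)    # read the request
                conn.sendall(b"echo: " + data)  # send the response

    thread = threading.Thread(target=serve_one, daemon=True)
    thread.start()
    return port, thread


def request(port: int, payload: bytes) -> bytes:
    """Client side: issue one request and wait for the response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(payload)
        return sock.recv(1024)
```

A production server would loop over `accept`, handle clients concurrently, and frame messages explicitly rather than relying on a single `recv` call.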
The Peer-to-Peer Distributed Computing Paradigm: In the peer-to-peer paradigm, the participating processes play equal roles, with equivalent capabilities and responsibilities (hence the term “peer”). Each participant may issue a request to another participant and receive a response. The peer-to-peer paradigm is more appropriate for applications such as instant messaging, peer-to-peer file transfers, video conferencing, and collaborative work. It is also possible for an application to be based on both the client-server model and the peer-to-peer model. A well-known example of a peer-to-peer file transfer service is Napster.com, along with similar sites, which allow files (primarily audio files) to be transmitted among computers on the Internet. Such a service uses a server for the directory in addition to peer-to-peer computing.
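The hybrid arrangement, a central directory plus direct transfers between peers, can be sketched in Python. This is an in-memory illustration only: the `Directory` and `Peer` classes are hypothetical, method calls stand in for network messages, and nothing here reflects the actual protocol of any real service.

```python
from typing import Dict, List


class Directory:
    """Central index (client-server part): maps a file name to its holders."""

    def __init__(self) -> None:
        self._index: Dict[str, List["Peer"]] = {}

    def register(self, peer: "Peer", filename: str) -> None:
        self._index.setdefault(filename, []).append(peer)

    def lookup(self, filename: str) -> List["Peer"]:
        return list(self._index.get(filename, []))


class Peer:
    """A participant that acts both as a client and as a server."""

    def __init__(self, name: str, directory: Directory) -> None:
        self.name = name
        self.directory = directory
        self.files: Dict[str, bytes] = {}

    def share(self, filename: str, content: bytes) -> None:
        self.files[filename] = content
        self.directory.register(self, filename)

    def serve(self, filename: str) -> bytes:
        # Server role: hand the file to another peer.
        return self.files[filename]

    def download(self, filename: str) -> bytes:
        # Client role: ask the directory who has the file, then fetch it
        # directly from that peer (the peer-to-peer part).
        holders = [p for p in self.directory.lookup(filename) if p is not self]
        if not holders:
            raise FileNotFoundError(filename)
        content = holders[0].serve(filename)
        self.share(filename, content)  # the downloader becomes a holder too
        return content
```

Only the lookup goes through the central directory; the file content itself moves from peer to peer.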
Application: There are many examples of commercial applications of distributed systems, such as database management systems, distributed computing using mobile agents, local intranets, the Internet (World Wide Web), Java RMI, etc.
Distributed Computing Using Mobile Agents: Mobile agents can roam a network, using free resources for their own computations.
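A rough Python sketch of the idea follows, assuming that an agent moves by serialising its state, being recreated on the next host, and continuing its computation there. The `Agent` class and the `tour` function are hypothetical, and a dictionary of host names stands in for real machines.

```python
import json
from typing import Dict, List, Optional


class Agent:
    """An agent whose whole state can be packed up and moved."""

    def __init__(self, total: int = 0, visited: Optional[List[str]] = None) -> None:
        self.total = total
        self.visited = list(visited or [])

    def run_on(self, host: str, local_values: List[int]) -> None:
        """Use the host's resources for one step of the computation."""
        self.total += sum(local_values)
        self.visited.append(host)

    def pack(self) -> str:
        """Serialise the agent's state for transport."""
        return json.dumps({"total": self.total, "visited": self.visited})

    @classmethod
    def unpack(cls, blob: str) -> "Agent":
        """Recreate the agent on the receiving host."""
        state = json.loads(blob)
        return cls(state["total"], state["visited"])


def tour(hosts: Dict[str, List[int]]) -> Agent:
    """Move the agent from host to host, carrying its state along."""
    agent = Agent()
    for host, values in hosts.items():
        agent = Agent.unpack(agent.pack())  # simulated migration
        agent.run_on(host, values)
    return agent
```

Real mobile-agent platforms also transport code and deal with security and failure; this sketch only shows state surviving each move.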
Local Intranet: A portion of the Internet that is separately administered and supports internal sharing of resources (file/storage systems and printers) is called a local intranet.
Internet: The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).
Communicating Entities: Communicating entities implement an application for the user with the support of distributed services, which are arranged as layers of support in a client/server style. Java RMI is embedded in the Java language: it is an object variant of the remote procedure call (RPC), it adds naming compared with RPC, and it is restricted to Java environments.
RMI Features: RMI uses a distributed object model with two kinds of objects, normal and remote. The idea is that a remote object exists on another host but can be used like a normal object. Its behavior is described by an interface, and the environment takes care of the remote invocation. Normal and remote objects differ in several ways: remote references can be distributed freely; clients only know and use the interface, not the actual implementation; remote objects are passed by reference, whereas normal objects are passed by copying; and failure handling is more complicated, since the invocation itself can also fail.
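The differences listed above can be illustrated with a small Python analogy. This is not the Java RMI API: `Stub`, `RemoteInvocationError` and `Counter` are hypothetical names, the "remote" object lives in the same process, and deep copying stands in for marshalling across a network.

```python
import copy
from typing import Any, List


class RemoteInvocationError(Exception):
    """The invocation itself failed, independently of the method's own logic."""


class Stub:
    """Client-side reference to an object that is assumed to live on another host."""

    def __init__(self, target: Any, reachable: bool = True) -> None:
        self._target = target
        self.reachable = reachable

    def invoke(self, method: str, *args: Any) -> Any:
        if not self.reachable:
            raise RemoteInvocationError(f"cannot reach host for {method}")
        # Remote objects (stubs) travel by reference; normal objects are copied.
        marshalled = [a if isinstance(a, Stub) else copy.deepcopy(a) for a in args]
        result = getattr(self._target, method)(*marshalled)
        return copy.deepcopy(result)


class Counter:
    """Implementation hidden behind the stub; clients only use its methods."""

    def __init__(self) -> None:
        self.items: List[Any] = []

    def add_all(self, values: List[Any]) -> int:
        values.append("server-side marker")  # changes the server's copy only
        self.items.extend(values)
        return len(self.items)
```

Because normal arguments are copied, the caller's list is unchanged after `invoke("add_all", ...)`, and a caller must be prepared for `RemoteInvocationError` even when the method itself cannot fail.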
Advantages:
Economics: Computers harnessed together give a better price/performance ratio than mainframes.
Speed: A distributed system may have more total computing power than a mainframe.
Inherent distribution of applications: Some applications are inherently distributed, for example an ATM banking application.
Reliability: If one machine crashes, the system as a whole can still survive, provided there are multiple server machines and multiple storage devices (redundancy).
Extensibility and Incremental Growth: It is possible to scale up gradually (in terms of processing power and functionality) by adding more resources (both hardware and software). This can be done without disruption to the rest of the system.
Distributed custodianship: The National Spatial Data Infrastructure (NSDI) calls for a system of partnerships to produce a future national framework for data as a patchwork quilt of information collected at different scales and produced and maintained by different governments and agencies. NSDI will require novel arrangements for framework management, area integration, and data distribution. This research will examine the basic feasibility and likely effects of such distributed custodianship in the context of distributed computing architectures, and will determine the institutional structures that must evolve to support such custodianship.
Data integration: This research will contribute to the integration of geographic information and GISs into the mainstream of future libraries, which are likely to have full digital capacity. The digital libraries of the future will offer services for manipulating and processing data as well as for simple searches and retrieval.
By anticipating the impact that a rapidly advancing technology will have on GISs, this research will allow the GIS community to take better advantage of the opportunities that the technology offers.
Disadvantages: There is a lack of experience in designing and implementing distributed systems, for example in choosing which platform (hardware and OS) or which language to use, although this is changing. If the network underlying a distributed system saturates or goes down, the distributed system is effectively disabled, which negates most of its advantages. Security is a major concern, since easy access to data means easy access to secret data as well.
Conclusions: In this age of optimization, everybody is trying to get the best output from limited resources, and distributed computing offers an efficient way to achieve this. In distributed computing, the actual task is modularized and distributed among various computer systems. This not only increases the efficiency of the task but also reduces the total time required to complete it. A more advanced form of distributed computing, distributed computing through mobile agents, is now setting a new landmark in this technology. A mobile agent is a process that can transport its state from one environment to another, with its data intact, and is capable of performing appropriately in the new environment.