
344 - Introduction - Spring Operating System

Welcome back to the next module of the advanced operating systems course.
Recall that the Cornell experiment that we saw as the last piece of the previous module argues for a
component-based design to reduce the pain points in the development of complex software systems. For
industries that are designing and commercializing production operating systems and distributed services
through the client-server paradigm, there is another important pain point: how to design for the continuous
and incremental evolution of complex distributed software systems, both in terms of functionality and
performance.
The short answer to the puzzle is distributed object technology.
We saw how object technology is employed in the Tornado parallel operating system as a structuring
tool to allow the scalability of operating system services in a parallel system. In this module of the
advanced operating systems course, we are going to see examples of how distributed object technology
is influencing commercial offerings in the computer industry.
We'll start this lesson module with a discussion of the Spring system, which was designed and
implemented at Sun Microsystems as a network operating system for use in a local area network. Later
on, Spring was marketed as part of Sun's Solaris operating system.
Before we discuss the Spring system, a little bit of history and some personal connection. Yousef
Khalidi, one of the chief architects of the Spring system, got his PhD from Georgia Tech in 1989
developing the Clouds distributed operating system, which is an object-based operating system. And he
was my number one PhD student. Incidentally, and not surprisingly, the Spring system was heavily
influenced by Yousef's work with Clouds, and Spring came out commercially as Sun's Solaris MC product.
And for the trivia buffs out there, Yousef is now heading Microsoft's Azure cloud computing product.
By the way, Azure has nothing to do with the Clouds system that Yousef developed as a grad student at
Georgia Tech.
Later on, when we discuss giant scale services and cloud computing, we will feature an interview with
Yousef wherein he shares his thoughts on future evolution of distributed system services.
345 - How to Innovate OS

Now back to our discussion of the Spring system at Sun. There is always a conundrum of how to innovate
in the operating system. Academia is ripe for pursuing ideas that are on the lunatic fringe but, if you are
in industry, you are always worried about whether you should do a brand new operating system or a
better implementation of a known operating system.

Research in industry is usually constrained by the marketplace that it serves, especially if you're a
company like Sun Microsystems, which in its heyday between 1980 and 2005 was making Unix
workstations and building large complex server systems that run 24/7 for applications such as airline
reservations. If you are in that marketplace, market demand says that there are legacy
applications running on your current operating system, and therefore building a brand new
operating system may not be that viable in an industrial setting.
So the approach they took in the Spring system at Sun Microsystems is to be different but innovate
where it makes sense. It is sort of like the "Intel Inside" slogan: in processor architecture, Intel is
dominant, and a lot of interesting computer architecture research happens by innovating under the covers
in the microarchitecture. The external interface is still the well-known Intel processor interface,
but underneath they do a lot of innovation in the microarchitecture.
In a similar manner, if you are a company like Sun Microsystems that peddles Unix boxes and you
want to retain your customer base, then you want to make sure that the external interface remains as
UNIX. But under the covers, you innovate where it makes sense. In particular, you want to make sure
that everything that you do in the OS allows third-party vendors to develop software against the new
APIs that you may provide, integrate that software into the OS, and at the same time make sure that such
integration is not going to break anything. Or said differently, you want to preserve all the things that are
good in a standard operating system, but at the same time you want to make sure that the innovation allows
extensibility, flexibility and so on. That's the approach that the Spring system took: using object orientation
to do innovation under the covers, while keeping the external interface the same.
346 - Object based vs Procedural Design

That brings us to a discussion of procedural design versus an object-based design. You're all familiar,
I'm sure, with procedural design where you're writing your code as one monolithic entity. In a procedural
world, you have shared state and private state in the caller and the callee. And state is now distributed all
over the place. The interface between the caller and the callee is through the normal procedure call
mechanism, i.e. one sub-system may make a procedure call that goes into another subsystem. And this
is how monolithic kernels are built where state is now strewn all over the place.

In an object-based design, an object's state is entirely contained within that object and is not
visible outside. There are methods inside the object that manipulate that state. So in other words,
externally the state is not visible, and the only things that are visible are the methods available for
invocation. Those methods work on the state that is local to the object. So what you get with an
object-based design is strong interfaces and complete isolation of the state of an object from everything else.
Contrast that with the procedural design, where the state can be strewn all over the place, and the
shared state can be manipulated from several different subsystems that are part of a big monolith.
As OS designers, the immediate question is: if we have these strong interfaces, it sounds similar to
border crossing between protection domains. Is it going to cost us? There are ways to make
these border crossings performance conscious as well, as we will see.
Now, where to apply this object orientation? In Spring, they applied object orientation in building the
OS kernel. So the key take-away is, if object orientation is good at the level of implementing a high
performance OS kernel (we have already seen this when we talked about the Tornado system), it should be
good at higher levels of the software, too.
347 - Spring Approach

The Spring approach to building an OS is to build strong interfaces for each subsystem.
• What that means is, the only thing that is exposed outside a sub-system is what services are
provided by the sub-system but not how. In other words, the implementation can be changed at
any time, as long as the external interface remains unchanged.

• They also wanted to make sure that the system is open and flexible. This is important to allow
integrating third party software into your OS.

• At the same time, you want to maintain the integrity of your subsystems, and that's why strong
interfaces are extremely important.

• Being open and flexible also suggests that you don't want everything to be written in one language.
This is the reason that in Spring they chose to use IDL (Interface Definition Language) from the
OMG group.
There are IDL compilers that are available from several third party software vendors, and what that
allows you to do is, you can define your interfaces using IDL. And third party software vendors can use
that IDL definition of the interfaces and use them in building their own subsystems that can be integrated
with the Spring system.
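To make this concrete, here is a minimal sketch of what a subsystem interface in this spirit might look like. Spring interfaces were written in OMG IDL; the sketch below uses an ordinary Java interface with invented names (FileService and its methods) purely to illustrate the point that only the operations are exposed, never the implementation behind them.

```java
// Hypothetical sketch, not actual Spring code: a subsystem interface in the
// spirit of an OMG IDL definition, written here as a Java interface.
// Only the operations are visible to clients; the implementation behind
// this interface can be replaced at any time without affecting them.
public interface FileService {
    int open(String path);                        // returns a small-integer file handle
    byte[] read(int handle, long offset, int n);  // read n bytes starting at offset
    void write(int handle, long offset, byte[] data);
    void close(int handle);
}
```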
The Spring approach's openness and extensibility naturally lead to a microkernel-based approach, and
that's what you see here. This is the structure of the Spring system, and below this red line is Spring's idea
of a microkernel; in fact, there are two parts to it.
There is a nucleus, which in Spring is the entity that provides the abstractions of threads and IPC
among the threads. The kernel itself is made up of the nucleus plus the virtual memory manager. So if
you put these two things together, the nucleus gives you threads and IPC, and the VM manager gives
you memory management. And if you remember back to our good old friend Liedtke's prescription of what
a microkernel should provide, what is below this red line is exactly that: the microkernel provides the
abstractions of threads, IPC among threads, and memory.

All the things that are above the red line are outside the kernel.
In particular, I mentioned that Spring is Sun Microsystems' answer to building a network OS, because
this was a time when a transition was happening to services being provided over the network. They
wanted to go from an OS that runs on a single node to a network OS using the same interface, namely
the Unix interface. So the entity that you see here, which is called the network proxy, allows machines
to be connected to one another.
All the ovals that you see outside the kernel provide different services that you might
need in your desktop environment. For instance, the X11 server is a display manager; you may need the
ability to do shell-level programming; you need a file system; and you need a way by which you can
communicate over the network, meaning that you need a protocol stack.
348 - Nucleus Microkernel of Spring

The nucleus is Spring's microkernel, and it is a subset of Liedtke's prescription in the sense that the
nucleus manages only threads and IPC.

The abstractions available in nucleus are the following.


• The first abstraction is the domain. A domain is similar to a Unix process: it's a container, or an address space,
and threads can execute in a particular domain. These threads are similar in semantics to pthreads.
• The second abstraction, called a door, is a software capability to a domain. Think of it as the real-life analogy
of opening a door in order to get into a room. In a similar manner, if you have a handle to the door,
you can open the door and enter a target domain.
Any domain can create these nucleus entities called doors, which are essentially entry points for
entering the target domain. With object orientation, I told you that the only thing you can do is
make invocations on objects. The available entry points to the objects are represented by this door
abstraction.
Let's say, I'm a file server. What will I do? Well I have entry points in my file server, such as, opening
a file, or reading a file, writing a file, and so on. Basically, I will create those entry points as doors into
my domain.
And if I'm a client, how do I get access to the entry point that's available in the target domain?
• Well, the way I do that is analogous to how you open a file in a Unix file
system. What you do is an fopen(). When you do that, you get a file descriptor, which is a small
integer that is a handle for you to access that file.
• In a similar manner, if I'm a client and if I want to access a particular entry point, then what I want
is an access to this door and the way I get that is by getting a door handle.
• Every domain will have this door table, which is similar to the file descriptors that you may have
in a Unix process. Every door ID that you have in this door table points to a particular door.
• If I have a door handle in my door table for a particular door, what that tells me is that, oh, I have
the ability to make an invocation in the target domain that this particular door corresponds to.
• The possessor of a door handle is able to make object invocations on the target domain using this
door handle.
A particular client domain can have a door table that has access to several different target domains.
And multiple clients may have access to the same door. For instance, if it's a file system, you may be
able to access the file system and I may be able to access the file system, too. The door table is something
that is unique to every domain and it gives that domain an ability to access the entry points in the target
domain.
The way to think about a door is that it's basically a software capability to a domain. Since we are using
object orientation, it is represented by a pointer to a C++ object that represents the target domain. A door
can be passed from domain to domain, and when it is passed it gives the receiving domains the ability
to access the entry point specified by the door into the target domain.
The Spring OS kernel itself is a composition of the Nucleus plus the memory management. That is
inherent in the fact that these domains represent an address space. Now, how do you go about making an
object invocation, i.e. making a protected procedure call into a target domain from a client domain?
Well, the Nucleus is involved in every door call.
• When I make the invocation using the small descriptor that I have, which is a door handle, the
nucleus looks at it and verifies that this domain has the ability to do this invocation.
• Then the nucleus allocates a server thread in the target domain and executes the invocation that
is indicated by this particular door handle.
• It's a protected procedure call, and since it has procedure call semantics, the client thread is
deactivated, and a thread is allocated to the target domain so that it can execute the invocation
for the method that is indicated by this door handle.
• On return from the target domain, once the protected procedure call is completed, the server thread is
deactivated.
• The client thread is reactivated so that the client can continue with whatever it was doing before.
So, this is very similar to the communication mechanism that we discussed in the lightweight RPC
paper before, in the sense that, we're doing very fast cross address space calls using this door mechanism.
This protected procedure call is an illustration of how the nucleus ensures that, even though it uses an
object-based design in structuring the OS kernel, it is still performant, in the sense
that you can do these cross-domain calls very quickly through this idea of deactivating the client thread
and quickly activating a thread to execute the entry point procedure in the target domain, and on return
reactivating the client thread.
All of this results in very fast cross address space calls through this door mechanism. That's how you
make sure that you get all the good attributes of object orientation and not sacrifice on performance at
the same time.
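To make the door-call semantics concrete, here is a minimal sketch in Java with invented names (Domain, Door, DoorProcedure, Nucleus); real Spring doors are nucleus-managed C++ objects, and the thread hand-off described above is modeled here simply as a synchronous call.

```java
// Hypothetical sketch of door tables and door calls; all names are invented.
import java.util.HashMap;
import java.util.Map;

interface DoorProcedure {                 // an entry point exported by a server domain
    Object invoke(Object arg);
}

class Door {                              // software capability to a target domain
    private final DoorProcedure entryPoint;
    Door(DoorProcedure entryPoint) { this.entryPoint = entryPoint; }
    Object call(Object arg) { return entryPoint.invoke(arg); }
}

class Domain {
    // Door table: small-integer handles -> doors, analogous to Unix file descriptors.
    private final Map<Integer, Door> doorTable = new HashMap<>();
    private int nextHandle = 0;

    int addDoor(Door d) { doorTable.put(nextHandle, d); return nextHandle++; }
    Door lookup(int handle) { return doorTable.get(handle); }
}

class Nucleus {
    // A door call: validate the handle, then run the entry point in the target
    // domain while the client thread waits (the deactivate/reactivate hand-off
    // is modeled here as an ordinary synchronous call).
    static Object doorCall(Domain client, int handle, Object arg) {
        Door door = client.lookup(handle);
        if (door == null) throw new IllegalArgumentException("no such door handle");
        return door.call(arg);            // client is blocked until the call returns
    }
}
```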
349 - Object Invocation Across the Network

As I mentioned, Spring is a network operating system and what I described to you just now is how object
invocation works within a single node.
Next we need to be able to do object invocation across the network over different nodes. Object
invocation between client and server across the network is extended using network proxies.
• For example, on the client box there is this Proxy B and on the server box, there is the Proxy A.

• Proxies can potentially be different for connections to different servers. So, this client may talk to
one server using one proxy and may talk to a different server using a completely different proxy.

• In other words, the proxies can potentially employ different protocols. That's where you have
the opportunity to specialize.

• Depending on whether the communication between the client and server is on a LAN or a WAN,
and so on, you can employ the appropriate protocol. This is a key property of building a network
OS at Sun: they wanted to make sure that decisions about the connectivity of a node to other nodes
on the network are not ingrained in the OS of a single node. Depending on the location of the
servers, you can employ different protocols to talk between the proxies that are on the client
machine and the server machine.

• The proxies are invisible to the client and the server. In other words, the client and the servers are
unaware whether they are both on the same machine or on a different machine, and they don't
care.
Let's see how this client-server relationship is established using these proxies.
When a client-server connection has to be made across the network
• The first thing that happens is, you instantiate a proxy on the server node and establish a door for
communication between the Proxy A and the server domain through the nucleus on the server
machine.
• Next, Proxy A is going to export a network handle embedding this Door X to its peer Proxy B
on the client machine.
• The interaction between Proxy A and Proxy B is outside of the nucleus. The network handle has
nothing to do with the primitives or the mechanisms that are available in the nucleus of the Spring
OS.
• On the client machine, Proxy B has a door Y that it has established locally with Nucleus B so that
the client domain can communicate with it.
• Proxy B will use the network handle that has been exported by Proxy A to establish a connection
to Proxy A. The network handle and the communication between the two proxies are not
through the nucleus. That's important for you to understand.
So now, how does the client make an invocation on the server domain?
• When the client wants to make an invocation, it simply accesses Door Y, thinking that it is
accessing the server's domain.
• When this invocation happens, Proxy B will communicate through this network handle that it has
with Proxy A.
• When Proxy A gets this client invocation through Proxy B, Proxy A will know that the invocation
is intended for the server domain.
• Proxy A knows how to access the server domain since it has Door X. So Proxy A uses the door
to make the actual invocation.
So to recap,
• The client wants to invoke through Door X, but it doesn't have a direct handle on Door X because the server
domain is on a different node of the network.
• Therefore, remote invocation is accomplished by passing the server domain's door (the
entry point into the server domain), held by Proxy A, via a network handle to the peer Proxy B on a different
node, in this case the client node.
• Once the network handle is available to Proxy B, it can establish the connection back to Proxy A.
• Once the connection is established, the client domain's invocation call for Door X will be passed
through Door Y to Proxy B. Proxy B then uses the network handle to communicate that invocation
over to Proxy A, which in turn uses the actual Door X to deliver the invocation into the server
domain and execute the client domain's call, as the sketch below illustrates.
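The following sketch, again with invented names and building on the Door and DoorProcedure types from the earlier door sketch, shows the relay described above: the client's call on Door Y lands in Proxy B, is forwarded over the proxies' private connection, and finally becomes a real call on Door X by Proxy A.

```java
// Hypothetical sketch of the proxy relay; the direct reference between the two
// proxies stands in for the network connection, which is outside the nucleus.
class ProxyA {
    private final Door doorX;                      // real entry point into the server domain
    ProxyA(Door doorX) { this.doorX = doorX; }
    Object handleRemoteCall(Object arg) {          // reached via the exported network handle
        return doorX.call(arg);                    // the actual door call on the server node
    }
}

class ProxyB implements DoorProcedure {
    private final ProxyA peer;                     // obtained through Proxy A's network handle
    ProxyB(ProxyA peer) { this.peer = peer; }
    @Override
    public Object invoke(Object arg) {             // the client's call on Door Y lands here
        return peer.handleRemoteCall(arg);         // forwarded "over the network" to Proxy A
    }
}
```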
350 - Secure Object Invocation

It may often be necessary for a server object to provide different privilege levels to different clients.
One of Spring’s goals is to provide secure access to objects, so that object implementations can control
access to particular data or services. To provide security Spring supports two basic mechanisms: Access
Control Lists and software capabilities.
Any object can support an Access Control List (ACL) that defines which users/groups of users are
allowed access to that object. These Access Control Lists can be checked at runtime to determine whether
a given client is really allowed to access a given object. When a given client proves that it is allowed to
access a given object, the object’s server creates an object reference that acts as a software capability.
This object reference uses a nucleus door as part of its representation so that it cannot be forged by a
malicious user. This door points to a front object inside the server. A front object is NOT a Spring object,
but rather whatever the server’s language of implementation defines an object to be.
A front object encapsulates information identifying the principal (e.g., a user) to which the software
capability was issued and the access rights granted to that principal.
A given server may create many different front objects, encapsulating different access rights, all
pointing to the same piece of underlying state.
Later, when the client issues an object invocation on the object reference, the invocation request is
transmitted securely through the nucleus door and delivered to the front object. The front object then
checks that the request is permissible based on the encapsulated access rights, and if so, forwards the
request into the server.
For example, if the client issued an update request, the front object would check that the encapsulated
access included write access. When a client is given an object reference that is acting as a capability they
can pass that object reference on to other clients. These other clients can then use the object reference
freely and will receive all the access that was granted to the original client.
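A minimal sketch of the front-object idea, with invented names and simplified types, might look like the following: the front object records the principal and the rights granted to it, checks every incoming request against those rights, and only then forwards the request to the underlying server state. Many front objects with different rights can point to the same underlying state.

```java
// Hypothetical sketch of a front object; not actual Spring code.
import java.util.EnumSet;
import java.util.Set;

enum Right { READ, WRITE }

class FileState {                          // the underlying object inside the server
    private byte[] data = new byte[0];
    byte[] read() { return data.clone(); }
    void write(byte[] d) { data = d.clone(); }
}

class FrontObject {
    private final String principal;        // the user this capability was issued to
    private final Set<Right> rights;       // the access rights encapsulated in the capability
    private final FileState target;        // shared underlying state

    FrontObject(String principal, Set<Right> rights, FileState target) {
        this.principal = principal;
        this.rights = EnumSet.noneOf(Right.class);
        this.rights.addAll(rights);
        this.target = target;
    }

    byte[] read() {
        if (!rights.contains(Right.READ))
            throw new SecurityException(principal + " has no read access");
        return target.read();
    }

    void update(byte[] d) {                // e.g. the update request mentioned above
        if (!rights.contains(Right.WRITE))
            throw new SecurityException(principal + " has no write access");
        target.write(d);
    }
}
```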
For example, say that user X has a file object foo, which has a restricted access control list specifying
that only X is allowed to read the file.
However X would like to print the file on a print server P. P is not on the ACL for foo, so it would not
normally have access to foo’s data.
However, X can obtain an object reference that will act as a software capability, encapsulating the
read access that X is allowed to foo. X can then pass that object reference on to the print server P and P
will be able to read the file.
The use of software capabilities in Spring makes it easy for application programs to pass objects to
servers in a way that allows the server to actually use the given object.

For example, let's say that the user wants to print a file foo.
• The user, of course, has full access to the file object in the file system, that is, the file that
the user has created. The user holds a reference to the object foo and has full access to it.
• The user wants to print the file but doesn't want to give the printer any more privilege than it needs.
• In particular, if I want to print a file, then all I need to do is give the printer a one-time privilege to the file
object in order to print that file. So what I'm going to do is take the capability that
I've got for this file foo, reduce the privilege level, and say: you've got a reference to the same
object, but it is a one-time reference.
• Now the printer object can access the file system and present its capability, and the front object,
which is associated with the file system, will verify that the one-time ticket the printer holds
is not expended yet, and therefore it is allowed to access this file and print it.
• But if it tries to present the same handle again, it'll be rejected by the front object associated with
the file system, because this is a one-time reference. The capability that is provided by the
user to the printer is a one-time capability.
So we've seen
• object invocation can happen efficiently through the door mechanism and the thread hand-off
mechanism that I mentioned within a single node
• it can happen efficiently across the network through the proxies
• it can also happen securely by the fact that you can associate policies in front objects that govern
access to the objects.
So these are all the mechanisms that are provided in the Spring kernel, and this is where the innovation
happens. In other words, even though the external interface is a Unix operating system, under
the covers the Spring system does all of this innovation in terms of how to structure the operating system
itself using object technology.
351 - Abstractions Question

This question concerns the abstractions that are available in the Nucleus.


Remember Nucleus is the microkernel of Spring. And the question asks, what is the difference between
the primitives, or the abstractions available in Nucleus, and Liedtke's prescription for what a microkernel
should look like?

352 - Abstractions Solution

If you've been with me so far, you know that the nucleus provides only threads and IPC. It doesn't provide
the abstraction of an address space.
Whereas Liedtke's prescription says you should have all three in the microkernel, and in fact the Spring
system does have all three; it is just that in the Spring system they name things differently.
Spring has Nucleus + VMM = kernel, and the idea of a kernel in the Spring system contains all three
entities, even though the nucleus itself doesn't contain the address space abstraction.
353 - Virtual Memory Management in Spring

An address space object represents the virtual address space of a Spring domain while a memory object
is an abstraction of memory that can be mapped into address spaces. An example of a memory object is
a file object. Address space objects are implemented by the VMM.
In the Spring operating system, there is a per-machine virtual memory manager, and the virtual memory
manager is in charge of managing the linear address space of every process.
• The linear address space object of a process is what the architecture gives you, and what the VMM
does is break this linear address space into regions (you can think of a region as a set of pages).
Each region can be of a different size.
• The second abstraction in the Spring VMM is what is called a memory object. The idea of breaking
up this linear address space into regions is to allow these regions to be mapped to different
memory objects.
• For instance, one region may be mapped to a memory object, another region may be mapped to a portion of
a memory object, and two different regions of the same address space may be mapped to the
same memory object.
What are these memory objects? The abstraction of a memory object allows a region of virtual memory
to be associated with a backing file or swap space on the disk. So the memory object is the mechanism
by which portions of the address space can be mapped to different entities, which may be on the disk as
swap space or as files in a file system. It is perfectly possible that multiple memory objects map to the
same backing file.
So the way to think about these abstractions is: the linear address space is broken into regions, regions are
mapped to memory objects, and a memory object is an abstraction for things living on backing store
(meaning a disk, which could hold the swap space or specific files).
Memory objects do not have page-in/page-out operations; paging is handled by external pagers.
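The relationships among these abstractions can be sketched roughly as follows; the names are invented for illustration and do not match Spring's actual class definitions.

```java
// Hypothetical sketch of the Spring VM abstractions: an address space is carved
// into regions, and each region maps (a portion of) a memory object, which is an
// abstraction for data living on backing store (swap space or a file).
import java.util.ArrayList;
import java.util.List;

interface MemoryObject {        // e.g. a file object or swap space on disk
    long length();
}

class Region {
    final long vaStart;         // where the region sits in the linear address space
    final long size;            // regions can be of different sizes
    final MemoryObject backing; // which memory object backs this region
    final long backingOffset;   // which portion of that memory object
    Region(long vaStart, long size, MemoryObject backing, long backingOffset) {
        this.vaStart = vaStart; this.size = size;
        this.backing = backing; this.backingOffset = backingOffset;
    }
}

class AddressSpace {            // implemented by the per-machine VMM
    private final List<Region> regions = new ArrayList<>();
    void map(long vaStart, long size, MemoryObject m, long offset) {
        // Several regions, possibly from different address spaces, may map the
        // same memory object.
        regions.add(new Region(vaStart, size, m, offset));
    }
}
```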
354 - Memory Object Specific Paging

So here is a VMM, and it is responsible for breaking a linear address space into regions and mapping
those regions to specific memory objects (remember, the memory object lives on the disk).
For a particular process living in an address space to access a particular memory object,
this memory object obviously has to be brought into DRAM, and that is what a pager object is going to
do. It is equivalent to the idea of external pagers in other systems, such as Mach.
A pager object is responsible for establishing the connection between virtual memory and physical
memory when a region of the virtual memory is mapped to a memory object. It is the responsibility of
the pager object to make sure that the memory object has a representation in physical memory.
The pager object creates what is called a cached object representation of the memory object in DRAM. Now
the region becomes available for the process to address, because the pager object has mapped the memory object
into DRAM.
Similarly, a different VMM-2 managing a different address space can map another memory
object and create a cached representation in VMM-2's DRAM.
I mentioned that the VMM can make any number of such mappings between regions of the linear
address space and memory objects. For instance, another region of the linear address space may be
mapped to memory-object-2, and there may be a pager object that governs the paging of this object
into a DRAM representation.
In this example, VMM-1 manages two distinct memory objects, memory-object-1 and
memory-object-2, both of which are cached in its DRAM on behalf of a process, and there are two pager
objects, one for each of these memory objects.
So the important point is that there's not a single paging mechanism that needs to be used for all the
memory objects. This gives you an ability to have different regions of the linear address space of a given
process associated with a particular memory object using different pager objects. All of these associations
between regions and memory objects can be dynamically created.
For instance, VMM-2 may decide to associate a region in its linear address space with memory-object-
3. If it does that, a new pager object will manage the association between the region of the
virtual address space that is mapped to memory-object-3 and the cached object representation in the
DRAM of VMM-2.
Now this is an interesting situation, because memory-object-3 is shared by two different
address spaces, managed by VMM-1 and VMM-2. What about the coherence of the cached representations of this
object that exist in VMM-1 and VMM-2? Who manages that?
Well, it's entirely up to the pager objects that we instantiated.
If coherence is needed for the cached representations of memory-object-3 in the DRAM of VMM-1 and
VMM-2, then it is the responsibility of these two pager objects to coordinate it. So it's not something
that Spring OS is responsible for. Spring OS only provides the basic mechanisms through which these
entities can manage the regions they are mapping, the memory objects, and the DRAM
representations of those objects.
The VMM obtains data by invoking a pager object implemented by an external pager, and an external
pager performs coherency actions by invoking a cache object implemented by a VMM.
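A rough sketch of that division of labor, with invented method names, is shown below: the pager object is the external pager's interface for moving data between backing store and DRAM, and the cache object is the VMM's interface through which the pager can perform coherency actions on the DRAM representation.

```java
// Hypothetical sketch of the pager-object / cache-object interaction.
interface PagerObject {                      // implemented by an external pager
    byte[] pageIn(long offset, int length);  // fetch data from the memory object's backing store
    void pageOut(long offset, byte[] data);  // write dirty data back to backing store
}

interface CacheObject {                      // implemented by the VMM: the cached DRAM representation
    void invalidate(long offset, int length);  // coherency action pushed by the external pager
    byte[] flush(long offset, int length);     // hand dirty pages back to the pager
}
```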
So in summary, the way memory management works in the Spring system is as follows:
• The address space managers are responsible for managing the linear address space of a process.
• They carve the address space regions and associate the regions with different memory objects.
• These memory objects may be swap space on the disk, or they could be files that are being mapped
into specific regions of the linear address space.
• It is entirely up to the application what it wants to do with these mappings, but the abstractions are
powerful enough to facilitate whatever the intent of the user may be.
• Mapping the memory objects to the cache representation, which lives in DRAM, is the
responsibility of pager objects.
• And you can have any number of external pagers that manage this mapping.
In particular, through this example I've shown you that you can have, for a single linear address space,
multiple pager objects that are managing different regions of that same address space. And that's the
flexibility and power that's available in the Spring OS.
355 - Spring System Summary

So to summarize the facilities and primitives available in the Spring system:


• Object orientation technology permeates the entire Spring operating system design. It’s used as a
structuring mechanism in constructing a network operating system.
• To break it down, in the Spring system you have the Nucleus which provides you threads and IPC
among threads.
• The microkernel prescription of Liedtke is accomplished by the combination of nucleus, plus the
address space management, which is part of the Spring System's kernel boundary.
• Everything else lives above this kernel, meaning all the services you normally associate with an
operating system such as file system, network communication and so on, were all provided as
objects that live outside of this kernel.
• The way you access those objects is through doors. In every domain there is a door table that has
a set of capabilities that a particular domain has for accessing doors on different domains. This
door and door table is the basis for cross domain calls.
• Through the object orientation, and through the network proxies you can have object invocation
implemented as protected procedure calls both on the same node and across machines.
• Finally, it does virtual memory management by providing certain basic primitives, such as the
linear address space, the memory object, external pagers, and the cached object representation.
Now to contrast this with Tornado: in Tornado we also saw object technology being used, but Tornado
uses clustered objects as an optimization for implementing services, for example, whether a particular
object has a singleton representation or a separate representation for each processor, etc. Those are the
kinds of optimizations that are accomplished using the clustered object in the Tornado system.
Whereas in the Spring system, object technology permeates the entire operating system design in that it
is used as a system structuring mechanism in constructing a network operating system, not just as an
optimization mechanism.
356 - Dynamic Client Server Relationship

Spring is a network operating system, and the clients and the servers can be on the same machine or
on different nodes of a local area network. In the Spring system, they wanted to carry the idea of
extensibility further, to say that the client and the server should be impervious to
where they are in the entire network.
In other words, the client-server interaction should be freed from the physical location of the clients
and the servers. For instance, in this picture, the clients and the servers start out on the same machine. Suppose we've
decided to replicate the servers in order to increase availability. Now we have several copies of the
servers, and the client requests are dynamically routed to different servers for load distribution.
For those of you who are familiar with how services like Google work today, this is exactly what
happens in services that we use on an everyday basis when we access Google. Our client requests are
being routed to different servers and this is the same sort of thing that is happening in the Spring system.
Once you replicate the server, you want the client request to be routed to different servers depending
on the physical proximity of the client to the servers, as well as the load that is currently being handled
by one server versus another.
Another variation of the same theme is where the server is not replicated but cached. For
instance, if it is a web server, there could be a proxy which has cached the original web server's content.
In that case the client request need not go to the origin web server; it can go to the cached copies that
are available. So here again the decision of routing a client request to a particular cached copy of the
server is taken dynamically.
Now, all of this may sound like magic in terms of how this client-server relationship is dynamically
orchestrated: whether the client and server are on the same machine, whether we dynamically decide to replicate the servers
and route requests to different servers, or whether we cache the servers and route client
requests to different cached copies. All of these are dynamic decisions. And how is this done?
Well, that's the part that we're going to see next.
357 - Subcontract

The secret sauce that makes this dynamic relation between the client and the server possible is this
mechanism called subcontract. It's sort of like the real life analogy of off-loading work to a third party.
This mechanism allows control over how object invocation is implemented, over how object
references are transmitted between address spaces, how object references are released, and similar object
runtime operations.
Subcontract is a mechanism to hide the runtime behavior of an object from the actual interface. For
instance, there could be a singleton implementation of the server, or it could be a replicated
implementation of the server. The client does not care about the details of how its IDL interface
is satisfied.
So what that means is, the client side stub generation becomes very simple because all of the detail of
where the server is, how to access the server, whether the server is on the same machine and whether
there are multiple copies of the server, which copy of the server should I go to? All of those details are
in the subcontract mechanism. That makes the client side stub generation very simple.
The subcontract lives under the covers of the IDL contract, and you can change the subcontract at any time.
For instance, just as in real life, if you don't like the work being done by one subcontractor, you give the work to a different
subcontractor. The same sort of thing can happen here: the subcontract is something that you can
discover and install at runtime. In other words, you can dynamically load new subcontracts. For instance,
if a singleton server gets replicated, then you get a new subcontract that corresponds to the replicated
server, so that now you can access the replicated servers using that subcontract. Nothing needs to change
above this line: the client stub doesn't have to do anything differently. All of the details are handled by
the subcontract, seamlessly. So in other words, you can seamlessly add functionality to existing services
using the subcontract mechanism.
358 - Subcontract Interface for Stubs

Now let's look at the interface that's available for the stub that is on the client side and the server side
through the subcontract mechanism.
The first interface is for marshaling and unmarshaling. The client-side stub has to marshal the
arguments from the client. The subcontract will do that for you. Whether the invocation is going to go
to a server on the network or on the same machine, all those details are buried
in the subcontract. Therefore, when the client stub wants to marshal the arguments for a particular
invocation, it just calls the subcontract and says, please marshal these arguments for me. The
subcontract knows the way in which this particular invocation is going to be handled, so it can
do the appropriate thing for marshaling the arguments based on the location of the server.
That's the beauty of the subcontract mechanism, and this is true on the server side as well as on the
client side.
Once the marshaling has been done, the client side can make the invocation. When it makes the
invocation, once again the subcontract says I know exactly where this particular invocation is going to
go to. So it takes care of that.
On the server side, the subcontract provides a different set of mechanisms. It allows the server to revoke
a service, or it allows a server to tell the subcontract that it is open for business by saying it is ready
to process invocation requests.
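A minimal sketch of such an interface, with invented names rather than the actual Spring API, might look like this; the stubs call only these operations, and each subcontract implementation is free to decide how marshaling and invocation are actually carried out (local call, LAN protocol, replicated servers, and so on).

```java
// Hypothetical sketch of subcontract operations visible to the stubs.
interface ClientSubcontract {
    byte[] marshal(Object[] args);           // client stub: package the arguments
    Object unmarshal(byte[] reply);          // client stub: unpack the result
    byte[] invoke(byte[] marshaledArgs);     // carry the call to wherever the server is
}

interface ServerSubcontract {
    void readyToProcessInvocations();        // server: "I'm open for business"
    void revoke();                           // server: withdraw the service
}
```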
So what you see is that, between the client side and the server side, the boundary is right here. The client stub
and the server stub don't have to do anything differently whether the client and the server are on the same
machine or on different machines, whether there are replicas or cached copies; none of those things make a difference in
terms of what the client application plus the client stub has to do.
All of the magic happens down below in the subcontract mechanism.
So to recap, the innovations in the Spring OS,
• It uses object technology as a structuring mechanism in building a network operating system.
• It ensures through the object technology that it is providing strong interfaces.
• It is open, it is flexible, and it is also extensible because it is not a monolithic kernel.
• It has a microkernel, and all the services are provided through these object mechanisms living on
top of the kernel.
• The other nice property is that the clients and the servers don't have to know whether they are co-
located on the same node or they exist on different nodes of the local area network.
• Object invocations across the network are handled through the network proxies.
• The subcontract mechanism allows the client and the servers to dynamically change the
relationship in terms of who they are talking to. You can get new instances of servers instantiated
and advertise that through the subcontract mechanism so that the clients can dynamically bind to
new instances of servers that have been created without changing anything in the client side
application or the client side stub.
So those are all the powers that exist when you decide how to innovate under the covers, which is exactly
what Sun did with the Spring OS.

359 - Conclusion

The journey in this lesson should have given you a good idea of how it is possible to innovate under the
covers. Externally, Sun was still peddling UNIX boxes, but internally they had completely revolutionized
the structure of the network operating system through the use of object technology.
In fact, the subcontract mechanism that Sun invented as part of the Spring system forms the basis for
something that many of you who are Java programmers use a lot, namely Java RMI. In the next
part of this lesson module, we are going to study Java RMI and also Enterprise JavaBeans.
360 - Introduction - Java RMI

In this lesson, we will continue to see examples of how distributed object technology is influencing
commercial offerings in the computer industry. First, we'll discuss Java RMI, which has its roots in the
basic principles of distributed systems that we have been seeing so far. Before we start talking about Java
RMI, let's have a fun quiz to prime the pump.
Reference:
Wollrath, A., Riggs, R., and Waldo, J., "A Distributed Object Model for the Java System," USENIX
Conference on Object-Oriented Technologies and Systems (COOTS), 1996.

361 - Java Language Question

Now this question asks you: what was Java originally invented as a language for?

362 - Java Language Solution

Some of you may not have been born yet, but the Java language was originally invented as a
language for use in embedded devices in the early 90s.
363 - Java History

Java was invented by a gentleman by the name of James Gosling at Sun.


It was originally called Oak, and it was intended for use with PDAs.
Then, when in the early 90s there was a lot of interest in video on demand over the network, Sun thought
that Java might be the right language for programming set-top boxes. Unfortunately, the cable TV
industry went with SGI for the VOD trials, so Oak fell flat at that point and Sun all but gave up on
Oak.
Then the World Wide Web caught on, and Java got a new life with the need for containment of what
happens on the client boxes connecting to the World Wide Web.
Today a lot of internet e-commerce depends on the Java framework.
The intent in this lesson is not to talk about the Java language itself but the distributed object model
of Java.
364 - Java Distributed Object Model

The nice thing about the Java remote object model is that much of the heavy lifting that an application
programmer would have to do when building a client-server system using RPC is subsumed under the covers
by the Java distributed object runtime. This is where one can see the similarity to the subcontract
mechanism in the Spring OS.
Before we dig deeper, let me give you at a high level the distributed object model of Java.
• The term remote object in the object model of Java, refers to objects that are accessible from
different address spaces.
• The term remote interface is used in the distributed object model to declare the available methods
in a remote object.
• In the distributed object model of Java, the clients have to deal with RMI exceptions.
There are some similarities and differences between local objects and remote objects.
• The similarity is that you can pass object references as parameters when you make an object
invocation; the arguments of an invocation can include object references.
• The difference is in how those parameters are passed: by value/result rather than by pure reference.
In the case of a local object, when you pass an object reference as a parameter, the method that is
invoked can reach into the object that has been passed by reference and make
modifications to it, and those modifications are reflected in the original object. But in the distributed
object model, because object arguments are passed in value/result mode across the network, a copy of the object
is actually sent over to the invoked method. That's a fundamental difference in parameter passing.
So there is a similarity in the sense that you can pass object references as parameters, but the
difference is that the argument is passed in value/result mode, as opposed to by pure reference. In other words,
once an object has been passed as a parameter to the server, if the client makes changes to that
particular object, the server will not see those changes. That is fundamentally different between the
local object model of Java and the distributed object model of Java.
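The following small Java sketch illustrates that semantics; the class and interface names are made up for this example. A non-remote argument must be serializable, because a copy of it is shipped to the server on each call.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

// An ordinary (non-remote) object used as a parameter of a remote call.
class AccountInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    int balance;
}

interface AuditService extends Remote {
    // The server receives a *copy* of info. Changes the client makes to its own
    // AccountInfo afterwards are not seen by the server, and changes the server
    // makes to the copy are not reflected back at the client.
    void record(AccountInfo info) throws RemoteException;
}
```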
365 - Bank Account Example

In order to implement a remote object, one must first define a remote interface for that object. A remote
interface must extend (either directly or indirectly) a distinguished interface called java.rmi.Remote.
This interface is completely abstract and has no methods.
• interface Remote {}
Let's put this distributed object model of Java to work.
The example that I'm going to construct is a bank account server.
The server has APIs for accessing your bank account: you can deposit, withdraw, and check the
balance. Now, the question is how to best implement it using Java.
In particular, given that the remote object and the remote interface are available as mechanisms
in the distributed object model, what would be the best way to construct this service as a distributed
object, accessible from clients anywhere in the network? Let's consider two possibilities.
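Before looking at the two possibilities, here is one plausible way to declare the remote interface for the bank account service; the method names and signatures are illustrative, but the pattern of extending java.rmi.Remote and declaring RemoteException on every remote method is what the RMI model requires.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// The remote interface for the bank account service (illustrative signatures).
public interface BankAccount extends Remote {
    void deposit(float amount) throws RemoteException;
    void withdraw(float amount) throws RemoteException;
    float balance() throws RemoteException;
}
```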
366 - Reuse of Local Implementation

The first option is to reuse a local implementation.

Let's say the developer has a local implementation of a class called Account.
• The built-in Remote interface should be extended to a BankAccount interface, which should
declare all the remotely accessible methods.

• The Account class should be extended to become publicly accessible, i.e., BankAcctImpl.
o This has to be done explicitly by the developer.

o It requires that the class deal with the details of making instances of that class remotely
accessible (by exporting the object to the RMI runtime)

o It also requires the implementation to be responsible for its own Java object semantics
and therefore it must redefine methods inherited from the class Object appropriately (see the
sketch below).
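A sketch of this first option, assuming the BankAccount interface from above and a pre-existing local class Account, might look like the following; note how the developer's class has to export itself to the RMI runtime explicitly.

```java
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Pre-existing local implementation being reused.
class Account {
    protected float balance;
    public void deposit(float amount) { balance += amount; }
    public void withdraw(float amount) { balance -= amount; }
    public float balance() { return balance; }
}

// Option 1: extend the local class and do the heavy lifting yourself.
public class BankAcctImpl extends Account implements BankAccount {
    public BankAcctImpl() throws RemoteException {
        // The developer must explicitly export each instance so that the RMI
        // runtime will accept incoming calls for it.
        UnicastRemoteObject.exportObject(this);
    }
    // hashCode/equals/toString would also need to be redefined here to give
    // sensible semantics for a remote object (omitted in this sketch).
}
```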
367 - Reuse of Remote Implementation

Now a second choice is reusing the RemoteObject class that's available in the Java distributed object
model.
• Similarly, the BankAccount interface is extended from the built-in Remote interface by the
developer.
• However, note how BankAcctImpl is actually derived. It is derived from the Java built-in
classes RemoteObject and RemoteServer. So, you extend RemoteObject and
RemoteServer to get the BankAcctImpl class.
• If you derive BankAcctImpl from these built-in classes (RemoteObject and RemoteServer),
it becomes instantly visible to network clients when you instantiate a BankAcctImpl object.
You don't have to do any of the heavy lifting.
• The default constructor for RemoteServer takes care of making an implementation object
remotely accessible to clients by exporting the remote object implementation to the RMI runtime.
• The class RemoteObject overrides methods inherited from Object to have semantics that make
sense for remote objects.
The "remote implementation reuse" scheme is more seamlessly integrated into the Java object model and
requires less implementation detail.
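A sketch of this second option is below. In the standard Java class library the concrete class to extend is UnicastRemoteObject, which sits under RemoteServer and RemoteObject; its constructor exports the new instance, so the object is remotely accessible as soon as it is instantiated. The method bodies are illustrative.

```java
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Option 2: derive from the remote implementation classes and let RMI do the work.
public class BankAcctImpl extends UnicastRemoteObject implements BankAccount {
    private float balance;

    public BankAcctImpl() throws RemoteException {
        super();    // exports this object to the RMI runtime automatically
    }

    public synchronized void deposit(float amount) throws RemoteException {
        balance += amount;
    }

    public synchronized void withdraw(float amount) throws RemoteException {
        balance -= amount;
    }

    public synchronized float balance() throws RemoteException {
        return balance;
    }
}
```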
368 - Implementation Preference Question

Which implementation would you prefer: deriving your service by extending the local
implementation, or by extending the remote implementation?

369 - Implementation Preference Solution

In the first case, where we reused the local implementation and used only the built-in interface Remote, the
implementor needs to make instances of the object remotely accessible. So the heavy lifting needs to be
done by the implementor, which is not preferable.
One virtue of the local implementation is that the service provider can make selected servers visible to
selected clients. But that's the only virtue one can associate with the first implementation that reuses the
local class; you cannot make a strong case for this choice.
With the second choice, where we extend the Java RemoteObject and RemoteServer classes, we let Java RMI
do all the heavy lifting to make the server object visible to network clients. That's the preferred
way of building a network service and making it available to clients anywhere.
370 - Java RMI at Work (Server)

So let's see Java RMI at work.


On the server side, it's a three-step process to make the server object visible on the network.
• You instantiate the object.
• Then you create a URL, whatever you want to call it.
• Then you go to the Java runtime and bind the URL to the instance of the object that you created.
Now it is in the naming service of the Java runtime system for clients to discover and use, as the sketch below shows.
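Those three steps look roughly like this, assuming the BankAcctImpl class sketched earlier; the URL name is arbitrary, and the example creates its own registry so that it is self-contained.

```java
import java.rmi.Naming;
import java.rmi.registry.LocateRegistry;

public class BankServer {
    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);             // start a registry (naming service)
        BankAcctImpl acct = new BankAcctImpl();          // 1. instantiate the object
        String url = "rmi://localhost/BankAccount";      // 2. make up a URL for it
        Naming.bind(url, acct);                          // 3. bind the URL to the instance
        System.out.println("BankAccount service registered and ready");
    }
}
```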
371 - Java RMI at Work (Client)

Now let's look at the client side. Look at the ease with which any arbitrary client on the network can
access the server object.
• The client will look up the service provider by contacting a bootstrap name server in the Java
RMI system.
• It does a lookup of the URL using the facility in the Java RMI system, and then a local access point
for that object is created on the client side.
• Now we've got access to the object that is at the server through this local name acct. Once I
have that, I can invoke the methods simply by calling them like methods on a normal
Java class (e.g. acct.deposit(), acct.withdraw()), as sketched below.
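On the client side the same example looks roughly like this, assuming the server above has bound itself under the name BankAccount; after the lookup, the remote object is used exactly like a local one.

```java
import java.rmi.Naming;

public class BankClient {
    public static void main(String[] args) throws Exception {
        // Look up the service through the bootstrap name server; the result is a
        // local access point (stub) for the remote object.
        BankAccount acct = (BankAccount) Naming.lookup("rmi://localhost/BankAccount");

        acct.deposit(100.0f);    // each of these calls really goes out to the server
        acct.withdraw(25.0f);
        System.out.println("balance = " + acct.balance());
    }
}
```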
All of these look like normal procedure calls as far as the client is concerned. But each of them is
really a call that goes out to the server, wherever that server happens to be, and the Java runtime
system knows how to locate that server in order to do the invocation.
That's the power of Java RMI. The client does not know, and does not care about, the location of the
server.
If there are failures in any of these method executions, remote exceptions will be thrown by the
server through the Java runtime system back to the client.
Of course, given the networked nature of this client/server relationship, if a remote exception is thrown
and the client sees that the invocation did not succeed, it may have no way of knowing at what point in
the invocation the call actually failed. This is one of the problems when you have services that you have
to reach across the network, and you have to handle the exception.
372 - RMI Implementation (RRL)

Now let's look at how RMI is actually implemented.


• At the core of the RMI implementation is a layer called the Remote Reference Layer (RRL), and
that's the place where the magic happens.
• The client-side stub initiates a remote method invocation call using the RRL, and all the marshaling
of the arguments is handled entirely by the RRL. Similarly, when the result comes back, unmarshaling
the results is done by the RRL.
• On the server side, the RRL unmarshals the arguments from the client for the server skeleton. The
skeleton then makes the call up to the server implementation. Once the server is done with the
service, the skeleton marshals the result through the RRL to send it over to the client.
Marshaling and un-marshaling are also called serializing and deserializing Java objects. All of that is
being done by the RRL.
Now, where are the clients and the servers? Are they on the same machine or on different machines? Of
course, we're talking about networked services, so the server is going to be remote. But the server could
have several instances: there could be a single instance of a server, or there could be multiple instances
of the server. Where is all that magic happening? Well, similar to the subcontract layer that we discussed
in the Spring OS, the RRL does all the magic with respect to where the server is and how the server is
handling requests (whether it is replicated or a singleton).
What that means is that it allows for various invocation protocols between the clients and the servers, and
all those details are buried in the RRL. The actual clients and servers can be impervious to those details.
So, in the Java runtime stack, the RRL is a very crucial layer, and it has functionality very similar to
the subcontract mechanism in the Spring OS.
373 - RMI Implementation Transport

The abstractions that the transport layer provides are endpoint, transport, channel, and connection.
An endpoint can be thought of as nothing but a protection domain (or a Java virtual machine). It has a
table of remote objects that it can access. It gives you a protection domain, or a sandbox, within which
server code or client code executes.
Connection management is the interesting piece, which is all about the details of connecting these end
points together.
• In particular, the connection management in the transport layer of the Java runtime system is
responsible for setting up connections, tearing down connections, and listening for incoming
connections. When a connection is established between two endpoints, the abstraction
that I mentioned called a transport comes into play. For instance, between one pair of endpoints
the connection manager may decide to use a UDP transport, while between another pair of endpoints
it may decide to use a TCP channel, so the transport being used there is a TCP connection at both ends.
Notice that a given endpoint can use different transports for talking to different endpoints, depending on a
variety of parameters: what is the best way for one endpoint to talk to another may decide
what kind of connection is used. That is all part of connection management.
• The connection manager is also responsible for locating the dispatcher for a remote method that
is being invoked on this end point. So a transport is listening on a channel and when an invocation
comes in, this transport is responsible for identifying or locating the dispatcher on this domain,
which will know how to carry out that invocation.
• The connection manager is also responsible for monitoring the liveness of the connection. If an
endpoint goes away, the connection manager needs to detect that and inform the domain that this particular
endpoint is gone. That kind of liveness monitoring is part of connection management.
The last abstraction I mentioned is the notion of connection itself.
So once a channel has been established, the transport can do I/O on this channel using connections.
The path through the transport layer is: it listens for an incoming request. When an incoming request comes
in, it chooses the transport that is most appropriate and establishes a channel. Once the channel has been
established, the two endpoints can do I/O on the channel using the connection.
As we saw, the transport mechanism sits below the RRL; all object invocations
flow through the transport layer. The RRL is the one that decides what the right
transport (e.g. TCP or UDP) is to use, depending on the location of the two endpoints, where the client is
and where the server is. It gives that decision to the connection manager, which is part of the transport
layer of the software stack, so that the channel can be established and a connection can then be used for
the actual transport of the invocation between the client and the server.
In summary, the distributed object model of Java is a powerful vehicle for constructing network
services. What we saw in this lesson is a glimpse of the classes that are available in the distributed object
model. The power of the RRL in dynamically deciding how to establish the client-server relationship is
similar to the subcontract mechanism in the Spring OS. We also saw the flexibility in connection
management, allowing different kinds of transport to exist between the client and the server depending on
the locations of the client and server, network conditions, and so on.

374 - Conclusion

There are some more subtle issues involved in the implementation of the RMI system, including
distributed garbage collection, dynamic loading of stubs on the client side, sophisticated sandboxing
mechanisms on the client and the server sides to ward off security threats and so on.
I encourage you to learn about these issues by reading the assigned paper and also surfing the internet.
The main point I want to leave you with is that many ideas that start out as pie-in-the-sky research
become usable technology when the time is ripe.
375 - Introduction - Enterprise Java Beans

Welcome back! Let's connect the dots. We started with the technical issues in the structure of an operating
system for a single CPU, then a parallel machine, then a distributed system.
We saw how object technology with its innate concepts of inheritance and reuse helps in structuring
operating systems at different levels.
Now, we go one step further. How do we structure the system software for a large scale distributed
system service? It's too limiting to call it an operating system.
As we continue this lesson, we'll get a glimpse of how object technology has gone ballistic to provide
the services that you're reliant on for your everyday internet e-commerce experience.
In this lesson, we will describe Enterprise Java Beans. The term Java bean is used to signify a reusable
software component, that is, many Java objects in a bundle, so that it can be passed around easily from
one application to another application for reuse.
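As a point of reference, here is a minimal sketch of a plain Java bean; the FlightOption name and its fields are hypothetical. The conventions are simply private fields, a no-argument constructor, and getter/setter pairs, so that the bundle can be passed around and reused by other applications.

import java.io.Serializable;

// A minimal, hypothetical JavaBean: private fields, a no-argument constructor,
// and getter/setter pairs, bundled so it can be passed between applications.
public class FlightOption implements Serializable {
    private String airline;
    private double fare;

    public FlightOption() { }   // required no-argument constructor

    public String getAirline() { return airline; }
    public void setAirline(String airline) { this.airline = airline; }

    public double getFare() { return fare; }
    public void setFare(double fare) { this.fare = fare; }
}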
Reference:
Emmanuel Cecchet, Julie Marguerite, Willy Zwaenepoel, "Performance and Scalability of EJB
Applications", Proceedings of the 17th ACM SIGPLAN Conference on Object-Oriented Programming,
Systems, Languages, and Applications (OOPSLA). Paywall
376 - Inter Enterprise View

I'm sure all of us routinely use services on the internet, such as email through Google or Yahoo, and
perhaps purchase things on eBay or make airline reservations and so on. When we do that,
we think of an enterprise that we are accessing from our workstation or laptop or personal mobile device.
We think of an enterprise as a monolithic entity.

But, in fact, if you look inside the enterprise, the intra-enterprise view is pretty complicated. There's a
whole bunch of services and servers that are interconnected; there may be a marketing division, a sales
division, a production division, an inventory division, a research division, and so on. All of these
constitute what an enterprise is. So internally, the view of the enterprise is much more complex than what
you see from the outside when coming in and using the services provided by a particular enterprise.
Things get a lot more complicated in this day and age because when we access an enterprise, the
enterprises, in fact, talk to one another. This is what is usually called the supply chain model, where the
service that you are requesting may not be serviced by a single entity but may actually involve that entity
contacting other entities in order to put together a solution for a particular request.
What is even more challenging is when enterprises merge. This happened a while back and there's a
company called Digital Equipment Corporation that got bought out by Compaq. Those two merged. Later
on HP bought out Compaq and so you can see that when things like this happen, the inter-enterprise view
is much more complex. When companies merge like this, the idea of an enterprise becomes an amalgam
of three different entities coming together, in this example for instance.
So the enterprise transformation challenges are many: interoperability of the systems that constitute
different enterprises, interface compatibility when such merging happens, system evolution.
You know, things are not stagnant. Now this transformed enterprise has to continuously evolve as
well.
Scalability, reliability and the cost of maintaining a complex system like that.
All of these things are the challenges that have to be faced both internally and across enterprises.
377 - An Example

We often refer to services that we're using on an everyday basis, such as airline reservation or Gmail, as
giant scale services, as opposed to the services that you get within your organization, e.g. a file server.
There's a later module in this course called Internet Scale Computing, and in that we will discuss
programming models and resource management issues for providing such giant scale services. The focus
of this lecture is to show how object technology facilitates structuring such services.
Let's look at an example to put things in perspective. Let's say you want to purchase a round trip ticket
to go from Atlanta to Chennai, India.
• With a few clicks, you can send your request over to a portal such as Expedia.
• Expedia then goes to work for you. It contacts a whole bunch of different airlines, gets the best
options that are available from all these different choices, and then it comes back to you with a
bunch of options.
• Now you may take your own sweet time deciding which one of those you may want to pick, based
on cost, perhaps convenience, guarantees.
• You may want to make a decision or maybe you have to talk to your spouse or significant other,
siblings, children, so on.
• Finally you decide and commit to buying the ticket, and then Expedia will complete the
transaction based on the choice that you made. And you get your ticket and life is good.
Well, not so fast.
While you are busy procrastinating over your choices, there's another person who is planning almost
exactly the same trip as yours. Same dates, same constraints, same destination, and so on.
You can immediately see that, without your realizing it, you are actually competing for resources.
In this case, a physical resource, a seat on a particular flight going from Atlanta to Chennai, India,
with others that you don't even know exist on this planet.
Therefore, the service provider, in this case Expedia and all the airlines together that are handling your
request, they all have to work together to make sure that any resource conflict that might occur between
simultaneous requests across space and time coming from several different clients is handled properly.
So all the issues that we've discussed in the context of parallel and distributed systems, synchronization,
communication, atomicity of actions, concurrency, all of these become important. They surface in
this very simple example across space and time.
Additionally, all services need some common features. For example, a shopping cart on your browser.
In fact, even though this particular example is illustrating an airline reservation, if it comes to booking a
ticket on a train, or getting a hotel reservation, or booking tickets to go see a game, all of those things have
similar requirements. Many of them are probably repeatable and many of them, such as the shopping cart
in this example, are features that might be needed even if the services that you're talking about are
completely different, such as an airline reservation and hotel booking.
So since the same issues crop up in the implementation of each new service, we don't want to reinvent
the wheel every time. This is where object technology comes in, the power of reuse of components.
378 - N Tier Applications

Now such applications are what are called N-tier applications because if you look at the software stack
that comprises an application such as this, you'll see several different layers.
• You have a presentation layer, which is responsible for painting the screen on your browser,
perhaps dynamically generating the page based on the request you made.
• There may be application logic that corresponds to what the service is providing.
• There is business logic that corresponds to the way airfares are decided, seats are allocated, all
these kinds of things.
• There's a database layer that accesses the database that contains information about all the things
that the application and the business logic have to decide on in order to satisfy a particular request.
All these different layers have to worry about many of the issues that we're already familiar with in the
context of writing parallel programs and distributed programs.
• Those include persistence for actions. For example, let's say I made a choice, but I haven't
completed the booking. I may go away and come back later on, in order to complete that booking.
So persistence is something that I might need.
• You need a notion of transaction because I have initiated a particular operation and I have not
completed it and so transaction properties may be needed in order to make sure that a reservation
that started is finally complete and I have made the booking.
• Caching of data that you pulled in from a database server so that you can access the database more
quickly.
• Clustering, which corresponds to taking a set of related services and clustering them together in order
to improve the performance of the service. And similarly clustering the data that you're accessing
from a database server.
• Of course, one of the things that we worry about a lot these days in e-commerce is security, in
particular when we are communicating financial information, credit card information, and
personal information like a social security number and so on. We want assurance that the services
provided by the server do not compromise our personal information in any fashion.
So these are all the sets of issues that N-tier Applications have to worry about in making sure that the
services it provides are trustworthy from an end user's point of view.
How do we structure an N-tier application like this?
The things that we want to do are to reduce the amount of network communication, because that results
in latency; to reduce the security risks for the users, which means that the business logic should not be
compromised; and to increase the concurrency for handling an individual request.
For instance, there's an individual request, but in processing this request, there's an opportunity to
exploit parallelism. Oftentimes these are called embarrassingly parallel applications because, even
though this request seems like a single request, there's an opportunity to exploit parallelism, and the kind
of parallelism is embarrassingly parallel. In the same query I want to find out the availability of seats on
a particular date, and I don't care which airline I go by. That's an opportunity for parallelism: the Expedia
server can go in parallel to multiple airlines and find out the availability of seats on the dates that I
requested, as the sketch below shows.
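Here is a minimal sketch of that fan-out in Java, under stated assumptions: the airline names and the checkAvailability method are hypothetical stand-ins for whatever back-end calls a portal like Expedia actually makes. One client query becomes several concurrent availability checks, and the portal gathers the results before presenting the options.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FareSearch {
    public static void main(String[] args) throws Exception {
        List<String> airlines = Arrays.asList("AirlineA", "AirlineB", "AirlineC");
        ExecutorService pool = Executors.newFixedThreadPool(airlines.size());

        // One availability query per airline, all issued in parallel.
        List<Callable<String>> queries = new ArrayList<>();
        for (String airline : airlines) {
            queries.add(() -> checkAvailability(airline, "ATL", "MAA"));
        }
        List<Future<String>> results = pool.invokeAll(queries);

        for (Future<String> r : results) {
            System.out.println(r.get());   // gather the options to show the client
        }
        pool.shutdown();
    }

    // Placeholder for a remote call to an airline's reservation system.
    static String checkAvailability(String airline, String from, String to) {
        return airline + ": seats available " + from + " -> " + to;
    }
}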
Similarly, there's opportunity for exploiting concurrency across several simultaneous requests that are
coming in. And also for clustering the computation that may have to be done on the server for
computations that are common across simultaneously arriving requests.
We want to reuse components aggressively. By components we mean portions of the application logic
that can be reused in constructing these applications as well as in the execution of the components in
order to service the requests that are coming in simultaneously from several different clients.
379 - Structuring N Tier Applications

To structure N-tier applications, we're going to talk about one particular framework as an example: the
JEE framework, which is the Java Enterprise Edition framework. It's just an example; there are other
frameworks that provide similar functionality to JEE.
In the JEE framework there are four containers for constructing an application service. You can think of
containers as protection domains implemented typically in a Java virtual machine. The four containers
are:
• The client container
• The applet container for the client, which will reside typically on a web server. This is the one
that interacts with the browser on the end client.
• The presentation logic that I mentioned to you earlier is provided in the web container, and this
is the guy that is responsible for dynamically perhaps creating the pages that have to be sent by
the web server back to the browser of the client.
• There is an EJB container, which manages the business logic that corresponds to what needs to
be done in order to carry out the request from the end client. There may be a database server, that
the business logic is communicating with in order to get access to the data that it needs to process
the request that came in.
The key hope is that we want to exploit reuse of components as much as possible.
• For this purpose, continuing sort of the coffee analogy starting with Java, the word bean is used
to indicate a unit of reuse, that is, a bundle of Java objects providing a specific functionality.
• For example, there may be a bean that provides the shopping cart function. That becomes a unit
of reuse in constructing an N-tier application.
• The containers that I talked about here host the beans. That is, a container allows you to package
a whole bunch of Java beans and make them available in that container.
In the JEE framework there are four types of beans.
One type of bean is called an entity bean. For instance, an entity bean may be a row of a database. If
you think about a database holding employee records, for instance, one drawer of the database may
correspond to all the employees whose last names start with the letter "a". Typically, entity beans
are persistent objects with primary keys. The persistence for the entity bean may be built into the bean
itself (bean-managed persistence), or it may be built into the container into which that entity bean
is instantiated. In either case, since we are dealing with objects that may need persistence, it is important
that the persistence for that object is handled either in the entity bean itself or in the container where that
entity bean is hosted.
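As a concrete illustration, here is a minimal sketch of an entity bean in JPA-style notation, which is an assumption on my part; the EJB 2.x entity beans of the era of the assigned paper used home and remote interfaces instead, but the idea is the same: a persistent object identified by a primary key. The Employee fields are hypothetical.

import javax.persistence.Entity;
import javax.persistence.Id;

// A minimal sketch of an entity bean: a persistent object with a primary key.
// The Employee fields are assumptions for illustration.
@Entity
public class Employee {
    @Id
    private long id;          // primary key

    private String lastName;
    private String division;

    protected Employee() { }  // required by the persistence provider

    public Employee(long id, String lastName, String division) {
        this.id = id;
        this.lastName = lastName;
        this.division = division;
    }

    public String getLastName() { return lastName; }
}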
The second type of bean is what is called a session bean. The session bean is typically associated with
a particular client and a particular session, meaning a temporal window over which a client is interacting
with the service. A session bean could be a stateful session bean. For example, if I am ordering by
contacting a portal for Dell, the session that I'm establishing with the Dell portal has to be stateful,
because it has to remember what choices I'm making. I may actually keep those choices alive, go away
for a while, come back the next day, and continue with my purchase. There could also be stateless
sessions. For instance, if I start an email session using my browser with Google using Gmail, that session
may be stateless. When I go away, everything that I did during that session can be thrown away, because
I'm going to start a brand new session when I re-initiate a connection with the Gmail server.
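Here is a minimal sketch of a stateful session bean, assuming EJB 3 annotations; the beans of the paper's era declared the same thing in deployment descriptors. The shopping-cart methods are hypothetical; the point is that the bean remembers conversational state for one client across calls, until the session is ended.

import java.util.ArrayList;
import java.util.List;
import javax.ejb.Remove;
import javax.ejb.Stateful;

// A minimal, hypothetical stateful session bean holding one client's cart.
@Stateful
public class ShoppingCartBean {
    private final List<String> items = new ArrayList<>();

    public void addItem(String item) { items.add(item); }   // conversational state
    public List<String> getItems()   { return items; }

    @Remove
    public void checkout() {
        // complete the purchase, then let the container discard this instance
    }
}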
The third type of bean is what is called a message-driven bean. This kind of bean is useful for
asynchronous behavior. For instance, I might have a stock quote ticker on my browser and I might want
to get updates on the movement of stocks of a particular company that I'm interested in. That would be
something that is accomplished using a message-driven bean. News feeds or RSS feeds are also examples of
message-driven beans.
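A minimal sketch of a message-driven bean, assuming EJB 3 and JMS, might look like the following. The container invokes onMessage asynchronously whenever an update arrives on the destination this bean is bound to; that binding is configuration and is omitted here, and the stock-ticker scenario is hypothetical.

import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// A minimal, hypothetical message-driven bean for asynchronous quote updates.
@MessageDriven
public class StockTickerBean implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String quote = ((TextMessage) message).getText();
                // push the update out to interested clients (details omitted)
                System.out.println("quote update: " + quote);
            }
        } catch (JMSException e) {
            // in a real bean the container's logging facilities would apply
            e.printStackTrace();
        }
    }
}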
Each of these beans denotes a functionality, but if you have fine-grained versions of these beans, it gives
a greater opportunity to enhance concurrency for dealing with an individual request that's coming into
your application server. Or there could be concurrent requests in addition to my own request; all of those
can be dealt with more concurrently if we implement these beans at a finer level of granularity.
But if you implement the beans at a finer level of granularity, that means that the business logic is also
getting more complex. So there is always this trade-off in structuring N-tier applications: we may have
complex business logic with fine-grained concurrency, or we might choose to keep the business logic
very simple with coarse-grained beans.
This is where the core of what we are going to discuss lies, and that is, we are going to discuss different
design alternatives for structuring such N-tier application servers.
380 - Design Alternative - Coarse Grain Session Beans

So the first design alternative that we're going to look at uses coarse-grain session beans. In this structure
that I'm showing you, we're only looking at the web container and the EJB container. (The applet
container that interfaces with the client is in the web server, so we don't worry about that.)
The web container contains the presentation logic.
• In the structure that I'm showing you, each servlet box corresponds to an individual session with
a particular client.
• There's presentation logic commensurate with servlet-1, that is, client-1.
• Similarly, there's presentation logic commensurate with servlet-2, which is client-2.
There's a coarse-grain session bean in the EJB container that is associated with each of these servlets.
• As the name suggests, the session bean is responsible for the specific needs of the particular client
that it is serving for this particular session. Therefore, the session bean will worry about the data
accesses that are needed to the database in order for the business logic to do its thing.
• For example, if we're doing an airline reservation system and it is requesting a particular booking,
then the session bean is going to be the one that contacts the database server in order to pull the
specific dates and airline reservation information that is needed for the business logic to do the
pruning and selection commensurate with whatever this particular client is requesting.
• There are multiple sessions contained in this EJB container, depending on the number of clients
that have made temporally overlapping requests to this particular service. So the EJB container
has to provide some service for all the sessions that are concurrently going on in this server.
All of the data accesses that are needed for a particular session are taken care of by the session bean.
• Therefore, the amount of help that we need from the EJB container, in terms of services, is pretty
minimal for supporting this particular model.
• In fact, it is confined to any conflicts that might arise in terms of external accesses for satisfying
the requests of these different session beans. So the EJB container service that would be needed is
primarily for coordination, if any, across concurrent independent sessions.
• An example would be, if they want to access the same portion of the database for writing some
records. In that case, they may need some coordination help from the EJB container service.
The other important attribute of this structure is that the business logic is confined to the corporate
network.
• It is not exposed to the outside world because it's not contained in the web container.
• It is contained in the EJB container and therefore the business logic is not exposed beyond the
corporate network. That's a good thing.
So, the PROs of this particular structure are that
• You need minimal container services
• The business logic is not exposed to the outside world.
But the CONs for this particular structure are:
• This application structure is very akin to a monolithic kernel that we've talked about a lot.
• There is very limited concurrency for accessing different parts of the database.
• For instance, I mentioned that the services provided by these giant scale services tend to be
embarrassingly parallel, so there are lots of opportunities for pulling in data. An example would be
a query to compile the demographic distribution of all the employees in the company. In that case,
there's an opportunity to pull in lots of data simultaneously from the database.
• But unfortunately, the structure doesn't allow you to exploit such concurrency.
So in other words, this coarse-grain session bean structure represents a lost opportunity for accessing
and pulling lots of data from the database in parallel, whether for satisfying the same request or even
concurrent requests that may be accessing the same portions of the same database.
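To make the structure concrete, here is a minimal sketch of a coarse-grain session bean; the EJB 3 annotations, the JNDI data-source name, and the table and column names are all assumptions for illustration. One bean per client session holds the business logic and does its own database access, so the container's help is limited to coordinating conflicting accesses across sessions.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.annotation.Resource;
import javax.ejb.Stateful;
import javax.sql.DataSource;

// A minimal, hypothetical coarse-grain session bean: business logic and data
// access live together, one instance per client session.
@Stateful
public class ReservationSessionBean {
    @Resource(lookup = "jdbc/FlightsDS")   // assumed container-managed data source
    private DataSource ds;

    public String bookSeat(String flight, String date) throws Exception {
        try (Connection c = ds.getConnection();
             PreparedStatement ps = c.prepareStatement(
                 "SELECT seats_left FROM flights WHERE flight = ? AND day = ?")) {
            ps.setString(1, flight);
            ps.setString(2, date);
            try (ResultSet rs = ps.executeQuery()) {
                // business logic: pick a seat, update inventory, etc. (omitted)
                return rs.next() && rs.getInt(1) > 0 ? "booked" : "sold out";
            }
        }
    }
}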
381 - Design Alternative - Data Access Object

The second design alternative I'm going to talk about mitigates the exact problem I mentioned earlier.
That is, you want parallelism for accessing the database, which is probably one of the slowest links in the
whole processing of a request, because pulling in data from the database is going to take a lot of time,
both in terms of the I/O that has to be done through disks and the network communication needed to pull
the data into the container where the processing needs to happen.
• For that purpose, the structure that we have here is to push the business logic into the web container
together with the servlet and the presentation logic.
• All of the data access is going to be done through what are called entity beans. As I mentioned
earlier, entity beans have persistence characteristics, and in this particular example I can think of
the entity bean as representing one row of the database.
• So the data access objects are implemented using a whole bunch of entity beans, and you can decide
as the designer whether an entity bean is responsible for one row of the database or maybe for a
set of rows of the database.
• In any event, what we've done is take the parallelism in database access and encode it through
the entity beans so that there can be parallel access at the chosen unit of granularity.
• The EJB container now contains these entity beans. Now if a servlet needs to access some portion
of the database, it can farm out parallel requests to as many of these entity beans as it wants. And
all of those entity beans can work in parallel on behalf of a single client and pull in the data from
the database.
So we are reducing the time for data access by having this parallel structure and exploiting the
available concurrency in terms of I/O performance. Also, even if there are parallel requests, those
parallel requests may want access to the same portion of the database.
Think about the example I gave you of two different individuals wanting to make airline
reservations for exactly the same dates and the same set of constraints. Then there may be an opportunity
for an entity bean to cluster the requests coming from several different end clients and amortize the
access to the database server across several different clients that are temporally happening at the same
time.
I mentioned that entity beans usually deal with persistent state, which means that persistence has to be
provided at some level to these entity beans, and thereby to the data access objects that are using them.
• It could be done at the level of individual entity beans, which is called bean-managed
persistence.
• It could be that the container is providing that facility, in which case the persistence needs of the
data access object are provided by the container, and that is called container-managed
persistence.
So these are two different design choices we can make within this structure; the overall structure of
design alternative two is the same either way.
• We are using entity beans to implement data access objects. We decide the granularity of
the data access object based on the level of concurrency that we want in constructing this
application service.
• Within that choice there are two possibilities again in terms of how we provide persistence for the
data access object: either by providing it in the entity bean itself, or by using the container service
to provide it.
This once again points to opportunities for reuse of facilities that may be available. The same container-
managed persistence may be usable for different types of applications.
So the PRO of this structure is that you can exploit concurrency for data access for the same client in
parallel or even across different clients by amortizing the data access that may be needed concurrently
for similar services that are overlapping in terms of data usage.
There is one CON to this approach. Because we moved the business logic into the web container from
the EJB container, it exposes the business logic to the outside network. The business logic is no longer
confined to the corporate network; it is exposed outside the corporate network in this design
alternative.
All the data access code that used to be in the session bean in the previous structure is now in the
entity beans. That's how we get the parallelism: there are multiple entity beans carrying the same data
access code, and they can be accessing different portions of the database concurrently, exploiting
parallelism and reducing the latency for the business logic to get all the data it needs from the database
in order to do its work.
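Here is a minimal sketch of a data access object backed by entity beans. JPA-style notation and container-managed persistence are assumptions (the EJB 2.x entity beans of the paper's era used home and local interfaces instead), Flight is a hypothetical entity class, and the DAO is packaged as a stateless bean purely so the container can pool and inject it. Several such DAO calls can be farmed out in parallel for one client request, or shared across requests that overlap in the data they touch.

import java.util.List;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

// A minimal, hypothetical data access object over entity beans.
@Stateless
public class FlightDao {
    @PersistenceContext
    private EntityManager em;   // container-managed persistence

    // Fetch one slice of the database; the granularity (one row vs. a set of
    // rows per access) is the design choice discussed above.
    public List<Flight> findByRoute(String from, String to) {
        return em.createQuery(
                   "SELECT f FROM Flight f WHERE f.origin = :from AND f.dest = :to",
                   Flight.class)
                 .setParameter("from", from)
                 .setParameter("to", to)
                 .getResultList();
    }
}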
382 - Design Alternative - Session Bean With Entity Bean

The second design alternative gave concurrency but at the cost of exposing the business logic, and the
third design alternative is going to correct that.
• It uses a session bean with entity beans. The idea is that we're going to associate with each
client session a session facade. It's a design pattern that allows you to construct a session and
associate it with a particular client.
• For instance, in this case, this session facade corresponds to servlet-1, which corresponds to the
client that it is serving. Similarly, the other session facade serves client-2.
• As in the first design, what you see is that the web container contains only the servlet and the
presentation logic.
• The business logic is moved back into the EJB container, and it sits with the session facade.
• We still have the data access objects implemented using the entity beans.
• The session facade worries about all the data access needs of this business logic, and it will
farm out the data access requests. So again, there's an opportunity to exploit parallelism because
you can farm out parallel requests to multiple entity beans that handle the data access to different
portions of the database.
• Similarly, we're going to structure the data access at whatever level of granularity we think
is the right one. So an entity bean may be responsible for an individual row or a cluster of rows
and so on.
So,
• we can have the granularity that we want for parallel access so that the business logic can be
served in parallel, and at the same time,
• we have moved the business logic back into the EJB container so the business logic is not exposed
outside the corporate network.
We have a couple of choices of how we want to structure the session bean with the entity bean.
• Now the web container is going to use RMI, the remote interface mechanism in the distributed
object framework of Java, to communicate with the business logic.
• The session facade will communicate with the entity beans either using RMI, or we can choose to
construct the interface between the session facade and the entity beans using local interfaces.
Using RMI allows us to keep the entity beans wherever we want in the network. On the
other hand, if we choose the local option, we will co-locate the entity beans in the same EJB
container as the business logic and the session facade. The advantage of doing that is that, because
it is local, we don't have to incur network communication to fetch the data from the entity beans.
The PRO of this structure is in some sense getting the best of both worlds.
• We're not exposing the business logic, which was the virtue of the first design alternative.
• We're also getting concurrency through the data access object encoded as entity beans. So you get
the concurrency and the fact that the business logic is within the corporate network.
Is there a CON as well?
• We are incurring additional network accesses in order to perform the data access services that we
want, and that can be mitigated by co-locating the entity beans and the session facade in the same
EJB container, as the sketch below illustrates.
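Here is a minimal sketch of the session facade alternative, assuming EJB 3 annotations with local, no-interface injection, and reusing the hypothetical FlightDao and Flight types from the previous sketch. The facade lives in the EJB container with the business logic and reaches the data access objects through a co-located injected reference, so no extra network hop is incurred; only results cross back to the web container over RMI.

import java.util.List;
import javax.ejb.EJB;
import javax.ejb.Stateful;

// A minimal, hypothetical session facade: business logic stays behind it in
// the EJB container, and data access is farmed out to co-located DAOs.
@Stateful
public class ReservationFacadeBean {
    @EJB
    private FlightDao flights;   // injected local reference, same EJB container

    public List<Flight> searchRoundTrip(String from, String to) {
        // business logic (pruning, ranking, etc.) would run here before
        // returning results to the web container
        return flights.findByRoute(from, to);
    }
}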

So these are the 3 design alternatives that we talked about.


• One is the coarse-grain session bean alternative.
• The second is the finer-grained data access object alternative.
• The third is putting the session bean as a facade to access the data access objects, which
are encoded as entity beans so that you can get concurrency.
Notice that in talking about these different design alternatives, we are only talking about how to break
up the logic of that application, which consists of presentation to the client, the business logic, and
database access. There are lots of things that are needed in addition to worrying about the application
logic itself. Those are things like security, persistence, and so on.
The power of object technology is the fact that those are things that may be common across different
instances of applications, so we can reuse them from one application to another. That is the point I
wanted to leave you with, along with the different design alternatives that exist for structuring
these complex services, using object technology to reuse components in structuring complex
applications.
383 - Conclusion

So this lesson showed the power of object technology for structuring complex application servers.
EJB allows developers to write business logic without having to worry about crosscutting concerns
such as security, logging, persistence, and so on.
As homework, what I would want you to do is understand the design choices that we discussed in
the lesson and analyze qualitatively the performance implications of those design choices, with respect
to concurrency, pooling of resources, number of object classes, lines of code, and so on.
Then read the paper that I've assigned as part of the reading material for this topic and relate the
arguments presented in the paper to your own qualitative analysis.
A word of caution though: EJB has evolved considerably since the time of this paper.
But the principles that are discussed in this paper apply to the way complex N-tier applications are
structured to this day.
