Page 1 of 21

RESIN CLOUD SUPPORT
3rd Generation Clustering
For Elastic Cloud Scaling

Resin Pro version 4.0.25



Technical Whitepaper
By Scott Ferguson and Emil Ong
Software Engineers, Caucho Technology, Inc.

ABSTRACT
Resin 4.0.25 offers unprecedented support for deploying and scaling Java web applications in a cloud
environment. Resin's cloud support, its 3rd-generation clustering technology, provides the
elasticity demanded of cloud applications. This paper discusses the technical
underpinnings of Resin 4.0’s sophisticated clustering capabilities that provide reliable and fast
distributed sessions, distributed object caching, and cloud-wide application deployment all while
adding and removing application server instances at will during runtime.







Copyright © 2012 Caucho Technology, Inc. All rights reserved. All names are used for identification purposes only and may be
trademarks of their respective owners.
TABLE OF CONTENTS
Abstract .................................................................................................................................... 1
Table of Contents..................................................................................................................... 2
About Resin.............................................................................................................................. 3
Introduction............................................................................................................................. 3
Resin 4.0 Architecture ............................................................................................................. 4
Large Clustering ...................................................................................................................... 6
Distributed Caching and Sessions.......................................................................................... 7
Implementing Resin 4.0's distributed cache ......................................................................... 8
Customizing Cache Configuration ....................................................................................... 10
Resin 4.0 Load Balancer ........................................................................................................ 12
Accessing the Distributed Cache from PHP applications................................................... 13
Cloud-wide Application Replication and Deployment....................................................... 14
Distributed Git Repository.................................................................................................... 15
Transactional updates ........................................................................................................................... 15
High concurrency.................................................................................................................................... 16
Durability.................................................................................................................................................... 16
Incremental updates.............................................................................................................................. 16
Use cases ................................................................................................................................ 17
Setting up a basic Resin 4.0 cluster................................................................................................... 17
Dynamically Re-provisioning Resources among Applications using Multiple Clusters 18
Cloud Hosting Providers....................................................................................................................... 19
Conclusion.............................................................................................................................. 21
About Caucho Technology.................................................................................................... 21


ABOUT RESIN
Resin 4.0.25, a Java EE Web Profile certified application server, offers unprecedented support for
deploying and scaling Java web applications in a cloud environment. Resin supports the Java EE Web
Profile APIs (such as CDI), a full web server with URL rewriting, load balancing, messaging, object
caching, PHP running on the JVM, HTTP caching, and clustering/cloud support.
INTRODUCTION
Cloud computing is an environment in which computing hardware can be dynamically reapportioned
to the task at hand, usually using virtual machines. One example of a cloud-computing environment
is a cluster of physical machines maintained in-house by an organization. These physical machines all
run virtual machines, which create the environment for their own applications to run. By using virtual
machines on a number of physical machines, the organization can improve reliability and
performance as well as dynamically provision the appropriate resources to various applications
depending on demand.
An alternative scenario is to contract a third-party cloud computing ISP (cloud hosting providers) in
which the organization provides virtual machine images to the ISP to run. New machine instances can
be obtained from the ISP using a metered payment plan (e.g., Amazon EC2). This approach allows
organizations to gain computing power on demand without the need to maintain hardware.
Both of these scenarios present a great opportunity to organizations whose application demands
change either periodically or sporadically. For example, agencies that have regular deadlines for
payments, applications, or registrations, such as accountancies, universities, and government
institutions, may see a low level of activity on their sites during most of the year but experience
huge load in the days and weeks leading up to the deadlines. Another example is a news site that
experiences an unexpected increase in load after a breaking news event. By having additional
capacity available for times of high demand, an organization can better serve its clients. At the
same time, by avoiding unnecessary compute power or by redistributing that power to other areas,
organizations can save money and energy during low-demand periods.
Java and PHP web applications can be difficult to maintain and deploy in such cloud environments
with existing technology. To deploy an application to a set of virtualized servers, administrators need
to create custom virtual machine images and distribute them to each server. If the application
code is bundled in the virtual machine image, a new image must be constructed and distributed with
each new version of the application. If the application is stored on a networked storage device, it must

be retrieved by each virtual machine and subsequent deployment updates must be carefully
managed.
The applications themselves must also adhere to certain restrictions. In the absence of application
server support, sessions will not be replicated across all machines, which means losing the reliability
and availability advantages of replication. Object caching would require a sophisticated third-party
cache framework that is fast and able to cope with the dynamic nature of a cloud environment. Thus
applications may be forced to avoid sessions and to code against non-standard caching APIs, and
administrators are then required to maintain an additional infrastructure exclusively for caching.
Resin 4.0 addresses all of these issues by dynamically redistributing sessions, cached objects,
message queues, and application files as servers are added to and removed from service. Distributed
sessions are transparent to both Java and PHP applications, which use standard APIs. Object caching is available to
applications via the standard Java Cache (JSR-107) API, Memcached wire protocol and the PHP APC
API. Applications can be distributed to all servers in the deployment via provided plug-ins for tools
such as REST, Resin CLI, Resin Web Admin, Apache Ant, Maven, and Eclipse. The application will
automatically be propagated to all server instances and when new instances are added, they will be
brought up-to-date by the same mechanism. The loss or shutdown of server instances will not result in
the loss of sessions, cached objects, or application data.
This paper describes the architecture of Resin 4.0 that makes these features possible. The
optimizations that Resin 4.0 employs to make cluster-wide caching and deployment fast are also
detailed. Finally, use cases and example deployment scenarios are presented to give a flavor of how
Resin 4.0 scales in production environments.
RESIN 4.0 ARCHITECTURE
The core part of the Resin 4.0 architecture is
the triad, a set of three servers that provide
the central repository for persistent data
and maintain an up-to-date record of the
dynamic servers in the system.
Optimizations allow for quick access to data
at any server in the system while the triad
provides a point of stability and persistence
to reduce management complexity.

The triad servers (left) are the three core servers in a Resin
4.0 deployment. Spoke servers (right) can be brought into
the system at any time and removed at will.


Resin clustering is a classic hub-and-spoke architecture in which the hub is the triad and the
spokes are the dynamic servers.
The spoke servers, also called dynamic or elastic servers, are the workhorses for applications.
Started and stopped at will, they provide elastic scaling. Each dynamic server has access to the
shared data within the system via Java EE sessions or the Java Cache API. Once properly configured,
the cost of starting a new spoke server is simply that of starting a new virtual machine. When a
new dynamic server is brought online, it contacts one of the triad servers to announce its
availability and import all application data. As applications are updated on the triad, the changes
are pushed out to all the dynamic servers to keep them up to date.
The spoke servers use the triad as their persistent store for session and object cache data. At the same
time, optimizations keep frequently used data in memory on the dynamic servers to improve
application performance and reduce network load. Together with the triad servers, the spoke servers
form a cluster.
Using a combination of triad servers and spoke servers minimizes the complexity of managing an
application deployment. The triad servers are brought up first on system start up and at least one
should be available at any time during the life of the system. Thus these three servers can be the
main focus of administration time and effort because all of the other servers may go up or down at
any time without affecting the functional performance visible to clients. Assuming a virtualized
environment, if one or more triad servers become faulty at any time, a replacement or replacements
can be brought into place quickly. Having three servers in a triad avoids a single point of failure,
allows up to two servers to fail at any given time, and allows normal maintenance of a single server
without downtime.
Resin 4.0 includes a software load balancer that distributes HTTP requests for web application clients.
The triad server keeps track of the current members of the cluster and communicates with the load
balancer to update it on which dynamic servers are available to handle requests. When choosing a
server to handle new requests, the load balancer takes into account the CPU load of a server as well as
the number of simultaneous requests that server is already handling. Depending on the algorithm
selected by the administrator, the load balancer can either direct the request to the least loaded server
to keep load even or to the same set of servers until they are fully loaded to avoid starting new servers.
Once a server has been selected for a new request, subsequent requests from the same client will go
to the same server to avoid unnecessary load times.

Using Resin's included load balancer is not a requirement; some organizations may prefer a hardware
load balancer, Apache HTTPD, or nginx. A major advantage of Resin's load balancer is that it is
cluster aware: when a server is added to the cluster, it can automatically start taking on load.
Other benefits include HTTP proxy caching and URL rewriting. The Resin load balancer is faster than
Apache HTTPD and performs on par with nginx, and both the load balancer and the Resin web server
are easy to configure and integrate cleanly with the rest of Resin.
LARGE CLUSTERING
As deployment sizes grow,
many organizations split
their servers into different
networks to improve
reliability, availability, and
manageability. Resin 4.0
clustering supports this
partitioning explicitly by
allowing a cluster to be
split into pods.
Each cluster pod contains
its own triad, which
manages the object cache,
distributed sessions,
message queues, and
application file repositories for a set of spoke servers. This architecture makes it possible to create a
pod for each local network within a single site or a pod for each of a number of geographically
distributed sites. In either case, the dynamic servers need only to coordinate with their local triad to
access cache, session, and application file data for fast retrieval and low network overhead. At the
same time, all the pods are considered to be part of a single logical cluster serving the same set of
applications. Thus cache data and sessions can be shared across the entire network. Moreover any
applications that the administrator deploys will be propagated throughout all the pods, even at
remote sites.
When configuring Resin 4.0, the administrator selects one of the pods to be a master pod. This pod
maintains the authoritative copy of the data that all the other pods use for caching and applications.
Thus if an application needs data not in its local pod, it is easy to find by asking the master pod's triad.

Resin 4.0 clusters may have multiple pods to allow administrators to align their
application clustering with their network partitioning for improved performance.

DISTRIBUTED CACHING AND SESSIONS
Resin 4.0 introduces a new distributed caching architecture to support object caching and HTTP
sessions for applications. Object caching is visible to applications via the Java Cache (JSR-107) API
while HTTP sessions are part of the basic Java EE Servlet specification. From the point of view of the
application developer, these facilities should be used in the same way regardless of the size of the
cluster, allowing arbitrary scaling. Thus the developer is freed from ongoing infrastructure concerns,
while the administrator is given the flexibility to add and remove server capacity at will.
To make caching and sessions truly
transparent to the developer, a distributed
caching architecture is necessary. When
an application running on one server
caches a value, that same value should be
available to applications running on
another server so that they can all take
advantage of the same cached data.
However as far as the developer is
concerned, he or she needs only to use
method calls to get and put data without
needing to know where the data is actually
stored.
Distributed sessions can be handled in a
similar way to improve reliability without
developer intervention. Suppose that an
application on a server starts a new session
with a user, but that server later fails. The
load balancer can then redirect the user on
the next request to another server running
the same application, which retains the
session data without interrupting the
user's experience.
Both object caching and distributed sessions are simply types of caching with different
synchronization requirements. Resin 4.0 uses a single, unified framework to handle both. The next
section will describe the general framework and how each API is built using it.
package example;

import java.io.*;
import javax.servlet.*;
import javax.inject.Current;
import javax.cache.Cache;

public class MyServlet extends GenericServlet {
  @Current Cache _cache;

  public void service(ServletRequest req,
                      ServletResponse res)
    throws IOException, ServletException
  {
    PrintWriter out = res.getWriter();

    String data =
      (String) _cache.get("my-data");

    if (data != null) {
      out.println("cached data: " + data);
    }
    else {
      data = generateComplicatedData();
      _cache.put("my-data", data);
      out.println("new data: " + data);
    }
  }
}
Above is an example of using the Java Cache API.
Regardless of which server this code is run on in a Resin 4.0
cluster, the behavior will be the same and the same data will
be accessed. The container injects the Cache object.

IMPLEMENTING RESIN 4.0'S DISTRIBUTED CACHE
In Resin 4.0 the triad servers handle the synchronization and storage of cached data. At a high
level, spoke servers can hold local copies of items, but the triad always holds the master copy.
Each locally cached item carries a version key with a unique identifier. A spoke server sends this
key to the triad and asks whether its copy of the object is the latest. If the local copy is out of
date, the triad sends back the latest version of the object; if the local copy is current, the
spoke server simply uses it from its local cache. This system is efficient because large objects
are not copied over the network unless they are actually needed. The following is a detailed
account of how this is accomplished.
The metadata used for synchronization and the actual cached data are distributed separately. This key
innovation makes Resin 4.0 highly efficient in managing the cache. By decoupling the storage of the
metadata from the data, Resin 4.0 is able to avoid unnecessary network communication because only
the metadata needs to be compared when checking for updates to the cache. The most common
pattern of using a cache in web applications is to store data that is read more often than it is written.
Thus real updates to cached data are often relatively infrequent and so by sending only metadata
during synchronization, the expense of sending redundant cached data is avoided.
A cache can be viewed as a map from a key to a value. Resin 4.0 uses a fixed size metadata structure
called an m-node that includes hashes of both the cache key and the cache data as well as a version
number used for synchronization purposes. To improve synchronization performance, the structure
of the cache can be viewed as a collection of m-nodes.
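As a rough illustration, the m-node described above can be modeled as a small fixed-size record.
The field names below are assumptions for exposition only, not Resin's internal types:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch of a fixed-size cache metadata record ("m-node"):
// hashes of the key and the value, plus a version number for synchronization.
class MNode {
    final byte[] keyHash;   // SHA-256 of the cache key (32 bytes)
    final byte[] valueHash; // SHA-256 of the cached value (32 bytes)
    final long version;     // incremented on every update

    MNode(byte[] keyHash, byte[] valueHash, long version) {
        this.keyHash = keyHash;
        this.valueHash = valueHash;
        this.version = version;
    }

    // SHA-256 helper used for both key and value hashes.
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-256 is always available
        }
    }
}
```

Because the record is fixed-size regardless of how large the cached value is, comparing m-nodes is
cheap even when the values themselves are large.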
Each server in a cluster maintains a collection of m-nodes, but only one triad server “owns” the
management of any individual m-node. If a triad server owns an m-node, it is considered to have the
authoritative, most recent version. The key hash determines the triad owner so finding an m-node is
easy and unambiguous. While there is only one owner, the m-nodes are replicated on all the triad
servers for redundancy in the case of failure or maintenance. Cache data is distributed in a
similar way, except that because there is no version associated with the data value itself
(versions are only a part of the m-node), no synchronization is necessary, only replication.

Resin 4.0's fixed size m-node structure for the distributed cache
Applications running on Resin 4.0 may use the cache for a variety of reasons such as to cache an object
or store session data. When using one of these facilities, the Resin 4.0 server that received the web
request interacts with the distributed cache by communicating with the triad servers in the cluster. To
update a value in the cache, the server hashes the key using the SHA-256 algorithm. From this key
hash the server computes the m-node's owner and sends the new m-node to that triad server.
Similarly, the server hashes the cache data to find the data's owner and sends the data itself to
that triad server. The m-node and the data may be owned by the same triad server or by different
ones. Once the owning server or servers receive the update, they transmit the data to the other
triad servers for redundancy. The triad servers persist the cache data to Resin's optimized
database so that recovery from transient errors is fast. The server performing the update does not
wait for the triad to replicate or persist the data; this happens asynchronously. The process for
adding a new cache entry is identical.
An application running on a dynamic server updates a cache entry by
contacting the triad servers.
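The owner-selection step can be sketched as follows. The modulo-based mapping from hash bytes to
one of the three triad servers is an illustrative assumption, since the paper does not specify
Resin's exact partitioning function:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of mapping a cache key to the triad server that owns its m-node.
class TriadOwner {
    // Hash the cache key with SHA-256, as the text describes.
    static byte[] hashKey(String key) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e);
        }
    }

    // Map the hash onto one of the three triad servers (0, 1, or 2).
    // The modulo scheme is illustrative; the real partitioning is internal to Resin.
    static int owner(String key) {
        byte[] h = hashKey(key);
        return Math.floorMod(h[0], 3);
    }
}
```

Because the mapping depends only on the key hash, every server in the cluster computes the same
owner for a given key with no coordination, which is what makes lookup unambiguous.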

When the application tries to obtain a value from the cache, the server first hashes the requested
key. The key hash determines which triad server owns the m-node. The server then contacts that
owner to request the m-node. The owner looks up the m-node and inspects the value hash to find the
actual data. Because the data is replicated on all the triad servers, the owner of the m-node also
has a copy of the value data, so when it responds to the requesting server it includes both the
m-node and the associated data.
CUSTOMIZING CACHE CONFIGURATION
Multiple caches can be configured within each server for different applications and different
requirements for each cache. Specifically, each cache has a variety of timeout values and other
configuration options that customize the behavior for individual applications. For example, an
application might cache certain computations based on a database query, such as a list of current
events. If the application wants the list refreshed at least every minute so that new events are
shown to users, setting the expire timeout to one minute achieves that goal.

An application on a dynamic server retrieves a cache entry for the first time by contacting the triad.

HTTP sessions are another interface to the distributed cache. Specifically, session data can be stored in
the cache under a key using the session ID. However, HTTP session data has different requirements
than most object cache data. One requirement is that sessions expire after a certain amount of idle
time. In this case, a session is considered live when it is updated as well as when it is simply
accessed. By setting an idle timeout on the cache storing the session data, sessions will time out
properly.
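The distinction between the expire and idle timeouts can be captured in a few lines (a sketch of
the two validity rules, not Resin's implementation; times are epoch milliseconds):

```java
// Expire counts from the last update only; idle counts from the last
// update *or* read, which is why idle timeouts suit sessions.
class EntryTimeout {
    static boolean validByExpire(long lastUpdate, long now, long expireMs) {
        return now - lastUpdate < expireMs;
    }

    static boolean validByIdle(long lastUpdate, long lastAccess,
                               long now, long idleMs) {
        long lastTouch = Math.max(lastUpdate, lastAccess);
        return now - lastTouch < idleMs;
    }
}
```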
Sessions are used by applications in a very predictable way, so optimizations can be made in the
network infrastructure and the cache to improve performance. If the user interacts with the
application using the same server for the duration of the session, that session data can be kept in local
memory for fast access. Load balancers can force this behavior by implementing sticky sessions, a
way of routing requests from the same user to the same server during a single session. Resin 4.0's load
balancer implements sticky sessions for this reason.
The cache can also be optimized to take advantage of sticky sessions. In the normal case that a single
server is the only one to read and write a session, it can be given a non-exclusive lease on the session
data.

Expire timeout: the maximum time without an update that an item is considered valid.
Default: infinite.

Local read timeout: how long a value is used locally before checking with the triad for updates.
Default: 10ms.

Idle timeout: the maximum time without an update or a read that an item is considered valid;
typically used for sessions. Default: infinite.

Lease timeout: if items are leased with this cache, how long each lease is issued; typically used
for sessions. Default: 5min.
Configuration options for various distributed cache timeouts.
These values apply to all entries contained within a particular cache.

The session itself is stored as a cache entry, so the triad member that owns the entry's m-node will
issue a timed lease on the m-node to the server handling the first request in the session. Until the
lease expires, that server can assume that its local copy of the cache is up to date and therefore never
needs to use the network to check the validity of its copy with the triad.
To maintain reliability, the lease-holding server still writes all of its changes back to the triad. Other
servers may also access and update the session. If any server other than the leaseholder makes any
changes, those changes are sent to the lease-holding server by the triad as they are made. Thus in the
normal case, the access to session data is much faster, but reliability and fail-over are maintained if a
server fails.
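The lease check itself is simple: a spoke server trusts its local copy only while its lease is
unexpired, roughly as follows (an illustration, not Resin's code):

```java
// Sketch of a timed lease on a session's m-node. While the lease is valid,
// the holding server may read its local copy without a triad round-trip;
// writes are still sent to the triad for reliability.
class Lease {
    final long issuedAt;   // epoch millis when the triad issued the lease
    final long durationMs; // lease length, e.g. the 5min default above

    Lease(long issuedAt, long durationMs) {
        this.issuedAt = issuedAt;
        this.durationMs = durationMs;
    }

    boolean localCopyTrusted(long now) {
        return now - issuedAt < durationMs;
    }
}
```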
RESIN 4.0 LOAD BALANCER
Resin 4.0 includes a software load balancer that can distribute requests to servers in a cluster. This
load balancer is simply a Java EE application, so it runs in its own separate cluster, potentially on a
number of servers. Resin 4.0 clusters maintain a list of the dynamic servers available, the cloud
topology, in a special distributed cache entry. The load balancer cluster acts as a read-only client of
the application cluster's distributed cache, with the application cluster sending updates to the load
balancer cluster each time a dynamic server is added or removed. When one of the load balancer
servers distributes requests, it simply looks in the distributed cache to see what application cluster
servers are available.
The load balancer can distribute requests using a number of algorithms:
• Round robin
• Server load
• Green load balancing
Round robin is the simplest algorithm for load balancing in which the next request is sent to the next
server on the list. The triad servers maintain the list of servers and their order, so if new servers are
added or removed the load balancer will follow that order.
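Round robin can be sketched in a few lines (an illustration only; Resin's balancer works against
the triad-maintained server list rather than a caller-supplied one):

```java
import java.util.List;

// Minimal round-robin chooser: each call returns the next server in order,
// wrapping around at the end of the list.
class RoundRobin {
    private int next = 0;

    String choose(List<String> servers) {
        String s = servers.get(next % servers.size());
        next = (next + 1) % servers.size();
        return s;
    }
}
```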

The server load algorithm uses a set of empirical data to determine which server in the cluster is
the least loaded, then assigns the next request to that server. Specifically, the number of active
connections that the server is currently handling and the CPU load of the server are combined in a
formula to calculate the server load. In addition to these measured values, servers may also be
weighted explicitly to direct more or less load. This algorithm also takes into account startup
time for a server and allows it time to “warm up” so that the applications are ready to handle
requests.
Green load balancing uses the same measurements as the server load algorithm, except that instead
of directing the next request to the least loaded server in the cluster, this algorithm tries to load a
single server up to a threshold level before moving to the next server. This approach means that only
the number of servers that are needed to handle the current load will be used. Thus servers that are
not necessary during periods of low load can be placed in low power mode to preserve energy and
reduce wear.
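The contrast between the server-load and green algorithms can be sketched as follows, using
normalized load values between 0.0 and 1.0; the selection formulas are illustrative assumptions
rather than Resin's actual weighting:

```java
import java.util.List;

// Sketch of the two load-based selection strategies described above.
class LoadStrategies {
    // Server-load algorithm: pick the least loaded server to keep load even.
    static int leastLoaded(List<Double> loads) {
        int best = 0;
        for (int i = 1; i < loads.size(); i++) {
            if (loads.get(i) < loads.get(best)) best = i;
        }
        return best;
    }

    // Green algorithm: fill the first server up to a threshold before
    // spilling to the next, so unneeded servers can stay in low-power mode.
    static int green(List<Double> loads, double threshold) {
        for (int i = 0; i < loads.size(); i++) {
            if (loads.get(i) < threshold) return i;
        }
        return leastLoaded(loads); // all servers above threshold: fall back
    }
}
```

With the same measurements, the two strategies pick different servers: least-loaded spreads
requests out, while green concentrates them on as few servers as possible.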
ACCESSING THE DISTRIBUTED CACHE FROM PHP APPLICATIONS
Resin 4.0 includes Caucho Technology's implementation of PHP called Quercus. This implementation
is written in pure Java and runs within the Java EE Servlet framework. PHP pages are executed in a
fashion similar to JSPs. PHP contains library functions to implement HTTP sessions as well as object
caching. By implementing these functions using Resin 4.0's distributed cache, PHP applications can
also take advantage of the benefits of the architecture. Moreover, PHP applications can share cached
data with Java applications running within the same cluster.
The above is an example load balancer scenario with two tiers and
multiple load balancers. The application cluster’s triad pushes the
current state of the application cluster to the load balance cluster.

The session API in PHP is straightforward and backed by the distributed cache in much the same
style as Java Servlet sessions. PHP also includes an API called APC to provide object caching which is used
widely in both custom and open source PHP applications such as MediaWiki, WordPress, Drupal, and
many more. APC provides a basic key-value storage mechanism with timeouts much like the Java
Cache API. These functions are also implemented in Quercus using the Resin 4.0 distributed cache
directly.
CLOUD-WIDE APPLICATION REPLICATION AND DEPLOYMENT
Administrators and developers can deploy applications to the cluster using a number of tools
supported by Resin 4.0 such as Resin REST support, Resin Web Admin, Resin CLI, Ant, Maven, or
Eclipse. These deployment tools distribute a Web Archive file (.war) to the entire cluster,
avoiding the need to write tedious scripts. The applications are deployed using a transactional
protocol, so if a server in the cluster does not receive the application successfully, the
application will not be started on that server.
When deploying an application, these tools contact one of the cluster triad members, which will
then distribute the contents of the web application to the other two triad members. When all of the
triad members have replicated the application files, they update the dynamic servers by pushing out
the new data. Once all of the dynamic servers are updated, the new application can be started either
automatically or manually.
If the application is updated,
the administrator can then
repeat this process. However
because most application
updates include many of the
same files, Resin 4.0 sends only
the files that have changed to the other
triad and dynamic servers. This
approach is known as an
incremental update. When a
new spoke server is added to
the cluster, it contacts the triad
and downloads all the
applications currently deployed
to the cluster.
Once an application is deployed to one of the triad members, it is
replicated across the other two (A). When all of the triad servers have
the updated application, the data is pushed out to dynamic servers (B).

This deployment mechanism also allows for graceful transitions to new versions of applications as
well. By enabling versioned web application deployment, a newly deployed version of the
application will only serve new sessions. All existing sessions will continue to be served by the
previous version of the application. Only when a session expires or is explicitly invalidated will an
active user be directed to the new version.
DISTRIBUTED GIT REPOSITORY
The application files that an
administrator sends to the triad are
stored persistently in a distributed
Git repository. Junio Hamano and
Linus Torvalds developed the Git
repository format as a fast, open
source, distributed version control
system. Resin 4.0 uses the Git
repository format on each triad and
cluster server to store application
and configuration files. The unique
properties of the Git repository
provide for some benefits such as:
Transactional updates
If an update does not succeed or is
interrupted, the new application or
configuration files are not
distributed to either triad members
or dynamic servers. Only when the
files are verified to be correct on
each server are they made live.


High concurrency
The Git repository format ensures that data is written in isolation, but accessible by any number of
readers. Thus when updating application files, the chance of corrupting files by multiple writers is
virtually zero, resulting in faster performance.
Durability
By writing application and configuration files to disk on all three triad servers as well as the dynamic
servers, the application can survive numerous failures in the cluster.
Incremental updates
By examining all of the files in an application individually, Resin 4.0 uses the version control features of
the Git repository so that only new files are sent over the network, thus reducing deployment time,
network traffic, and storage requirements.
The Git repository uses secure hashing (SHA) to store files according to their contents. When a new
file is added to the repository, its contents are hashed and compared to the hashes of files already
stored in the system. If this is the first time the contents of the file have been stored in the repository, a
new entry is made and the file is stored. If a file with the same contents has been stored in the
repository before, no additional space will be allocated. When directories are stored in the repository,
they are stored as trees whose hashes are computed from the directory's contents. Thus file names
are not used directly as storage keys.
This approach means that if any two attempts are made to store two files with different contents
under the same name, they cannot conflict. Thus there is no danger of race conditions to write the
same file twice. When files are read, their contents are also checked against the hash under which
they were indexed, so a partial or corrupted value can never be served from the repository. Finally,
the hash-based storage of files means that any files that are
unchanged between two versions of an application are not stored twice, nor do they require
retransmission over the network.
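The hash-based storage described above can be sketched in a few lines of Java. The following `BlobStore` class is an illustration only (not Resin's implementation), and it uses SHA-256 where Git itself keys objects by SHA-1. It shows the two properties the text relies on: identical contents are stored exactly once, and every read is verified against the hash it was indexed under.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/**
 * A toy content-addressed blob store in the style of the Git object store.
 * Illustration only; not Resin's implementation.
 */
public class BlobStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    /** Stores contents under the hash of those contents and returns the key. */
    public String put(byte[] contents) {
        String key = sha256(contents);
        // Identical contents map to the same key, so duplicates cost no extra space.
        blobs.putIfAbsent(key, contents.clone());
        return key;
    }

    /** Reads a blob, re-hashing it to detect partial or corrupted entries. */
    public byte[] get(String key) {
        byte[] contents = blobs.get(key);
        if (contents == null || !sha256(contents).equals(key)) {
            throw new IllegalStateException("missing or corrupted blob: " + key);
        }
        return contents.clone();
    }

    /** Number of distinct blobs actually stored. */
    public int size() {
        return blobs.size();
    }

    private static String sha256(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is always available", e);
        }
    }
}
```

Because the key is derived from the contents, two writers racing to store the same file compute the same key and simply agree, while different contents can never collide on a name.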
Using this repository structure, Resin 4.0 replicates application files across the triad and dynamic
servers. When a web application archive (.WAR file) is deployed to the system, the triad member that
receives it will examine the files contained within and store them individually into the Git repository. If
these files are a new version of an existing application, only the new and updated files will be stored.
When the triad member contacts the other triad members or dynamic servers, the servers first
compare only the hash values in their repositories. Only files whose hash values are not already
present on the receiving server are sent over the network to be updated.
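The hash-comparison step can be sketched as a simple set difference. The `IncrementalSync` class below is illustrative, assuming each server can enumerate the hashes it already holds; only the blobs in the resulting difference would cross the network.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Illustration of incremental synchronization (not actual Resin code):
 * the sender transmits only the blobs whose hashes the receiver lacks.
 */
public class IncrementalSync {
    /** Returns the hashes the sender must transmit to bring the receiver up to date. */
    public static Set<String> hashesToSend(Set<String> senderHashes, Set<String> receiverHashes) {
        Set<String> missing = new LinkedHashSet<>(senderHashes);
        // Hashes already present on the receiver represent unchanged files;
        // they drop out here and are never re-sent over the network.
        missing.removeAll(receiverHashes);
        return missing;
    }
}
```

For a new version of an application that shares most files with the previous one, this difference set, and therefore the transfer time, stays small.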
USE CASES
Setting up a basic Resin 4.0 cluster
A cluster containing a single pod is the easiest to set up for an application. For many organizations
this scenario will be sufficient to deploy, run, and scale their applications. Using a single pod is useful
in more traditional situations where the organization controls all of the deployment servers, but wants
the flexibility to add and remove servers without downtime or reconfiguration of the application
server.
The configuration of the cluster starts by explicitly identifying the triad servers in the Resin 4.0
configuration file. This file is then installed manually on the triad servers, and the Resin 4.0 servers
can be started. At this point, the triad is able to accept new dynamic servers into the cluster.
Resin 4.0.24 added the ability to configure triad members via a standalone properties file. This file
can be a local file or located at any URL. IaaS environments like Amazon EC2, Xen Cloud Platform,
Eucalyptus, OpenStack, and CloudStack allow you to store private per-instance data in a REST
metadata service that is part of the IaaS environment. This metadata is available only to that server,
i.e., it is private, which makes it a good place to configure the triad members. A good practice is to
create a Resin image in your IaaS environment so that only the last bit of configuration lives in the
properties file; this prevents a lot of copying and pasting between server boxes. Resin 4.0.24 also
added the ability to deploy Resin configuration snippets and libraries such as JDBC drivers, so you
can rapidly configure a set of application servers. Resin does not need to run in an IaaS environment,
but it was designed to take full advantage of one.
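As a sketch of this approach, the user-data properties file handed to a new instance might look like the following. The property names here are hypothetical, chosen for illustration only; consult the Resin 4.0 documentation for the exact keys your version expects.

```properties
# user-data.properties -- hypothetical key names for illustration only.

# The three triad members of the pod this instance should join:
triad_servers : 10.0.1.10:6800 10.0.1.11:6800 10.0.1.12:6800

# Shared system key so the triad will accept this spoke server:
cluster_system_key : example-shared-secret

# Admin credentials used by the deployment tools and web console:
admin_user : admin
admin_password : example-password
```

Because the Resin image itself stays generic, the same image can boot as a spoke server in any pod simply by varying this file.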
Next the administrator starts a new virtual machine from an image with the triad members
preconfigured. If the server being started is a spoke server, it will not be listed in the triad list. For
security, the spoke server needs the same system key, or the same users and passwords,
preconfigured. On start-up of this virtual machine, the operating system starts an instance of
Resin 4.0.
The new spoke server contacts the triad to join the cluster and download its configuration and
applications; there is no need to preregister the server with the triad. The triad accepts the new spoke
server into the pod if its credentials match. Once the spoke server authenticates itself, the triad will

then send the configuration and application files. So far no application files have been deployed, so
only the configuration will be sent at this time.
Next the administrator deploys an application to the triad. The deployment tool adds the application
to the triad's Git repositories and the triad then pushes it to any spoke servers that are already part of
the cluster. Should any new spoke servers join after the applications are deployed, they will receive
the files when they register with the triad. Updates to the application files and new application
versions will also be pushed out to any spoke servers that are part of the cluster when they are
deployed.
The above may sound complicated, but Resin takes care of all of the details. Working with a single
server is the same as working with a cluster that has one pod containing one server; all of the
command-line tools, the web admin console, and so on work the same way. With Resin, configuring
64 servers is nearly identical to configuring one server. Adding more than 64 servers takes a bit more
work, but the configuration remains as simple as it can be.
Dynamically Re-provisioning Resources among Applications using Multiple
Clusters
Within a Resin 4.0 deployment, all of the servers in a given cluster serve the same set of applications,
so separate clusters can be dedicated to separate groups of applications. For example,
consider a commerce site that offers a searchable online catalog of goods and a shopping cart to
check out and pay for the goods. In this example, a single cluster may serve all of the applications
related to checking out and making payments. The applications that deal with browsing the site and
searching are grouped together in a separate cluster.
Depending on the current load of the applications and the needs of the users, different applications
may need different levels of resources. For example, during a normal week most customers browse
the catalog section of the site and only a small portion of those visiting the site end up making a
purchase. The administrator may decide to dedicate a cluster of 8 machines to the browsing
applications, but commit only 5 servers to checkout. During a sale however, the checkout traffic can
grow much greater in comparison to the browsing traffic. Based on the load seen by the servers, the
administrator may choose to allocate 6 servers to checkout and 7 servers to browsing.
To reallocate the resources, the administrator would shut down one of the spoke servers in the browsing
cluster. The other servers in the cluster remain available and can take over the sessions that were
being served by that spoke server because the triad has copies of all the session data. The
administrator then starts a new spoke Resin 4.0 server in place of the one just shut down, but now
adds it to the checkout cluster, either via the web admin tool or by specifying the correct properties in

the user meta-data area of the server. The checkout cluster's triad sends the application files so that
the new spoke server can handle additional checkout traffic. Using virtual machines that are
preconfigured to act as either checkout cluster servers or browsing cluster servers can make the
process even easier.
With a modern IaaS system like Amazon EC2, Eucalyptus or OpenStack this could be as easy as running
a few command line tools and can happen in mere moments.

Cloud Hosting Providers
A new industry of elastic cloud ISPs is emerging in which an organization can run a virtual machine on
the ISP's hardware at a metered rate. The administrator of an organization creates a virtual machine
image which he or she then uploads to the ISP and starts via a web console or web service. Resin 4.0
can take advantage of this type of infrastructure to give organizations the ability to add new server
capacity at will. Many organizations have only seasonal needs for additional computing capacity due
to sales, deadlines, or other scheduled events. These organizations can maintain smaller fixed server
pools for their normal traffic level, but use the ISPs for the planned periods of higher traffic.
Alternatively, some of these ISPs allow flexible capacity for unplanned traffic spikes as well.

Figure: An example of reallocating resources from one cluster to another. Normally the checkout
cluster's load is smaller, so the administrator allocates only 5 servers to it (A). During a sale, the
administrator moves a server from the browsing cluster to the checkout cluster (B).

Figure: An example of starting a Resin instance from a Resin Amazon Machine Image (ami-1b814f72)
and passing it user data (user-data.properties) that will be available to the instance via the private
REST meta-data service. The user data file is a properties file that contains the triad topology as well
as the credentials to join the pod cluster.
Take as an example the case where an organization has a set of servers that it maintains full time but
wants to add capacity during high traffic events. The administrator would configure a Resin 4.0 cluster
with two pods, one for the internal servers and one to be run on the ISP. The internal pod runs full
time, but the ISP pod may remain shut off or have few dynamic servers most of the time. When a high
traffic event occurs, the administrator starts the ISP pod and adds new dynamic servers to it to allow
for additional capacity. The servers in each pod only have to consult their triad for most
operations, leading to fast access to the cache and sessions. To ensure security, cluster pods can be
configured to send encrypted and signed messages when in an environment such as a cloud ISP.

Figure: A single Resin 4.0 cluster deployed both to an internal network and a cloud ISP. Each network
contains one pod of the cluster, and the pods are connected to allow sharing of cache and application
file data.

CONCLUSION
Resin 4.0 adds a number of features to enable cloud computing for web applications written in Java
and PHP. While this paper describes the mechanisms behind these features such as distributed
caching and cluster-wide application deployment, the developer does not have to tailor any code to
Resin 4.0 and may continue to create standard Java EE or PHP applications. These features simply
improve the capacity, reliability, and availability of those applications. Cluster-wide application
deployment and dynamic clustering make the task of maintaining virtualized deployments much
simpler for administrators by providing easy deployment and scaling tools. By offering these features,
Resin 4.0 provides a web application platform to exploit the full capabilities of cloud computing.
ABOUT CAUCHO TECHNOLOGY
Caucho’s relentless quest for performance and reliability paved the way for Resin® to become one of
the leading open source Java application servers since 1998. Our engineers' dedication to the
development, support and evolution of the Resin Java EE 6 Web Profile continues to uphold our
reputation for quality, performance and manageability. We’ve helped over 4.7 million web sites
worldwide including start-ups, governments and Fortune 500 companies build and grow their
business with one of the most flexible, rock-solid and powerful application servers, Resin. Caucho is an
Oracle Java EE licensee focusing on Web Profile and Cloud solutions. Our offices are located in San
Diego and San Francisco, California.

g. organizations can save money and energy during low-demand periods.ABOUT RESIN Resin 4. By having additional capacity available for times of high-demand. To deploy an application to a set of virtualized servers. object caching. An alternative scenario is to contract a third-party cloud computing ISP (cloud hosting providers) in which the organization provides virtual machine images to the ISP to run. One example of a cloud-computing environment is a cluster of physical machines maintained in-house by an organization. universities and government institutions may have a low level of activity on their sites during most of the year. or registrations such as accountancies.. By using virtual machines on a number of physical machines. At the same time. it must Page 3 of 21 . These physical machines all run virtual machines. the organization can better serve their clients. Another example is when a news sites experience an unexpected increase load after a breaking news event. by avoiding the use of unnecessary compute power or by being able to redistribute that power to other areas.0. Java EE Web Profile certified Java Application Server. If the application is stored on a networked storage device. New machine instances can be obtained from the ISP using a metered payment plan (e. the organization can improve reliability and performance as well as dynamically provision the appropriate resources to various applications depending on demand. HTTP caching and clustering/cloud support. full web server with URL rewriting. Amazon EC2). Java and PHP web applications can be difficult to maintain and deploy in such cloud environments with existing technology. If the application code is bundled in the virtual machine image. messaging.25. but experience huge load in the days and weeks leading up to the deadlines. INTRODUCTION Cloud computing is an environment in which computing hardware can be dynamically reapportioned to the task at hand. load balancing. applications. 
a new image must be constructed and distributed with each new version of the application. For example. which create the environment for their own applications to run. agencies that have regular deadlines for payments. Both of these scenarios present a great opportunity to organizations in which the demands of its applications change either periodically or sporadically. usually using virtual machines. This approach allows organizations to gain computing power on demand without the need to maintain hardware. offers unprecedented support for deploying and scaling Java web applications in a cloud environment. administrators need to create custom virtual machine images and distribute them to the each server. PHP running on the JVM. Resin supports Java EE Web Profile APIs (like CDI).

Spoke servers (right) can be brought into the system at any time and removed at will. Apache Ant. which means losing the reliability and availability advantages of replication. Memcached wire protocol and the PHP APC API. Object caching would require a sophisticated third-party cache framework that is fast and able to cope with the dynamic nature of a cloud environment.0 addresses all of these issues with dynamic session distribution. The application will automatically be propagated to all server instances and when new instances are added. Resin 4. In the absence of application server support.0 architecture is the triad. cached objects. they will be brought up-to-date by the same mechanism. a set of three servers that provide the central repository for persistent data and maintain an up-to-date record of the dynamic servers in the system.0 employs to make cluster-wide caching and deployment fast are also detailed. Finally. Page 4 of 21 . Resin Web Admin.be retrieved by each virtual machine and subsequent deployment updates must be carefully managed. Resin CLI. and application files as servers are added and removed from service. sessions will not be replicated across all machines. messaging queues. which use standard APIs. or application data. Distributed sessions are transparent to both Java and PHP applications. Administrators are then also required to maintain an additional infrastructure exclusively for caching.0 scales in production environments. and Eclipse. The loss or shutdown of server instances will not result in the loss of sessions. Optimizations allow for quick access to data at any server in the system while the triad provides a point of stability and persistence to reduce management complexity. The triad servers (left) are the three core servers in a Resin 4.0 that makes these features possible. RESIN 4. Applications can be distributed to all servers in the deployment via provided plug-ins for tools such as REST. 
use cases and example deployment scenarios are presented to give a flavor of how Resin 4. The applications themselves must also adhere to certain restrictions.0 ARCHITECTURE The core part of the Resin 4. Maven. Thus the applications may be forced to avoid sessions and code to non-standard APIs for caching.0 deployment. Object caching is available to applications via the standard Java Cache (JSR-107) API. cached objects. This paper describes the architecture of Resin 4. The optimizations that Resin 4.

The triad server keeps track of the current members of the cluster and communicates with the load balancer to update it on which dynamic servers are available to handle requests.0 includes a software load balancer that distributes HTTP requests for web application clients. subsequent requests from the same client will go to the same server to avoid unnecessary load times. the cost of starting a new spoke server is simply to start a new virtual machine.Resin clustering is a classic spoke and hub architecture where the hub is the triad and the spokes are the spoke servers. optimizations keep frequently used data in memory on the dynamic servers to improve application performance and reduce network load. if one or more triad servers become faulty at any time. in the system are the workhorses for applications. The spoke servers use the triad as their persistent store for session and object cache data. allows up to two servers to fail at any given time. Thus these three servers can be the main focus of administration time and effort because all of the other servers may go up or down at any time without affecting the functional performance visible to clients. and allows normal maintenance of a single server without downtime. Having three servers in a triad avoids a single point of failure. a replacement or replacements can be brought into place quickly. the changes are pushed out to all the dynamic servers by the triad to keep them updated. the load balancer takes into account the CPU load of a server as well as the number of simultaneous requests that server is already handling. Once a server has been selected for a new request. At the same time. Once properly configured. Page 5 of 21 . Resin 4. dynamic/elastic servers. Together with the triad servers. Each dynamic server has access to the shared data within the system via Java EE sessions or the Java Cache API. it contacts one of the triad servers to announce its availability and import all application data. 
Started and stopped at will. Assuming a virtualized environment. As applications are updated on the triad. When choosing a server to handle new requests. Using a combination of triad servers and spoke servers minimizes the complexity of managing an application deployment. Depending on the algorithm selected by the administrator. they provide elastic scaling. the load balancer can either direct the request to the least loaded server to keep load even or to the same set of servers until they are fully loaded to avoid starting new servers. When a new dynamic server is brought online. The spoke servers. The triad servers are brought up first on system start up and at least one should be available at any time during the life of the system. the spoke servers form a cluster.

When configuring Resin 4. Thus cache data and sessions can be shared across the entire network.0. Resin 4. Resin load balancer is faster than Apache HTTPD and has performance on par with nginx. all the pods are considered to be part of a single logical cluster serving the same set of applications. and manageability. many organizations split their servers into different networks to improve reliability. the administrator selects one of the pods to be a master pod. At the same time. which Resin 4. Moreover any applications that the administrator deploys will be propagated throughout all the pods.0 clusters may have multiple pods to allow administrators to align their manages the object cache. distributed sessions. A major advantage of using Resin’s included load balancer is that it is cluster aware. Page 6 of 21 . message queues. it can automatically start taking on load. Thus if an application needs data not in its local pod. This pod maintains the authoritative copy of the data that all the other pods use for caching and applications. When a server is added to the cluster.Using Resin’s included load balancer is not a requirement. This architecture makes it possible to create a pod for each local network within a single site or a pod for each of a number of geographically distributed sites. Apache HTTPD or nginx. LARGE CLUSTERING As deployment sizes grow. application clustering with their network partitioning for improved performance. and application file repositories for a set of spoke servers. The Resin load balancer and the Resin Web Server are easy to configure and integrate nicely with the rest of Resin. it is easy to find by asking the master pod's triad. Other benefits include HTTP proxy caching and URL rewriting. Each cluster pod contains its own triad. the dynamic servers need only to coordinate with their local triad to access cache. and application file data for fast retrieval and low network overhead. 
availability.0 clustering supports this partitioning explicitly by allowing a cluster to be split into pods. In either case. session. even at remote sites. Some organizations may prefer to use a hardware load balancers.

Page 7 of 21 .0 introduces a new distributed caching architecture to support object caching and HTTP sessions for applications. The container injects the Cache object. unified framework to handle both. allowing arbitrary scaling. Both object caching and distributed sessions are simply types of caching with different synchronization requirements. From the point of view of the application developer. the behavior will be the same and the same data will be accessed. which retains the session data without interrupting the user's experience. while the administrator is given the flexibility to add and remove server capacity at will.0 uses a single. but that server later fails. When an application running on one server caches a value. Distributed sessions can be handled in a similar way to improve reliability without developer intervention. these facilities should be used in the same way regardless of the size of the cluster.0 cluster. Thus the developer is freed from ongoing infrastructure concerns. The next section will describe the general framework and how each API is built using it. Above is an example of using the Java Cache API. a distributed caching architecture is necessary.DISTRIBUTED CACHING AND SESSIONS Resin 4. However as far as the developer is concerned. Object caching is visible to applications via the Java Cache (JSR-107) API while HTTP sessions are part of the basic Java EE Servlet specification. The load balancer can then redirect the user on the next request to another server running the same application. he or she needs only to use method calls to get and put data without needing to know where the data is actually stored. Resin 4. Suppose that an application on a server starts a new session with a user. Regardless of which server this code is run on in a Resin 4. To make caching and sessions truly transparent to the developer. 
that same value should be available to applications running on another server so that they can all take advantage of the same cached data.

The most common pattern of using a cache in web applications is to store data that is read more often than it is written. the expense of sending redundant cached data is avoided. most recent version. but the triad always has the master copy. The metadata used for synchronization and the actual cached data are distributed separately. it is considered to have the authoritative.IMPLEMENTING RESIN 4. To improve synchronization performance. If it does. The key hash determines the triad owner so finding an m-node is easy and unambiguous. The spoke servers send the key to the triad and ask the triad to verify if the requested spoke severs have the latest copy of the object.0 highly efficient in managing the cache.0 uses a fixed size metadata structure called an m-node that includes hashes of both the cache key and the cache data as well as a version number used for synchronization purposes. If it does have the latest version of the object. By decoupling the storage of the metadata from the data. then the triad will send the latest version of the object. While there is only one owner. Spoke servers can have local copies of items. the m-nodes are replicated on all the triad servers for redundancy in the case of failure or maintenance. If a triad server owns an m-node. The spoke servers have a key that behaves like a version key and has a unique identifier. but only one triad server “owns” the management of any individual m-node.0's fixed size m-node structure for the distributed cache Each server in a cluster maintains a collection of m-nodes. This key innovation makes Resin 4. At a high level. Resin 4. This system is efficient because large objects are not copied around unless they are needed. Resin 4. Page 8 of 21 . the structure of the cache can be viewed as a collection of m-nodes.0 is able to avoid unnecessary network communication because only the metadata needs to be compared when checking for updates to the cache. 
A cache can be viewed as a map from a key to a value. Cache data is distributed in a similar way. then it can use the local copy in the cache.0'S DISTRIBUTED CACHE In Resin 4. Resin 4. The following is an accurate detailed account on how this is accomplished. Thus real updates to cached data are often relatively infrequent and so by sending only metadata during synchronization.0 the triad servers handle the synchronization and storage of cached data.

Applications running on Resin 4. the Resin 4. The same triad server or separate ones may own the m-node and the data. The triad servers persist the cache data to Resin optimized database so that recovery from transient errors is fast. the server hashes the cache data to find its owner then sends the data itself to its own triad server. The server that is performing the update does not have to wait for the triad to replicate or persist the data. The server computes the m-node's owner using this key hash and then sends the new m-node to the triad server. only replication. When using one of these facilities.except that because there is no version associated with the data value itself (versions are only a part of the m-node).0 server that received the web request interacts with the distributed cache by communicating with the triad servers in the cluster. Page 9 of 21 . no synchronization is necessary. the server hashes the key using the SHA-256 algorithm. The process to add a new cache entry is identical. Similarly. this process is performed asynchronously. they transmit the data to the other triad servers for redundancy. Once the server or servers receive the update. An application running on a dynamic server updates a cache entry by contacting the triad servers.0 may use the cache for a variety of reasons such as to cache an object or store session data. To update a value in the cache.

which owns the m-node. When the m-node owner server responds to the requesting server. Because the data is replicated on all the servers. each cache has a variety of timeout values and other configuration options that customize the behavior for individual applications. When the application tries to obtain a value from the cache. The application would like the list to be updated at least every minute to make sure that new events are shown to the users. CUSTOMIZING CACHE CONFIGURATION Multiple caches can be configured within each server for different applications and different requirements for each cache. The triad server looks up the m-node and inspects the value hash to find the actual data. The key hash determines the triad server. the triad server that owns the m-node also has a copy of the value data. Page 10 of 21 .An application on a dynamic server retrieves a cache entry for the first time by contacting the triad. Specifically. Setting expire timeout to 1 minute would achieve that goal. it includes both the m-node and the associated data. an application might cache certain computations based on a database query such as a list of current events. The server then contacts the owner’s triad server to request the m-node. the server first hashes the key that the application requested. For example.

Timeout Expire timeout Description The maximum time without an update that an item is considered valid. Specifically. Resin 4. One requirement is that sessions be able to expire after a certain amount of time. The maximum time without an update or a read that an item is considered valid.0's load balancer implements sticky sessions for this reason. Sessions are used by applications in a very predictable way. By setting an idle timeout on the cache storing the session data. this is how long each lease is issued. These values apply to all entries contained within a particular cache. Page 11 of 21 . Load balancers can force this behavior by implementing sticky sessions. so optimizations can be made in the network infrastructure and the cache to improve performance. The cache can also be optimized to take advantage of sticky sessions. However in this case. it can be given a non-exclusive lease on the session data. If the user interacts with the application using the same server for the duration of the session. Default Infinite Local read timeout 10ms Idle timeout Infinite Lease timeout 5min HTTP sessions are another interface to the distributed cache. How long a value is used locally before checking with the triad for updates. In the normal case that a single server is the only one to read and write a session. If items are leased with this cache. Configuration options for various distributed cache timeouts. a session is considered live when it is updated as well as when it is simply accessed. Typically used for sessions. that session data can be kept in local memory for fast access. Typically used for sessions. the sessions will timeout properly. However. HTTP session data has different requirements than most object cache data. a way of routing requests from the same user to the same server during a single session. session data can be stored in the cache under a key using the session ID.

The session itself is stored as a cache entry, so the triad member that owns the entry's m-node will issue a timed lease on the m-node to the server handling the first request in the session. Until the lease expires, that server can assume that its local copy of the cache is up to date and therefore never needs to use the network to check the validity of its copy with the triad. To maintain reliability, the lease-holding server still writes all of its changes back to the triad. Other servers may also access and update the session. If any server other than the lease holder makes any changes, those changes are sent to the lease-holding server by the triad as they are made. Thus in the normal case, access to session data is much faster, but reliability and fail-over are maintained if a server fails.

RESIN 4.0 LOAD BALANCER

Resin 4.0 includes a software load balancer that can distribute requests to servers in a cluster. This load balancer is simply a Java EE application, so it runs in its own separate cluster, potentially on a number of servers. Resin 4.0 clusters maintain a list of the dynamic servers available (the cloud topology) in a special distributed cache entry. The load balancer cluster acts as a read-only client of the application cluster's distributed cache, with the application cluster sending updates to the load balancer cluster each time a dynamic server is added or removed. When one of the load balancer servers distributes requests, it simply looks in the distributed cache to see what application cluster servers are available. The triad servers maintain the list of servers and their order, so if new servers are added or removed, the load balancer will follow that order.

The load balancer can distribute requests using a number of algorithms:

• Round robin
• Server load
• Green load balancing

Round robin is the simplest algorithm for load balancing, in which the next request is sent to the next server on the list.
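Round robin can be sketched in a few lines. The server names are placeholders; the real balancer reads its server list from the triad-maintained distributed cache entry described above.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin picker; illustrative only. In Resin the list of
// available servers comes from the distributed cache, not a fixed list.
public class RoundRobin {
    private final AtomicInteger counter = new AtomicInteger();

    // Each call advances to the next server, wrapping at the end of the list.
    public String pick(List<String> servers) {
        int i = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

Taking the list as a parameter on every call mirrors the design above: if servers are added or removed in the cache entry, the next pick simply follows the new list.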

The server load algorithm uses a set of empirical data to determine which server in the cluster is the least loaded. Specifically, the number of active connections that the server is handling at the moment, along with the CPU load of the server, is included in a formula to calculate the server load. In addition to these measured values, servers may also be weighted explicitly to direct more or less load. This algorithm also takes into account the startup time of a server and allows it time to "warm up" so that the applications are ready to handle requests.

Green load balancing uses the same measurements as the server load algorithm, except that instead of directing the next request to the least loaded server in the cluster, this algorithm tries to load a single server up to a threshold level before moving to the next server. This approach means that only the number of servers that are needed to handle the current load will be used. Thus servers that are not necessary during periods of low load can be placed in low power mode to preserve energy and reduce wear.

The above is an example load balancer scenario with two tiers and multiple load balancers. The application cluster's triad pushes the current state of the application cluster to the load balancer cluster.

ACCESSING THE DISTRIBUTED CACHE FROM PHP APPLICATIONS

Resin 4.0 includes Caucho Technology's implementation of PHP, called Quercus. This implementation is written in pure Java and runs within the Java EE Servlet framework. PHP pages are executed in a fashion similar to JSPs. PHP contains library functions to implement HTTP sessions as well as object caching. By implementing these functions using Resin 4.0's distributed cache, PHP applications can also take advantage of the benefits of the architecture. Moreover, PHP applications can share cached data with Java applications running within the same cluster.
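The contrast between the server load and green algorithms described above can be sketched as follows. The load formula (connections plus weighted CPU) is an assumption for illustration; the paper does not give Resin's exact formula.

```java
import java.util.List;

// Illustrative comparison of the server-load and green algorithms.
// The load formula and the CPU weighting are assumptions.
public class Balancing {
    public static class Server {
        final String name;
        int activeConnections;
        double cpuLoad;       // 0.0 .. 1.0
        double weight = 1.0;  // explicit weighting to direct more or less load

        Server(String name, int conns, double cpu) {
            this.name = name;
            this.activeConnections = conns;
            this.cpuLoad = cpu;
        }

        double load() {
            return (activeConnections + 100 * cpuLoad) / weight;
        }
    }

    // Server load: always route to the least loaded server.
    public static Server leastLoaded(List<Server> servers) {
        Server best = servers.get(0);
        for (Server s : servers)
            if (s.load() < best.load()) best = s;
        return best;
    }

    // Green: fill one server up to a threshold before using the next,
    // so servers that stay idle can remain in low-power mode.
    public static Server green(List<Server> servers, double threshold) {
        for (Server s : servers)
            if (s.load() < threshold) return s;
        return leastLoaded(servers); // every server is at the threshold
    }
}
```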

The session API in PHP is straightforward and backed by the distributed cache, much in the same style as Java Servlet sessions. PHP also includes an API called APC to provide object caching, which is used widely in both custom and open source PHP applications such as MediaWiki, Drupal, WordPress, and many more. APC provides a basic key-value storage mechanism with timeouts, much like the Java Cache API. These functions are also implemented in Quercus using the Resin 4.0 distributed cache directly.

CLOUD-WIDE APPLICATION REPLICATION AND DEPLOYMENT

Administrators and developers can deploy applications to the cluster using a number of tools supported by Resin 4.0, such as Resin REST support, Resin Web Admin, Resin CLI, Maven, Ant, or Eclipse. These deployment tools distribute a Web Archive file (.WAR) to the entire cluster, avoiding the need to write tedious scripts. These tools contact one of the cluster triad members, which will then distribute the contents of the web application to the other two triad members. When all of the triad members have replicated the application files, they update the dynamic servers by pushing out the new data. Once all of the dynamic servers are updated, the new application can be started either automatically or manually. The applications are deployed using a transactional protocol, so if a server in the cluster does not receive the application successfully, it will not start.

When deploying an application, once it is deployed to one of the triad members, it is replicated across the other two (A). When all of the triad servers have the updated application, the data is pushed out to the dynamic servers (B). When a new spoke server is added to the cluster, it contacts the triad and downloads all the applications currently deployed to the cluster. If the application is updated, the administrator can then repeat this process. However, because most application updates include many of the same files, Resin 4.0 sends only the different files to the other triad and dynamic servers. This approach is known as an incremental update.
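An incremental update reduces to a hash comparison. The SHA-1 hashing and in-memory maps below are illustrative stand-ins for the Git-based repositories on each server, described in the next section.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative incremental update: only files whose content hash is
// missing on the receiving server need to be transferred.
public class IncrementalUpdate {
    static String sha1(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data))
                sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // newVersion maps file name -> contents; receiverHashes is the set of
    // content hashes the receiving server already stores.
    static List<String> filesToSend(Map<String, byte[]> newVersion,
                                    Set<String> receiverHashes) {
        List<String> toSend = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : newVersion.entrySet())
            if (!receiverHashes.contains(sha1(e.getValue())))
                toSend.add(e.getKey());
        return toSend;
    }
}
```

Files that are identical between two versions hash to the same value and are skipped, which is why typical updates transfer only a small fraction of the archive.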

This deployment mechanism also allows for graceful transitions to new versions of applications. By enabling versioned web application deployment, a newly deployed version of the application will only serve new sessions. All existing sessions will continue to be served by the previous version of the application. Only when a session expires or is explicitly invalidated will an active user be directed to the new version.

DISTRIBUTED GIT REPOSITORY

The application files that an administrator sends to the triad are stored persistently in a distributed Git repository. Junio Hamano and Linus Torvalds developed the Git repository format as a fast, open source, distributed version control system. Resin 4.0 uses the Git repository format on each triad and cluster server to store application and configuration files.

The unique properties of the Git repository provide benefits such as:

Transactional updates
If an update does not succeed or is interrupted, the new application or configuration files are not distributed to either triad members or dynamic servers. Only when the files are verified to be correct on each server are they made live.

High concurrency
The Git repository format ensures that data is written in isolation but is accessible by any number of readers. The Git repository uses secure hashing (SHA) to store files according to their contents. Thus file names are not used directly. When a new file is added to the repository, its contents are hashed and compared to the hashes of files already stored in the system. If this is the first time the contents of the file have been stored in the repository, a new entry is made and the file is stored. If a file with the same contents has been stored in the repository before, no additional space will be allocated. When directories are stored in the repository, they are stored as trees whose hash is the contents of the directory. By using this algorithm, the chance of corrupting files by multiple writers is virtually zero. This approach means that if any two attempts are made to store two files with different contents under the same name, they cannot conflict. Thus there is no danger of race conditions to write the same file twice. When the files are read, their contents are also checked against the value of the hash under which they were indexed. Finally, no partial or corrupted values can be written into the repository either.

Incremental updates
By examining all of the files in an application individually, the hash-based storage of files means that any files that are unchanged between two versions of an application are not stored twice, nor do they require retransmission over the network. When a web application archive (.WAR file) is deployed to the system, the triad member that receives it will examine the files contained within and store them individually into the Git repository. Using this repository structure, if these files are a new version of an existing application, only the new and updated files will be stored. Resin 4.0 also uses the version control features of the Git repository so that only new files are sent over the network. Thus, when updating application files, the triad member that contacts other triad members or dynamic servers compares only the hash values in their repositories first; only files whose hash values do not already exist on the receiving server are sent over the network to be updated. This reduces deployment time, network traffic, and storage requirements, resulting in faster performance.

Durability
Resin 4.0 replicates application files across the triad and dynamic servers. By writing application and configuration files to disk on all three triad servers as well as the dynamic servers, the application can survive numerous failures in the cluster.
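A content-addressed store in the style described above can be sketched briefly. The in-memory map stands in for Git's on-disk object store, and the verify-on-read mirrors the hash check mentioned for reads.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Sketch of Git-style content-addressed storage: blobs live under the
// hash of their contents, duplicates cost no extra space, and reads
// verify the contents against the hash they were indexed under.
public class ContentStore {
    private final Map<String, byte[]> objects = new HashMap<>();

    static String hashOf(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data))
                sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Storing identical contents twice allocates no additional space.
    public String put(byte[] data) {
        String h = hashOf(data);
        objects.putIfAbsent(h, data.clone());
        return h;
    }

    public int size() {
        return objects.size();
    }

    // Reads re-hash the contents, so a corrupted object is detected.
    public byte[] get(String hash) {
        byte[] data = objects.get(hash);
        if (data == null || !hashOf(data).equals(hash))
            throw new IllegalStateException("missing or corrupt object " + hash);
        return data;
    }
}
```

Because two writers storing the same contents compute the same key and writers of different contents compute different keys, concurrent writes cannot clobber each other, which is the "high concurrency" property above.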

0. but it was designed to take full advantage of an IaaS environment. it is private.24 also add the ability to deploy resin configuration snippets and libraries like JDBC drivers so you can rapidly configure a set of application servers.24 added the ability to configure triad members via standalone properties file.0 cluster A cluster containing a single pod is the easiest to set up for an application. This is a good place to configure the triad members. For many organizations this scenario will be sufficient to deploy. It is a good idea to create a Resin image in your IaaS environment. IaaS environments like Amazon EC2. i. Next the administrator starts a new virtual machine from an instance with the triad member preconfigured. Eucalyptus.hash values in their repositories first. For security the spoke server would need the same system key or the same users and password preconfigured. OpenStack and CloudStack allow you to store private server instance data inside of a REST service.. but wants the flexibility to add and remove servers without downtime or reconfiguration of the application server. This file can be a local file or located at any URL.e. Using a single pod is useful in more traditional situations where the organization controls all of the deployment servers. it will not be listed in the Triad list. Resin 4. and then the last bit of configuration is in the properties file. This file is then installed manually on the triad servers and Resin 4. USE CASES Setting up a basic Resin 4. The new spoke server contacts the triad server to join the cluster and download its configuration and applications.0 servers can then be started. Only files whose hash values do not already exist in the server are sent over the network to be updated.0. run. This REST meta-data is only available for that server. The configuration of the cluster starts by explicitly identifying the triad servers in the Resin 4. Xen Cloud. 
the triad is now able to accept new dynamic servers into the cluster. which is part of the IaaS environment. There is no need to preregister with the triad earlier. On start up of this virtual machine. This prevents a lot of copying and pasting between server boxes. Resin does not need to run in an IaaS environment. Once the spoke server authenticates itself.0. Resin 4. If the server that is getting started is a spoke server then. At this point.0 configuration file. and scale their applications. the triad will Page 17 of 21 . the operating system starts an instance of Resin 4. The triad accepts the new spoke server into the pod if its credentials match.
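As a concrete illustration, a standalone triad properties file of the kind described above might look roughly like the following. The key names here are invented for illustration only; the actual property names are defined by the Resin 4.0.24 documentation.

```properties
# Hypothetical sketch of a standalone triad properties file.
# The key names are assumptions, not Resin's documented properties.

# The three triad members of the pod (host:port).
triad_servers : 10.0.1.10:6800 10.0.1.11:6800 10.0.1.12:6800

# Shared secret a spoke server must present to join the pod.
cluster_system_key : change-me-cluster-secret
```

Because the file can also live at a URL inside the IaaS environment, each new virtual machine can fetch the same topology at boot rather than being configured by hand.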

So far no application files have been deployed, so only the configuration will be sent at this time. Next, the administrator deploys an application to the triad. The deployment tool adds the application to the triad's Git repositories, and the triad then pushes it to any spoke servers that are already part of the cluster. Should any new spoke servers join after the applications are deployed, they will receive the files when they register with the triad. Updates to the application files and new application versions will also be pushed out to any spoke servers that are part of the cluster when they are deployed.

The above sounds complicated, but Resin takes care of all of the details, and the configuration of everything is as simple as it can be. Working with a single server is the same as working with a cluster with one pod where that pod has one server in it. With Resin, configuring 64 servers is nearly identical to configuring one server. Then to add more than 64 servers takes a bit more work. All of the command line tools, web admin, etc. work the same.

Dynamically Re-provisioning Resources among Applications using Multiple Clusters

Each cluster in a Resin 4.0 deployment should serve the same set of applications. Depending on the current load of the applications and the needs of the users, different applications may need different levels of resources. For example, a single cluster may serve all of the applications related to checking out and making payments, while the applications that deal with browsing the site and searching are grouped together in a separate cluster.

For example, consider a commerce site that offers a searchable online catalog of goods and a shopping cart to check out and pay for the goods. During a normal week most customers browse the catalog section of the site, and only a small portion of those visiting the site end up making a purchase. The administrator may decide to dedicate a cluster of 8 machines to the browsing applications, but commit only 5 servers to checkout. During a sale, however, the checkout traffic can grow much greater in comparison to the browsing traffic. Based on the load seen by the servers, the administrator may choose to allocate 6 servers to checkout and 7 servers to browsing.

To reallocate the resources, the administrator would shut down 1 of the spoke servers in the browsing cluster. The other servers in the cluster remain available and can take over the sessions that were being served by that spoke server, because the triad has copies of all the session data. The administrator then starts a new spoke Resin 4.0 server in place of the one just shut down, but now adds it to the checkout cluster, either via the web admin tool or by specifying the correct properties in

the user meta-data area of the server. The checkout cluster's triad sends the application files so that the spoke server can now handle additional checkout traffic.

Above is an example of reallocating resources from one cluster to another. Normally, the checkout cluster load is smaller, so the administrator only allocates 5 servers (A). During a sale, the administrator moves a server from the browsing cluster to the checkout cluster (B).

Using virtual machines that are preconfigured to act as either checkout cluster servers or browsing cluster servers can make the process even easier. With a modern IaaS system like Amazon EC2, Eucalyptus, or OpenStack, this could be as easy as running a few command line tools and can happen in mere moments. The above is an example of starting a Resin instance from a Resin Amazon Image (ami-1b814f72) and passing it user data (user-data.properties) that will be available via a hidden REST meta-data server for the image. The user data file is a properties file that contains the triad topology as well as credentials to join the pod cluster.

Cloud Hosting Providers

A new industry of elastic cloud ISPs is emerging in which an organization can run a virtual machine on the ISP's hardware at a metered rate. The administrator of an organization creates a virtual machine image, which he or she then uploads to the ISP and starts via a web console or web service. Resin 4.0 can take advantage of this type of infrastructure to give organizations the ability to add new server capacity at will. Many organizations have only seasonal needs for additional computing capacity due

to sales, deadlines, or other scheduled events. These organizations can maintain smaller fixed server pools for their normal traffic level, but use the ISPs for the planned periods of higher traffic. Alternatively, some of these ISPs allow flexible capacity for unplanned traffic spikes as well.

Take as an example the case where an organization has a set of servers that it maintains full time but wants to add capacity during high traffic events. The administrator would configure a Resin 4.0 cluster with two pods, one for the internal servers and one to be run on the ISP. Each network contains one pod of the cluster, and the pods are connected to allow sharing of cache and application file data. The servers in each pod only have to consult their triad for most operations, leading to fast access to the cache and sessions. The internal pod runs full time, but the ISP pod may remain shut off or have few dynamic servers most of the time. When a high traffic event occurs, the administrator starts the ISP pod and adds new dynamic servers to it to allow for additional capacity. To ensure security, cluster pods can be configured to send encrypted and signed messages when in an environment such as a cloud ISP.

The above shows a single Resin 4.0 cluster deployed both to an internal network and a cloud ISP.

CONCLUSION

Resin 4.0 provides a web application platform to exploit the full capabilities of cloud computing. Resin 4.0 adds a number of features to enable cloud computing for web applications written in Java and PHP. Cluster-wide application deployment and dynamic clustering make the task of maintaining virtualized deployments much simpler for administrators by providing easy deployment and scaling tools. While this paper describes the mechanisms behind these features, such as distributed caching and cluster-wide application deployment, the developer does not have to tailor any code to Resin 4.0 and may continue to create standard Java EE or PHP applications. By offering these features, Resin simply improves the capacity, reliability, and availability of those applications.

ABOUT CAUCHO TECHNOLOGY

Caucho's relentless quest for performance and reliability paved the way for Resin® to become one of the leading open source Java application servers since 1998. Our engineers' dedication to the development, support and evolution of the Resin Java EE 6 Web Profile continues to uphold our reputation for quality, performance and manageability. We've helped over 4.7 million web sites worldwide, including start-ups, governments and Fortune 500 companies, build and grow their business with one of the most flexible, rock-solid and powerful application servers, Resin. Caucho is an Oracle Java EE licensee focusing on Web Profile and Cloud solutions. Our offices are located in San Diego and San Francisco, California.
