Applying Web Service and Windows Clustering for High Volume Risk Analysis
Sivadon Chaisiri, Juta Pichitlamken, Putchong Uthayopas, Thanapol Rojanapanpat, Suphachan Phakhawirotkul, and Theewara Vorakosit
 High Performance Computing and Networking Center, Faculty of Engineering, Kasetsart University, 50 Phaholyothin Road, Bangkok 10900, Thailand {g4665304, juta.p, pu, g4765402, g4565065, g4685034}@ku.ac.th
Abstract
We present the development of a distributed system to calculate the Value at Risk (VaR) measure when a large number of users are present. A scalable architecture based on Windows clustering and web services is proposed. In addition, we develop a load balancing algorithm to distribute the workload among the compute nodes in the Windows cluster. The experimental results show that our system can substantially speed up the VaR calculation. In addition, it offers good scalability. This work provides an example of how to deploy standard web service and Windows clustering technology to offer a cost-effective and scalable solution for demanding financial applications in practice.
1. Introduction
Many commercial organizations routinely perform computing-intensive calculations. For example, credit card companies employ data-mining techniques to detect credit card fraud. Commercial banks need to compute value-at-risk measures to comply with the Basel II Capital Accord. These calculations generally require a large amount of historical data and thus high computing power to analyze them. Deploying such applications on a centralized server is one way to get the results, but it has some drawbacks.

In an actual commercial information system, software is usually implemented as a server-side application that follows some business logic, manipulates some data, and is hosted on only one centralized server. There may be many users and other applications requesting the services that this server provides. If the applications consume a lot of computing power, the overall performance of the server will degrade, and the server may become temporarily unavailable when there are too many requests. This limitation depends on many factors, such as the server’s specifications (e.g., processor capability and amount of available memory) and the runtime software infrastructure (e.g., operating system, application server software, and database software). In addition, using one centralized server poses the risk of a single point of failure: when the server fails, the whole system goes down too.

To get more computing power, one can buy a new high-end server. However, the single point of failure problem still exists, and the total cost of ownership (TCO) also increases: not only the purchase price of the new server but also the cost of upgrading, maintenance, service, and repair.

Another high performance computing (HPC) solution is clustering. A cluster is a group of computers connected via a high speed network. Computers inside the cluster share resources such as storage, data, software, and computing power.
When an application is running on a cluster, it can harness the shared computing resources, which act as a unified pool. An application can use this resource pool to run much faster than is possible on a single server. Furthermore, we can build a cluster from commodity personal computers that are much less expensive than a high-end server. In addition, the single point of failure problem is avoided because many computers inside the cluster can provide the same functionality; if one computer fails, others are still able to handle its functions.

In this paper, our cluster is built on the Microsoft Windows platform, which we choose because the operating system is robust and very easy to configure. The application that we test in this work is a real-world application developed for the Thai Bond Dealing Centre (ThaiBDC; see www.thaibdc.or.th). The application calculates the Value-at-Risk (VaR) for a portfolio of stocks traded on the Stock Exchange of Thailand. VaR is an estimate, with a user-defined degree of confidence, of how much one can lose from one’s portfolio over a given time horizon [1]. It aims at making a statement of the form “We are c percent certain not to lose more than V dollars in the next m days.” ThaiBDC has already provided the VaR calculation service commercially, but on different platforms. Thus, this application will be used concurrently by a large number of users. Moreover, each request generally requires a lot of computing power because the calculation involves manipulation of large two-dimensional matrices.

To achieve the cluster environment, we implement the VaR application as a web service that is deployed on every computer inside the cluster. This means that the cluster is able to handle many incoming requests simultaneously. In addition to the application, we also need a load-balancing mechanism to distribute users’ requests to the computers inside the cluster.

We propose an architecture for a cluster with a load balancer that is designed to handle a large number of requests. In our experiments, the cluster and software maintain a good response time as the number of requests increases. We expect that our proposed architecture can be efficiently applied to other business applications as well.

This paper is organized as follows: Section 2 details related work. We describe our design in Section 3. The experimental results are in Section 4, and we conclude in Section 5.
2. Related work 
Currently, most applications are designed to run on a single processor. This is usually termed sequential programming. To efficiently utilize the computing resources shared in a cluster, applications must be implemented using parallel programming techniques, where the processing is distributed across multiple processors in the cluster. Additionally, the processors can communicate with each other and exchange data. Widely used tools for building a parallel application are Application Programming Interfaces (APIs) such as the Message Passing Interface (MPI) [2][3] and Parallel Virtual Machine (PVM) [4]. A major drawback of parallel/distributed programming is that it takes more time and effort to design and implement applications than sequential programming. However, modern development frameworks (e.g., J2EE [5] and the .NET Framework [6]) are appealing because they provide many tools and APIs that support business application implementation. By applying such a framework to parallel program development, a programmer can develop a business application that utilizes parallel computing power much faster than before.

When we build a cluster for parallel programming, we generally choose a single operating system that provides the runtime infrastructure for resource sharing and communication. Although it is possible to build a cluster with heterogeneous operating systems, having only one type of operating system is much simpler to maintain. Two types of operating system are widely used for clustering:

1) Linux cluster: This type of cluster is based on commodity hardware, open-source software, and Linux. It is also widely known as a Beowulf cluster [7]. NPACI Rocks and OSCAR [8] are examples of software for building Linux clusters. The main advantages of this type of cluster are its highly configurable architecture, high scalability, and low software licensing cost.
However, skill in using the Linux operating system is required to maintain such a cluster. Hence, this type of cluster is used mostly in technical and academic environments where the required skill and manpower are available cheaply.

2) Windows cluster: Microsoft Corporation provides a cluster solution via the Microsoft Windows operating system. Much pioneering work has been done by researchers from the Cornell Theory Center (CTC) on how to build and use large-scale Windows clusters [9]. Although building a Windows cluster means users have to pay some software costs, it may be more suitable for commercial environments for a number of reasons. First, many companies already have a lot of Windows-based expertise in-house. Second, building and maintaining a Windows cluster is easier than a Linux one, thanks to a rich set of GUI-based tools. Third, there is a rich set of robust and standard tools (such as the Java Development Kit and Microsoft Visual Studio) that enable programmers to quickly develop an application under this environment. Finally, all the standard technology (e.g., web services and Java) is very robust, as it has been tested for years in mission-critical commercial environments.

Currently, there is a rapidly increasing need for organizations to exchange information with each other. This so-called business-to-business (B2B) communication allows a much faster and highly automated flow of information to speed up processing in applications, e.g., package tracking and customer relationship management (CRM). Hence, there is a need for a standard technology that allows applications to communicate with each other over the Internet regardless of platform differences. Web service (WS) [10] is a highly standardized technology invented by W3C for this purpose. A series of standards covers web service technology, including areas such as naming, WS description language, WS messaging, and WS security. The advantage of web service is that it is a well-supported standard with a rich set of development tools and environments. In this work, web service is a key technology applied to provide parallel/distributed execution of the VaR calculation.

Although web service can be utilized as a common communication infrastructure, parallel execution of the application still requires a mechanism to distribute application execution to each compute node inside the cluster. The balancing of the execution load among compute nodes is one of the determining factors of the overall system performance. For example, a poor load balancing algorithm may cause some computers to bear a large load while other computers take small loads or sit idle. Load balancing is a field of active research. The random balancing and round robin balancing algorithms are examples of well-known and widely used methods.

The random balancing algorithm randomly chooses a computer to handle a new incoming load. The appeal of this algorithm is its simplicity and efficiency on a large cluster. However, it generally performs poorly on a small cluster, and a serious load imbalance can be caused by a bad randomization algorithm. The round robin algorithm has a strategy to prevent starvation, a situation in which some computers are idle for a long time. This algorithm keeps a circular queue of computers. The incoming loads are distributed to those computers sequentially and circularly.
Although the round robin algorithm is efficient, it may cause some computers to take on too much load when the cluster consists of processors with different speeds. Hence, we use an algorithm that distributes each incoming load to the computer that has the least load. We describe our algorithm in more detail in the next section.
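The three selection strategies discussed in this section can be contrasted in a minimal Python sketch. It is illustrative only and is not the implementation used in this work: the node names, the `loads` dictionary of outstanding requests per node, and all function names are hypothetical, and "load" is assumed here to mean the number of requests currently running on a node.

```python
import itertools
import random


def pick_random(loads, rng=random):
    """Random balancing: choose any node uniformly, ignoring current load."""
    return rng.choice(sorted(loads))


def make_round_robin(nodes):
    """Round robin: walk a circular queue of nodes, one request per node in turn."""
    cycle = itertools.cycle(list(nodes))
    return lambda: next(cycle)


def pick_least_loaded(loads):
    """Least load: choose the node with the fewest outstanding requests.

    Ties are broken by node name so that the choice is deterministic.
    """
    return min(sorted(loads), key=lambda node: loads[node])


# Hypothetical cluster state: outstanding requests per compute node.
loads = {"node-a": 3, "node-b": 0, "node-c": 1}

next_round_robin = make_round_robin(loads)
round_robin_order = [next_round_robin() for _ in range(4)]

least_loaded_order = []
for _ in range(3):
    node = pick_least_loaded(loads)
    loads[node] += 1  # the request is now running on that node
    least_loaded_order.append(node)
```

In this sketch, round robin sends the first request to `node-a` even though it is already the busiest node, whereas the least-load rule keeps routing work to `node-b` and `node-c` until their loads catch up.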
3. Design and implementation
3.1. Risk analysis process
The process used in this work for the VaR calculation is shown in Figure 1.
Figure 1. Scalable VaR calculation process
In the VaR calculation tool, users input their portfolio information (portfolio details, the holding period, and the confidence level) through a user interface module. In this work, there are two user interface modules: one is a web application developed using ASP.NET, and the other is an Excel worksheet that connects to the VaR calculation system through a web service. Once this information is available, the VaR calculation starts. The algorithm used for the VaR calculation is the Single-Factor Capital Asset Pricing Model (CAPM) (cf. [11]). The VaR calculation needs historical data from the Stock Exchange of Thailand (SET), such as the daily SET index and closing prices. This data is available in a local database kept at ThaiBDC sites. An updated version of the data is transferred daily from SET to ThaiBDC.
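To make the inputs and outputs of this step concrete, the following Python sketch shows one common way a single-factor parametric VaR can be computed. The text above names only the model, so everything else here is an assumption: each stock's beta is estimated by regressing its daily returns on the index returns, residual (stock-specific) risk is ignored, index returns are treated as normally distributed, and the one-day figure is scaled by the square root of the holding period. The function names, symbols, and sample numbers are hypothetical and are not ThaiBDC's production method.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def beta(asset_returns, market_returns):
    """Slope of asset returns on market returns: cov(r_i, r_m) / var(r_m)."""
    a_bar, m_bar = mean(asset_returns), mean(market_returns)
    cov = sum((a - a_bar) * (m - m_bar)
              for a, m in zip(asset_returns, market_returns))
    var = sum((m - m_bar) ** 2 for m in market_returns)
    return cov / var


def capm_var(positions, asset_returns, market_returns, confidence, holding_days):
    """Parametric VaR driven only by the market factor (assumption: no residual risk).

    positions: {symbol: market value held}
    asset_returns: {symbol: daily returns aligned with market_returns}
    market_returns: daily index returns
    """
    # Beta-weighted exposure of the portfolio to the index, in currency units.
    exposure = sum(value * beta(asset_returns[symbol], market_returns)
                   for symbol, value in positions.items())
    sigma_m = stdev(market_returns)          # daily index volatility
    z = NormalDist().inv_cdf(confidence)     # one-sided normal quantile
    return abs(exposure) * sigma_m * z * sqrt(holding_days)


# Hypothetical data: five days of index returns and two stocks.
market = [0.010, -0.020, 0.015, 0.000, -0.005]
returns = {
    "AAA": [2 * r for r in market],  # moves twice as much as the index
    "BBB": list(market),             # tracks the index exactly
}
positions = {"AAA": 100_000.0, "BBB": 50_000.0}

var_10d = capm_var(positions, returns, market, confidence=0.99, holding_days=10)
```

Here `var_10d` corresponds to the statement "we are 99 percent certain not to lose more than `var_10d` over the next 10 days" under the stated assumptions. With real data, the return series would be derived from the daily SET index and closing prices held in the local database.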
3.2. System architecture
The architecture of our system is depicted in Figure 2. The system consists of three components: a front-end node, a compute node, and a client node.
 Front-end node:
A front-end node performs many functions. First, it provides security and access control for the cluster. Second, it acts as a web portal and web server for web-based applications that use the cluster. Finally, the front-end node also acts as a load balancer for the VaR application. The load balancer is a service that distributes the incoming requests among the compute nodes.
Compute Node
