Polish-Japanese Institute of Information Technology
Chair of Software Engineering

Master of Science Thesis

Design patterns in application integration based on messages
by

Mariusz Pikula, Adam Siemion

Supervisor: Dr Piotr Habela

Warsaw, 2007

Abstract
This thesis is devoted to the issues connected with application integration. It presents the importance of knowing the design patterns connected with this subject. At the beginning, the thesis introduces the reader to the area of application integration and the different types of problems connected with it. It explains why the task of integration is undertaken and why it can be very difficult and complicated. Next, the integration styles are presented, starting with the oldest and simplest ones, going through more complex ones and ending with integration based on messages, which forms the main area of interest of this thesis. After familiarising the reader with the basics of this integration approach, the thesis aims to provide the essential theoretical background, which covers the basic terms and design patterns connected with integration based on messages. Afterwards, the practical application of the discussed terms is shown using a case study describing system integration issues. The last part of the thesis is dedicated to the integration platform that has been created as an integral part of the thesis. The description of this platform contains information about the technologies used, the application architecture and an example of its usage based on the case study presented earlier.

Contents

1 Introduction
  1.1 Loose Coupling
  1.2 Case study
2 Integration Styles
  2.1 Application integration
  2.2 Application coupling
  2.3 Integration simplicity
  2.4 Integration technology
  2.5 Data format
  2.6 Data timeliness
  2.7 Data or functionality
  2.8 Asynchronicity
  2.9 Styles of integration
    2.9.1 File Transfer
    2.9.2 Shared Database
    2.9.3 Remote Procedure Invocation
    2.9.4 Messaging
3 Messaging based systems
  3.1 Message
  3.2 Message Implementations
  3.3 Message Channel
  3.4 Message Routing
  3.5 Message Transformation
  3.6 Message Endpoints
  3.7 Synchronous and asynchronous communication
4 Design patterns in the application integration
5 Enterprise Service Bus
  5.1 Definition
  5.2 Message Oriented Middleware
  5.3 Tightly coupled interfaces
  5.4 ESB aims
  5.5 ESB capabilities
  5.6 ESB components
  5.7 Open source ESB products
  5.8 ESB integration patterns
    5.8.1 VETO pattern
    5.8.2 VETOR pattern
    5.8.3 Two-step XRef pattern
    5.8.4 Forward Cache Integration pattern
6 Case study: Messaging systems work principles
7 Implementation
  7.1 The origin of the name
  7.2 Concept
  7.3 Technology
  7.4 Classification
  7.5 Architecture
  7.6 Processing sequence
  7.7 Configuration
    7.7.1 Configuration model
    7.7.2 Configuration example
    7.7.3 More on transformers and routers
    7.7.4 Performing the configuration
  7.8 Integration design patterns supported by pESB
  7.9 Problems encountered during the implementation
8 Summary
Bibliography
List of Figures
List of Tables
Index

Acknowledgements

We would like to thank Dr Piotr Habela, who conducted a class on software integration (Technologie Internetu), which inspired us to seek more information about this subject. I, Adam Siemion, would also like to thank Remigiusz Weska, with whom I have had the pleasure to work, and the company IMPAQ Sp. z o.o., where we both worked on a project that aimed to choose the best messaging based integration solution to fulfil the requirements of our customer. That experience also motivated me to delve into the subject of integration.

Preface
The business environment is volatile and constantly changing, and it requires great flexibility from its participants. This flexibility requires the will to cooperate in order to maximise profits or obtain other types of benefits. Cooperation between two business entities involves information exchange and collaboration between the IT systems used by those entities. This collaboration might be limited to the exchange of data or might be as complex as using the partner's software functionality. Combining two separate IT systems into one entity, capable of exchanging information and maintaining a constant data flow between them, can be a very complicated task, especially in the business environment. This environment has many features that contribute heavily to the difficulty of this task. First of all, the system used by a business entity might not be an up-to-date IT system, but a legacy system designed in the early 90s or even earlier. Such systems can be so complex and so vital for the business that it is not possible to simply replace them with new ones. Very often their complexity makes it too expensive to design, implement and introduce a new system with the same functionality as the previous one. The human factor also has to be taken into account: workers who have used the old system for a couple of years might resist the introduction of a new system that replaces the one they are used to. Moreover, business entities that want to cooperate might be located in distant geographical locations. This brings additional issues into consideration, such as communication reliability, security, communication error handling, non-repudiation and so on. The reliability of the communication also depends on third party business entities such as external Internet providers, who are responsible for maintaining the Internet connection.
Thus, there are also external factors that have to be taken into account when coping with this problem. Designing a solution that allows the applications used by both business entities to be combined is another issue that must be considered. First of all, the designers of such a solution must have deep knowledge of the business processes in each of the entities in order to design an effective solution. Secondly, both of the interested entities must agree on the solution. This can be difficult to achieve, because very often the representatives of each business entity put pressure on the designers so that the resulting solution is based mainly on their own IT structure (internal data model, processes, etc.). Creating a design that would satisfy all of the participants can be a demanding and difficult task in itself. Apart from reconciling these interests, the designers must also concentrate on the functionality, flexibility and reliability of the solution, which makes the task even more difficult.

After the integration design is completed, an appropriate tool must be chosen to put the whole solution in motion. The IT market offers many solutions that are specifically designed to solve this type of problem (i.e. building integration solutions). We can choose from both open source solutions such as Mule or ServiceMix and proprietary products from IBM, Oracle, Sonic Software, BEA and so on. Upon taking a closer look at those products and considering the problems and challenges of application integration, we decided to make it our object of interest and the topic of this thesis. Having a limited amount of time and resources, we did not aim to create a solution that could compete with those made by large IT companies. Instead, we decided to take a different approach: to create a lightweight integration platform. It would provide the basic functionality needed for application integration combined with ease of use and a short learning curve, so that a potential user with programming knowledge would be able to create an integration solution effectively without sacrificing a large amount of time to learn the software itself. The simplicity of our platform, in comparison to the large and complicated tools offered by large IT companies, would be its greatest strength. We aimed to create a tool, based on widely available technologies, that could be used as a base for further development by adding extensions to it.
The work has been structured in such a way that the reader gets an overview of the whole topic of integration, and especially of messaging systems, before moving on to the description of the created integration solution. The thesis is divided into two main parts. Part one contains chapters one, two, three and four, which describe the basic theory behind the subject of integration. Part two contains chapters five, six and seven. It introduces the reader to the integration solutions currently available on the market, such as Message Oriented Middleware (MOM) and Enterprise Service Bus (ESB), which rely heavily on the concepts described in part one, presents a case study that shows the terms and concepts from part one from a practical point of view, and describes the integration platform we created. The eighth and last chapter contains the summary of our work.

1. Part one begins with an introduction to the integration topic in chapter one.
2. Chapter two concentrates on different integration styles, with a detailed description of each style, its advantages and disadvantages, and the situations in which it might be used.
3. Chapter three concentrates on one particular integration style: messaging. This is the style used in our own integration solution. In this chapter we try to give a more detailed description of it. The description covers the two modes of communication available in this style, synchronous and asynchronous, and the basic concepts directly connected with it: Message, Message Channel, Message Router, etc.
4. Chapter four covers the concept of design patterns in messaging systems. It describes several selected design patterns, with possible ways of using them and variants that can be applied in different situations.
5. Part two begins with chapter five, which introduces the concept of an Enterprise Service Bus (ESB). First, we briefly define what an ESB is; then we focus on the basis of the ESB, Message Oriented Middleware (MOM), on the advantages of introducing an ESB and on its capabilities; finally, we provide a couple of ESB integration patterns.
6. Chapter six presents a case study that shows the use of the concepts from the previous chapters in a real-life business example. It starts with a general overview of the problem and goes through all phases of the integration process up to the final solution.
7. Chapter seven is devoted to the description of the created integration solution. At the beginning, we explain the concept behind this approach to integration, the technology it uses and its architecture. To better illustrate how the system operates, we provide a detailed sequence diagram with a description of the sequence of actions that take place when two systems communicate using our solution. We would also like the reader to be able to solve real integration problems using our product, so as an example of usage we provide an imaginary integration problem with a possible solution using our system.
8. Finally, chapter eight contains the summary of our work, with final remarks regarding the goals that we managed to achieve and the ones that have not been achieved, along with possible reasons. It also contains suggestions for further development and some ideas for extensions that could make the existing tool more usable and enrich it with new functionality.

Chapter 1

Introduction
The task of integrating computer systems emerges as a response to a frequent need to connect multiple separate computer systems. Integrated systems are supposed to cooperate and provide unified functionality. Moreover, it is expected that the new, integrated system will operate on the data gathered by all of the participating computer systems. Many factors make this a difficult and challenging task. Systems that are to be integrated might operate within different organisations, might be built using different technologies, might be legacy systems with no maintainers, etc.

The need to integrate might arise for many reasons, for example the merger of two companies, the merger of multiple branches of one company, or the aim to have one system responsible for coordinating others. Another reason might be cost reduction. One unified system is cheaper to support and more effective in terms of business processing than two systems working separately [16]. The expected results of the integration are usually as follows:

• cost reduction
• increased effectiveness
• improvement of business processes
• unified flow of information

Applications that are to be integrated can be of different origin. They can either be custom made according to the customer's specification or bought as a commercial off-the-shelf (COTS) application and tailored to meet the requirements of the buyer. Integrated systems can differ in other ways as well. They might be written in different programming languages and be of different ages. One of those systems, or even both of them, might be a so-called legacy application [7]: computer software created a couple of years ago, no longer being improved, quite often with no documentation available, and designed as a local standalone system. Usually, those systems provide no communication capabilities. Network communication and data exchange were not taken into consideration during the design of this kind of system.

What makes the process of integration even more difficult is the fact that:

• systems which are to be integrated can be spread geographically, with the machines on which they are running placed in distant locations
• systems might be written in different programming languages
• systems might not have any documentation
• systems might be running in different environments (operating systems, hardware configurations, etc.)
• systems might be managed by different organisational units or even companies
• companies that provide proprietary software may not be willing to participate in the integration process, because they do not want to reveal the internals of their products
• workers may not be willing to adapt to changes made by the integration process

Reliable communication between integrated systems must be assured using the available communication facilities. The integration should be introduced without making too many changes to any of the existing systems. There is a very important reason behind this limitation. Computer systems such as those running within financial institutions and managing data crucial for their activities cannot be redesigned and reimplemented from scratch. It would be too expensive and too risky for the business. The complexity and scale of those systems make recreating them more costly than an institution can afford. All those problems and difficulties must be taken into consideration when an integration solution is being designed for a computer system. For the above reasons, the integration of multiple different computer systems can be considered an especially difficult and challenging task.
The aim of this thesis is to present the subject of application integration from the theoretical point of view and to use the presented knowledge to build a lightweight, easily configurable solution that allows different systems to be integrated in a quick and efficient way. The goals to be achieved and the basic concept of this system have been outlined in the preface. Before moving on to the description of the system architecture, the way the system has been designed and implemented, and the concepts and goals set while creating it, a brief overview of and background on the subject of system integration will be given. This overview should give the reader an overall view of the topic and knowledge of the concepts and terms used throughout this document, such as loose coupling, design patterns and others that are essential for understanding the basics of system integration.

1.1 Loose Coupling

Loose coupling [11] has recently become a very popular term directly connected with application integration. It can be said that loose coupling is one of the key concepts in this area. This strong relationship makes it necessary to explain the term before moving on to the next chapters, so that the topic described in this thesis can be fully understood. The main idea of loose coupling is that two communicating parties (systems or applications) should make minimal assumptions about each other. In other words, the less applications need to know about each other in order to cooperate properly, the better. Applying this principle should, in the long term, reduce the cost of maintaining an integration solution, make it more effective and reduce the cost of changing the integrated applications. Loosely coupled applications can be modified independently (to some degree, of course), which means that changes made within one application do not force changes in the coupled application. Therefore, using loose coupling in application integration makes an integration solution more flexible and change tolerant. This flexibility derives from the fact that connected applications do not have to be adjusted after changes are made in one of the systems taking part in the communication. If the processing of one application is based on information about the internal business logic of the other application (e.g. its data format), then any change made to that logic will automatically force changes within the first application. Thus, the less the applications depend on this type of information about each other, the more flexible the communication between them can be. Application coupling is a multidimensional issue that covers more than just the design and implementation of an application.
Bearing that in mind, loose coupling between applications must be viewed from different aspects. For example, integrated applications can be loosely coupled in time (they do not have to be running at the same time to cooperate) by using queues. They might also be loosely coupled in format (every system might have a different data model) by using a component responsible for transforming messages exchanged between the applications.

Measuring the degree of loose coupling is a separate problem. When an integration solution is designed with loose coupling in mind, it is desirable to have the means to measure how loosely coupled the integrated applications really are. To do this, a measurable feature of both applications must be found, one that allows a measurement to be conducted and gives results that can be compared. A feature that could be used for this purpose is the number of changes that can be made within an integrated system without the need to interfere with the integration solution as a whole. Changes can concern business processing within an application, data format and so on. The more changes can be made within an application without altering the solution itself, the higher the degree of loose coupling. Of course, many other measures can be thought of and used, but this one (the number of changes) concentrates on the key concept of loose coupling and, from that point of view, might be considered the best one to use. Moreover, this measure emphasises the main idea of loose coupling: the integrated systems should make as few assumptions as possible about each other. The fewer assumptions are made, the more changes can be made inside a connected system without affecting the operation of the others.

The opposite of loose coupling is tight coupling. Tight coupling can be illustrated using local method invocation as an example. A local method call imposes many assumptions on the caller:

• the called method must be written in the same programming language as the calling method
• the exact number and types of the arguments of the called method must be known
• the called method must run in the same process
• both the calling and the called method must use the same data types
• both the calling and the called method must use the same internal data representation format

The consequence of those assumptions is that tightly coupled local method invocation differs in many ways from loosely coupled communication, for example communication based on messages. First of all, a local invocation starts to perform its activities immediately after receiving a call from the calling method, so there is no latency between the method call and the start of processing. Local method invocation is very fast, efficient and reliable in comparison to remote communication. Although the calling method must wait for the called method to finish its processing and return the result (a synchronous method call), the processing speed is much greater than in the case of remote communication. In the case of a remote call, the waiting time increases because of the distance, the latency of the connection between the applications, the quality of the connection medium and so on.
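The assumptions listed above can be seen even in the simplest local call. The following C# fragment is only an illustrative sketch: the method name AddToBalance and its arguments are hypothetical and do not come from any system described in this thesis.

```csharp
using System;

// Hypothetical local operation, used only for illustration.
// The caller must know the exact number, order and types of the arguments,
// and both sides share the same language, process and data representation.
int AddToBalance(int balance, int amount)
{
    return balance + amount;
}

// The call is synchronous: the caller blocks here until the result is returned.
// Passing, say, a string instead of an int would be rejected by the compiler.
int newBalance = AddToBalance(5000, 1000);
Console.WriteLine(newBalance);
```

Every one of these assumptions is checked by the compiler or guaranteed by the runtime, which is why nothing in the fragment deals with latency, lost connections or data conversion.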
Because of the communication speed and the fact that the method invocation is local, the issues that would come up in the case of a remote call (security, reliability, performance, efficiency) do not have to be worried about. When a local call is invoked, it is certain that it will reach its destination and will be performed. In the case of a remote call there is no guarantee that the call has reached its destination. There might be a communication failure, the connection might be broken, the remote host might be out of order, or the data might even be altered while being transferred from the sender to the receiver. All those problems make a remote call far less reliable than a local call. As can easily be noticed, local (tightly coupled) method invocation is simpler, more efficient and generates far fewer issues to worry about. The source of this efficiency and simplicity lies in the assumptions made by the applications about each other.

In order to make integration easier, many communication technologies use the same semantics as tightly coupled local method calls to invoke functionality and exchange data between applications. Those technologies include:

• .NET Remoting [15]
• Java Remote Method Invocation API (Java RMI) [8]

The main advantage of this approach is that it is easier for developers, who are used to invoking local methods, to start using these technologies. As a result, however, using those techniques may lead to making the same assumptions as in the case of local method invocation. It should be kept in mind that while those assumptions are valid in a local environment, many of them are not valid in the case of remote calls (making them valid, if possible at all, would greatly restrict the flexibility of the integration solution).

When integrating applications, it is usually not desirable for a calling application to wait until the results of the remote processing are available. Such waiting, at best, leads to delays that are very often not acceptable to the business entities taking part in the integration process. Moreover, in the case of a communication failure (e.g. due to loss of connectivity) an application can be left suspended, waiting for a response. This can lead to an application crash. The processing time of a remote call is also much longer than in the case of a local call, and the call as a whole is far less reliable. When a remote call is made it cannot be assumed, as in the case of a local call, that a response will be received, because there might be a communication failure, a crash of the remote system, and so on.

One of the assumptions made in the case of tight coupling is that both the called and the calling method are written in the same programming language. This assumption significantly reduces the scope of possible integration scenarios, because it is only possible to integrate applications written in the same programming language (e.g. it is not possible to integrate applications written in Java and C# using Java RMI). This restriction greatly reduces the flexibility and the range of applications of the technologies mentioned above.
What is more, it makes it impossible to integrate newly written systems with legacy systems. As can easily be seen, this approach is also burdened with problems that do not occur in the case of tightly coupled local calls. An example showing what problems can appear while trying to integrate systems with tightly coupled dependencies can be found in [6], along with a detailed description of this approach and the problems it might cause.

Loose coupling, apart from being a popular term, is also one of the core concepts of application integration. By making integrated systems less dependent on things such as the programming language in which they are written, their data model, internal business logic and architecture, it makes them more flexible and change tolerant. This approach allows one application to be modified, up to a point, without negative effects on the communication with the other system. This provides flexibility that cannot be achieved with tight coupling because of the restrictions mentioned earlier.

Apart from benefits such as flexibility, scalability and higher tolerance for internal changes, loose coupling also has some disadvantages. Designing a loosely coupled integration solution is a more complex task than designing a tightly coupled one. Many new problems need to be solved in order to build a loosely coupled integration solution effectively. The difficulty of designing such a solution also results in more difficult development, error tracing, debugging, etc. To sum up, loose coupling minimises the interdependency among systems in terms of time, information format and technology at the cost of more sophisticated design and implementation.

Now that the term loose coupling and its role in application integration have been covered, let us move on to a more detailed description of different integration styles, along with the benefits they bring and, of course, the threats that might occur while using them in integration solutions. But before that, a case study will be presented as a practical illustration of the discussed issues.
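Two of the dimensions of loose coupling discussed in this section, time and format, can be sketched in a few lines of C#. This is a minimal illustration under stated assumptions, not the design of any real product: the in-memory Queue stands in for a real message queue, and the names Send and Transform as well as the semicolon-separated message layout are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory channel standing in for a real message queue.
var channel = new Queue<string>();

// Sender: places a message on the channel and carries on.
// The receiver does not have to be running at this moment (loose coupling in time).
void Send(string title, string account, int amount)
{
    channel.Enqueue(title + ";" + account + ";" + amount);
}

// Transformer: converts the sender's format into the receiver's own data model,
// so neither side depends on the other's internal format (loose coupling in format).
Dictionary<string, string> Transform(string message)
{
    string[] parts = message.Split(';');
    return new Dictionary<string, string>
    {
        ["title"] = parts[0],
        ["account"] = parts[1],
        ["amount"] = parts[2]
    };
}

Send("My Transfer", "1234445321234321", 1000);

// Receiver: may start later and simply drains whatever is waiting on the channel.
while (channel.Count > 0)
{
    Dictionary<string, string> transfer = Transform(channel.Dequeue());
    string account = transfer["account"];
    string amount = transfer["amount"];
    Console.WriteLine(account + " <- " + amount);
}
```

If the sender later changes its message layout, only the transformer has to be adjusted; the receiver and the channel stay untouched. The price is the extra component that has to be designed and maintained.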

1.2 Case study

An example will now be discussed in order to illustrate the information provided in the previous section. It is a simple case study of a banking application integrated with a front-end Web application. The Web application allows users to transfer money between user accounts. The applications communicate using the TCP/IP protocol stack, which is the most common and widespread set of communication protocols. The presented source code snippet takes the following information as its input:

• transfer title
• destination account number
• transfer amount

The sample source code snippet, written in C#, could look like this:
    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;

    // Resolve the (hard-coded) address of the banking system and connect to it.
    String hostName = "www.mybankingapp.com";
    int port = 8080;
    IPHostEntry hostInfo = Dns.GetHostByName(hostName);
    IPAddress address = hostInfo.AddressList[0];
    IPEndPoint endpoint = new IPEndPoint(address, port);
    Socket socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
    socket.Connect(endpoint);

    // Convert the transfer data to raw bytes.
    byte[] amount = BitConverter.GetBytes(1000);
    byte[] transferTitle = Encoding.ASCII.GetBytes("My Transfer");
    byte[] destAccNumber = Encoding.ASCII.GetBytes("1234445321234321");

    // Send the three fields one after another as a plain byte stream.
    int bytesSent = socket.Send(amount);
    bytesSent += socket.Send(transferTitle);
    bytesSent += socket.Send(destAccNumber);
    socket.Close();

The above source code excerpt first initiates the connection to the banking system, then sets the amount of money to be transferred to the destination account along with the transfer title and the destination account number, and finally sends that information as a byte stream. Of course, in real life this method would be much more sophisticated, but the goal here is to show the general concept, not to write a complete business solution.

The communication solution presented above is quite straightforward and simple. It does not require any sophisticated integration software. But this solution carries hidden problems that can be very hard to track down and repair. In many books about network programming the above solution would be presented as one that enables the client (presented above) to communicate with the server regardless of the operating system and programming language the two systems are using. This is not completely true, as will be explained later.

In order to obtain the data that will be sent to the banking application, the transfer amount, transfer title and destination account number are converted to arrays of bytes. Then each of them is sent to the destination. The BitConverter class is used to convert the transfer amount to an array of bytes. The conversion made by this class uses the internal memory representation of the given data type (an integer in this case). .NET uses a 32-bit integer type, and this type is used here to convert the integer to an array of bytes. Other systems might use not a 32-bit representation but, for example, a 64-bit one. A system using a 64-bit representation will read not 32 but 64 bits from the incoming byte stream. What does this mean in the case of our example? If the destination system uses the 64-bit integer representation, it will read not only the 4 bytes of the transfer amount but also the first 4 bytes of the transfer title that follow them, and try to interpret all 8 bytes as an integer. This difference in data types would cause a different amount of money to be transferred than the user had intended! Apart from that, for the same reason the destination account number would be different from the one given by the user.
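The misreading described above can be reproduced without any network connection. The sketch below is an illustration under stated assumptions: the byte array named wire stands in for the TCP stream, and the 64-bit receiver is hypothetical. Only the amount and the title are included, since they are enough to show the effect.

```csharp
using System;
using System.Text;

// Sender side, mirroring the snippet above: a 32-bit amount followed by the ASCII title.
byte[] amountBytes = BitConverter.GetBytes(1000);
byte[] titleBytes = Encoding.ASCII.GetBytes("My Transfer");

// "wire" stands in for the byte stream that would travel over the socket.
byte[] wire = new byte[amountBytes.Length + titleBytes.Length];
Buffer.BlockCopy(amountBytes, 0, wire, 0, amountBytes.Length);
Buffer.BlockCopy(titleBytes, 0, wire, amountBytes.Length, titleBytes.Length);

// A receiver that shares the sender's assumption reads 4 bytes and gets the amount back.
int asInt32 = BitConverter.ToInt32(wire, 0);

// A hypothetical receiver that assumes a 64-bit integer reads 8 bytes,
// so the first 4 bytes of the title ("My T") become part of the number.
long asInt64 = BitConverter.ToInt64(wire, 0);

Console.WriteLine("sent 1000, read as 32-bit: " + asInt32 + ", read as 64-bit: " + asInt64);
```

The 32-bit reading returns the original amount, while the 64-bit reading yields a different number, because part of the title has been interpreted as part of the amount. The exact value depends on the byte order of the machine, so it is not shown here.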
Such behaviour is undesirable at the very least and may have disastrous effects both for the bank and the client. Moreover, the client and bank computer systems may use different formats to store numbers. One of them may use the big-endian format, which stores numbers starting with the most significant byte, while the other one may use the little-endian format, which stores numbers starting with the least significant byte. This will also cause a difference in the transfer amount!

At least two assumptions must be made about the integrated systems in order for the above solution to work properly: first, that both of them use the same data types, and second, that both of them use the same internal number storage format. However, this is not the end of the restrictions imposed by this approach. Upon closer examination of the above source code, a couple of things can be spotted. First of all, the connection information has been written directly in the source code. Any change concerning this data, like changing the destination host name or adding an alternate destination address, would require altering the source code. For those changes to take effect, the whole application would have to be recompiled and redeployed. In the case of a simple application this might not appear to be a difficulty, but when more complex, critical applications are concerned, such a way of performing changes can become a very serious
issue. It would significantly increase the cost and time needed to introduce even the simplest change to the application. The above source code should be written in such a way that changes can be made to it as efficiently as possible (efficiency in this case covers both time and cost).

Furthermore, the client-server mode assumes that both the server and the client are connected to the network at the same time. If one of the participants is currently not available, because of network problems, excessive network traffic, connection link problems, etc., then the connection cannot be established and the data cannot be exchanged.

It has already been mentioned that in the presented solution any change made to the application requires changes within the application code itself. Those changes would have to be made each time the destination of the request changed (which would involve changing the code, recompiling it and redeploying the application). The same procedure would also have to be followed if there were a need to change the number of parameters sent to the banking application. This time, however, the changes would have to take place in both applications, because the banking application needs to know in advance the exact structure of the request so that it can parse and process it correctly.

This example shows what a tightly coupled solution could look like and what assumptions must be made in order for it to work properly. To sum up, those assumptions are as follows:

1. Both the client and the host system must use the same internal number format representation and the same data types.
2. Both the client and the host application must be running and connected to the network at the same time in order to exchange data.
3. The client application must know the host location during the coding phase, and this location cannot be changed without updating the application code itself.
4. The host application must know the exact number and types of the request parameters during the coding phase in order to process and interpret them correctly. Changes can only be made by changing the host application's code itself.

As has been shown, a lot of assumptions must be made about the applications in order to make them communicate correctly. Therefore, the presented solution can be classified as tightly coupled, and it is a good illustration of the restrictions and limitations of this approach. In order to make this solution more flexible and less restricted, it should be designed as a loosely coupled one. Redesigning it to achieve that goal would mean removing the restrictions that limit the flexibility of the presented solution. The goal would be achieved if the conditions listed above no longer had to hold for the systems to be fully functional.
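The first assumption listed above can be made concrete with a few lines of code. The following is only a minimal sketch: it uses Python's struct module instead of the .NET BitConverter class (an assumption made for brevity), and the field order (amount first, then the title bytes) as well as the sample values are hypothetical.

```python
import struct

# Hypothetical request: a 32-bit transfer amount followed by the title bytes.
amount = 1000
title = b"RENT"

# Sender: 4-byte little-endian integer, as a 32-bit little-endian platform would emit it.
payload = struct.pack("<i", amount) + title

# Receiver A shares the sender's assumptions (32 bits, little-endian).
same_assumptions = struct.unpack_from("<i", payload, 0)[0]

# Receiver B assumes a 64-bit integer, so it also consumes the 4 title bytes.
wider_integer = struct.unpack_from("<q", payload, 0)[0]

# Receiver C assumes 32 bits but the big-endian byte order.
other_byte_order = struct.unpack_from(">i", payload, 0)[0]

print(same_assumptions, wider_integer, other_byte_order)
```

Only the receiver that shares both assumptions with the sender recovers the amount of 1000; the other two obtain unrelated numbers from exactly the same bytes.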


The first step towards decoupling the previous solution would be to define a platform-independent data format, which would be immune to issues connected with different internal number representations and the usage of different data types. XML can be a solution to this problem: it can be used to define the request description format, and requests would then be sent as XML documents. The destination host could parse such a document and extract all the information necessary to process the request.

In order to remove the restrictions caused by the assumptions concerning the host location and communication issues, the Message Channel design pattern (3.3) can be introduced. A Message Channel is a logical address, which both the client and the host application use to communicate. The application has to be able to connect only to the channel, not directly to the server. This resolves the location issue: there is no longer any need to know the connection details of the host application, because the channel is used to communicate. If the channel is able to store the requests in the form of a request queue, then the necessity for both systems to be connected at the same time is eliminated. Every request will be stored in the channel until the destination system fetches it. The response delivery will work the same way. Thanks to that, the systems will be able to communicate without the requirement for both of them to be on-line at the same time.

Applying the above mechanisms to the given problem would change the solution from a tightly coupled into a loosely coupled one. With those mechanisms a solution far more flexible than the previous one can be obtained, free of all the restrictions of the first approach. From now on the client and the host application can be developed simultaneously, independently of each other. Changes made in one participant would not require altering the other one.
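The two mechanisms described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this thesis: the element names (transfer, amount, title, account), the sample account number and the in-memory queue standing in for a real Message Channel are all hypothetical.

```python
import queue
import xml.etree.ElementTree as ET

# Hypothetical channel: an in-memory queue standing in for a real Message Channel.
transfer_channel = queue.Queue()

def build_request(amount, title, account):
    """Serialise the request as XML text, independent of the native integer layout."""
    root = ET.Element("transfer")
    ET.SubElement(root, "amount").text = str(amount)
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "account").text = account
    return ET.tostring(root, encoding="unicode")

def parse_request(document):
    """Extract the fields by element name; their order in the document does not matter."""
    root = ET.fromstring(document)
    return {
        "amount": int(root.findtext("amount")),
        "title": root.findtext("title"),
        "account": root.findtext("account"),
    }

# The client only knows the channel; the bank does not have to be running yet.
transfer_channel.put(build_request(1000, "Rent", "12345678"))

# Later, the bank connects to the same channel and fetches the stored request.
request = parse_request(transfer_channel.get())
print(request)
```

Because the amount travels as text and the fields are looked up by name, neither the integer width nor the byte order of either platform matters, and the sender never needs the address of the receiver.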
However, loosely coupled solutions also have disadvantages. The main drawback is that the solution becomes much more sophisticated and complicated. It takes many more lines of code to implement, and it is no longer as simple and straightforward. The process of debugging and testing also becomes more complex. It is very important to keep that in mind when choosing between the loose and the tight coupling model. The above example is a good illustration of the differences between tight and loose coupling. It shows what both of them have to offer and what their advantages and drawbacks are.

Chapter 2

Integration Styles

As mentioned before, application integration is the task of making two or more separate systems work as one combined system and share functionality. Problems usually faced while integrating applications have been briefly described in the Introduction chapter. In this chapter we will describe different integration styles that might be chosen when coping with this challenging task. Every integration problem is different, therefore the choice of an integration style must depend on the circumstances of a particular situation. In order to choose an appropriate style, the criteria on which the choice will be made must be defined. Martin Fowler [7] provides the following criteria for that purpose:

• Application integration
• Application coupling
• Integration simplicity
• Integration technology
• Data format
• Data timeliness
• Data or functionality
• Asynchronicity

The integration task at hand should be examined and analysed against those criteria. The conclusions of such an analysis can be very helpful later on: based on them, a decision can be made as to which integration style is the most suitable for the given situation. Before moving on to the description of those styles, a brief overview of each of the mentioned criteria will be given. This will give the reader a better understanding of those criteria and make it possible to apply them properly.


2.1 Application integration

Before starting to think about applying an integration solution, one must consider whether it is possible to develop a new, centralised local application instead of integrating separate, already existing systems. A local application will be easier to design and develop, with far fewer issues to consider, such as security, communication, data exchange, etc. Of course, this can only be done when implementing a new application from scratch can be afforded. As has been said before, in many cases it is not only too expensive but also impossible to recreate the system from the beginning. In such cases there is no other way than to integrate the existing systems. However, the necessity of integration should always be thought through. Integration should be undertaken only if it is necessary. If there is another, less expensive and more effective way of solving the problem, it should be considered first.

2.2 Application coupling

Application coupling has been discussed in detail earlier (1.1). The main rule to keep in mind is that, when choosing an integration solution, one should try to pick the one that achieves the greatest decoupling of the integrated systems (a sample way of measuring the level of decoupling has been described earlier). This will ensure that the systems are more flexible. Flexibility makes it possible to change an application without having to change the whole integration solution and the way the applications communicate and exchange data with each other. In the long term this feature may prove highly desirable. It will make it possible to upgrade one of the systems or enrich it with new functionality without the need to change the remaining integrated applications.

2.3 Integration simplicity

When deciding which integration style is the most appropriate, the simplicity of the integration solution should be one of the most important factors taken into consideration. It is defined not only by the complexity of the design and the technologies used, but also by the number of changes that need to be made in the integrated applications in order to integrate them into one system with shared functionality. On the other hand, simplicity should not be enforced at the cost of integration quality or the functionality of the integrated systems. The main effort should be put into making the integration process as simple as possible and limiting the changes that need to be made in the integrated applications, while at the same time maintaining the desired functionality and the overall quality of the integration solution. However, it is worth keeping in mind that sometimes it is better to choose a more complex solution in exchange for higher quality and flexibility of the final result.


2.4 Integration technology

The market of integration technologies is constantly evolving, and new products and standards are being introduced all the time. Choosing the most up-to-date technology can therefore be tempting, but it is not always the best thing to do. The costs of introducing a new technology must be considered in the first place. Products featuring state-of-the-art technologies are not always the ones that suit the needs best. What is more, deploying a product based on an elaborate technology in a company requires additional time to master it and gain experience. Time spent on training consumes additional resources and increases the overall cost of the project, and this extra time might not always be available. Also, the risk of failure rises when the development team has to use a technology it is not experienced with. Furthermore, a solution that has only just emerged can be unstable, insufficiently tested, inefficient, unreliable, etc. Therefore, introducing the newest technologies into the solution can not only significantly raise the overall cost of the integration but also, more importantly, raise the risk of integration failure. All of the above should be taken into consideration when choosing which technologies to use to perform the integration.

2.5 Data format

Integrated applications need to exchange data in order to perform their activities, and this data must be legible to all of the participating applications. To fulfil this requirement, a unified data format, readable by all of the integrated applications, must be set. This can be achieved either by formatting data to one common format within the integrated applications or by using a component (a translator), which translates data to an appropriate format and ensures that it is readable by the data receiver. The first option may be considered the worse one, because it requires more code modifications within the integrated applications, so it may conflict with the Integration simplicity criterion. When choosing the common data exchange format, it should also be taken into account that the nature and format of the data produced by the integrated applications might change over time and that those changes will affect the integration solution. This means that the transformation of this data to the common data format will also change. Because of this, an integration solution should be flexible and able to adapt to data format changes within the integrated applications.
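The translator option can be sketched as follows. The example is hypothetical: it assumes that Application A emits semicolon-separated records and that Application B reads JSON documents; the function names and the sample record are illustrative only.

```python
import json

def export_from_a():
    """Application A keeps emitting its native, semicolon-separated record."""
    return "ACME;widget;3"

def translate(record):
    """Translator: converts A's record into the JSON document B understands."""
    customer, product, quantity = record.split(";")
    return json.dumps({"customer": customer, "product": product, "quantity": int(quantity)})

def import_into_b(document):
    """Application B keeps reading its native JSON format."""
    order = json.loads(document)
    return f"{order['quantity']} x {order['product']} for {order['customer']}"

result = import_into_b(translate(export_from_a()))
print(result)  # 3 x widget for ACME
```

Neither application is modified here. If the format produced by Application A changes, only the translate function has to follow, which is the kind of flexibility the criterion asks for.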

2.6 Data timeliness

Another factor that has to be considered when choosing an integration technique is time, in particular the period between the moment data is published and the moment it is consumed by another application. This is one of the main factors influencing the overall system performance. An effort should be made to minimise the time needed to exchange data between applications. The shorter this period is, the faster the integrated application can receive data, start processing it and return the results to
the data sender. In a synchronous communication scenario, a short data exchange time is essential to prevent delays in application processing. Moreover, long delays may cause another problem: the data may become stale while it is being transferred to the destination. Processing such data can lead to errors that may have serious consequences for the business. This issue, named the data timeliness issue, is especially important for applications that deal with volatile data, which changes very frequently within short periods of time. Delays are not the only threat connected with this issue. A so-called deadlock may also occur when the sender is waiting to receive the result of the processing from the receiver, and the receiver application is out of order, because it has just crashed or for some other reason cannot currently process the sender's request. In that case the whole system is suspended and cannot perform its activities. The mentioned issues make the integration process even more complex and should be taken into consideration when choosing the most appropriate integration technique. The solution that provides the shortest latency in the given circumstances should be chosen. This should reduce the risk of communication and processing deadlocks and of errors caused by stale data.
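Both threats, stale data and a sender blocked forever, can also be guarded against explicitly in code. The sketch below is one possible approach and not a technique prescribed by this thesis: the Message class, its max_age_seconds field and the receive function are hypothetical names, and an in-memory queue stands in for the real transport.

```python
import queue
import time
from dataclasses import dataclass, field

@dataclass
class Message:
    body: str
    max_age_seconds: float
    published_at: float = field(default_factory=time.monotonic)

    def is_stale(self):
        """A message is stale once it is older than its allowed age."""
        return time.monotonic() - self.published_at > self.max_age_seconds

def receive(channel, timeout_seconds):
    """Return the next fresh message, or None if nothing usable arrives in time."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        try:
            message = channel.get(timeout=remaining)
        except queue.Empty:
            return None  # give up instead of waiting forever
        if not message.is_stale():
            return message
        # Stale message: discard it and keep looking until the deadline.

channel = queue.Queue()
channel.put(Message("price=10", max_age_seconds=0.0, published_at=time.monotonic() - 1))
channel.put(Message("price=12", max_age_seconds=60.0))

fresh = receive(channel, timeout_seconds=0.5)
nothing = receive(channel, timeout_seconds=0.1)
print(fresh.body, nothing)
```

The first message is already too old and is skipped, the second one is returned, and the final call returns None after the timeout instead of suspending the caller.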

2.7 Data or functionality

When deciding which integration technique to use, another thing should also be considered: whether the applications will share data or functionality. Sharing functionality makes an integration more complicated than just sharing data. It is more difficult to design and implement, requires different approaches and techniques than data sharing, and has a significant impact on the integration process. The difference between those two approaches is described later in this chapter, in the sections "Shared Database" (2.9.2) and "Remote Procedure Invocation" (2.9.3) of "Styles of integration".

2.8 Asynchronicity

Another issue that should be thought through while designing an integration solution is the way in which the integrated applications will communicate. There are two possibilities:

• synchronous communication
• asynchronous communication

In a synchronous communication scenario an application invokes the functionality of the remote system and waits until the remote system processes the request and returns the results. After receiving the results it continues to perform its activities [2]. In an asynchronous communication scenario an application continues to perform its activities right after sending a request and does not wait for the
results from the remote system. When the results finally arrive, the sender system is notified; it then suspends its current activities and processes the response from the remote system. An asynchronous call in which the sender does not need the results of the request at all is named "Fire and Forget" and is a design pattern [5] used in application integration.

Synchronous communication is simpler to design and implement, but it reduces the overall performance of the system. It may also cause deadlocks (see the section "Data timeliness" (2.6) above) and increase the time spent by the system in the idle state, waiting for the response from the other system. Asynchronous communication, on the other hand, is more efficient and offers greater performance at the cost of greater design and implementation complexity.

Synchronous communication is more suitable when there is no need to create a complex solution, when the messages being sent are small and the processing time is negligible. In that case the delays caused by waiting for the response will not noticeably affect the overall performance of the application. A synchronous model of communication is also necessary when an application needs to maintain the request-response interaction model in order to provide the desired functionality (e.g. web browsers, online chats, etc.).

Asynchronous communication can be used when the sender does not expect the response to arrive right away. This situation occurs very frequently in real life. For example, after filling in a visa application form, we do not expect the embassy to examine our application before we leave the building. Nor do we wait inside the embassy for the decision. Instead, we can continue with our lives. Similarly, after posting a letter at the post office, we do not expect the post office to deliver the letter before we leave.
Those analogies closely resemble the way the asynchronous model works. They show that this method of communication is very common in real life, so it should also be available in computer software. Asynchronous communication can be more efficient, especially when there is no room for delays caused by waiting for a reply to a request. Instead of waiting for the response, as a synchronous application does, an asynchronous application can continue its processing. Although this requires solving some additional issues, such as the ability to process the data received in response to a request sent earlier, which complicates the design of an asynchronous application, it may significantly improve its performance.

Moreover, asynchronous communication is more reliable than synchronous communication. Because the asynchronous model usually involves queues, which can persistently store every received message, it guarantees that no message is lost while the receiver is unavailable. Even if the receiver system is currently not operating, the queue will store all messages intended for it, and when the system is online again it will fetch all of them.

Another advantage of the asynchronous model is that it makes it possible to create systems that are more resistant to high loads. The two models differ in the way they behave during high traffic. In such a situation a synchronous application would not be able to provide a service to all clients: some of them would get an error, some would get no response at all, and eventually the whole application could become inaccessible. An asynchronous application, on the other hand, would on average take longer to process each request, because it would have a lot of requests waiting in the queue, but sooner or later each of them would be processed and no request would be left without a response. The trade-offs between the synchronous and the asynchronous model are summed up in Table 2.1.

feature                               synchronous   asynchronous
efficiency                                               +
reliability                                              +
resistance to communication errors                       +
resistance to high-loads                                 +
design & implement difficulty              +             -

Table 2.1: Trade-offs between synchronous and asynchronous model
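The difference between the two models can be sketched in a few lines. The example below is an illustration only: remote_service is a hypothetical stand-in for the remote system, and a thread pool from Python's standard library plays the role of the asynchronous transport.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_service(request):
    """Hypothetical remote system that needs some time to answer."""
    time.sleep(0.2)
    return request.upper()

# Synchronous: the caller is blocked for the whole processing time.
sync_result = remote_service("balance")

# Asynchronous: the caller submits the request and carries on working.
responses = []
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(remote_service, "balance")
    # The callback is the notification that the response has arrived.
    future.add_done_callback(lambda done: responses.append(done.result()))
    work_done_meanwhile = sum(range(1000))  # the sender is not idle here
# Leaving the with-block waits for outstanding requests to finish.

print(sync_result, responses, work_done_meanwhile)
```

Omitting the callback altogether would turn the asynchronous call into the "Fire and Forget" variant, in which the sender never looks at the result.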

2.9 Styles of integration

There are a few different styles of integration available to developers faced with the problem of integrating computer systems. They vary in complexity, difficulty of design and implementation, and so on. The criteria presented in the previous sections can be helpful in determining which of those styles to choose to solve an integration problem. Each of those styles addresses some criteria better than the others. Integration approaches can be grouped into four main types:

• File Transfer
• Shared Database
• Remote Procedure Invocation
• Messaging

Each of those techniques has been developed to handle the same task: application integration. Although the task remains the same, the approach represented by each of them is different. Each of them is more sophisticated than its predecessor (e.g. Shared Database is a more complicated solution than File Transfer). When faced with an application integration task, the point is not to use the same technique in all cases, but to be flexible and, based on the criteria described above, choose the most suitable style for the given task. More than one style can be used to achieve the best final result. As mentioned before, Messaging is the style on which this thesis concentrates, but the other styles will also be briefly described to give a wider view of the possibilities at hand.

2.9.1 File Transfer

File Transfer (Figure 2.1) is the simplest integration style. The main idea behind this technique is to use files as a data transfer mechanism between applications. Because files exist on every operating system and almost every programming language supports file operations, they are a very universal means of information exchange. Also, as using files does not require any additional integration tools and as they are already available, they might seem to be an obvious solution, but they do have a lot of disadvantages.

Figure 2.1: File Transfer style
(source: Enterprise Integration Patterns [7])

What is required in order to integrate applications using this technique is an agreement on the format of the file used to exchange data. In most cases two special components (Figure 2.1), usually designed by the integration team, are created for that purpose:

• Export - component responsible for putting the data from Application A into the file (according to the file format)
• Import - component responsible for reading and parsing the data from the file (according to the file format) and inserting that data into Application B

Apart from the file format, other arrangements must be made. A naming convention for the files has to be agreed on, so that the file names remain unique (it should be impossible for two separate files to have the same name). This is very important in order to avoid name conflicts and situations in which a file with old data is processed. Another important issue is to decide when the files will be written and read. Creating and processing such a file too often will burden an application unnecessarily. Usually some fixed time periods are set based on the business activity cycles, e.g. files can be created on a daily or weekly basis. Based on those time periods, the recipient application (Application B) checks whether there is a new file available to process. If the time period is too long, the applications can become desynchronised and errors in data processing can arise, because by the time the data in a shared file is consumed by the second application, it may have become stale, and processing it might lead to errors.
Figure 2.2: Shared Database style
(source: Enterprise Integration Patterns [7])

When a file is created and data is being written to it, a locking mechanism must be used in order to make sure that the other application does not try to access the same file. This issue is also very important and should be taken care of in order to prevent errors while reading data from a file (e.g. an unexpected end of file).

Applications using the File Transfer technique can be modified without affecting each other, because the components responsible for the export and the import are separated from the applications themselves. Also, because the applications only need to access the file containing the exchanged data, no knowledge about the internal processing performed in each of them is required (such as method names, method return types, the number and types of parameters passed to a method, and so on).

One of the main disadvantages of this technique is the fact that data is synchronised in batch mode. There might be situations in which data processed by Application B is no longer valid, but because synchronisation takes place not in real time but at fixed intervals, Application B is not aware of that fact until the next synchronisation. This rules out this type of integration in certain situations, e.g. checking the current bank account balance.
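The Export and Import components, together with the naming and locking arrangements, can be sketched as follows. This is a minimal illustration under assumptions that are not part of the thesis: the CSV format, the file name pattern and the function names are hypothetical, and the write-then-rename idiom is used here in place of an explicit file lock.

```python
import csv
import os
import tempfile
import time
import uuid

def export_records(directory, records):
    """Export: write under a temporary name, then rename, so readers never see a partial file."""
    # Timestamp plus a random suffix keeps file names unique.
    final_name = f"transfers-{time.strftime('%Y%m%dT%H%M%S')}-{uuid.uuid4().hex}.csv"
    temp_path = os.path.join(directory, final_name + ".part")
    with open(temp_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(records)
    final_path = os.path.join(directory, final_name)
    os.replace(temp_path, final_path)  # atomic on POSIX when both paths share a file system
    return final_path

def import_records(directory):
    """Import: read every completed file, then remove it so it is not processed twice."""
    imported = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".csv"):
            continue  # skip files that are still being written
        path = os.path.join(directory, name)
        with open(path, newline="", encoding="utf-8") as handle:
            imported.extend(csv.reader(handle))
        os.remove(path)
    return imported

with tempfile.TemporaryDirectory() as shared_directory:
    export_records(shared_directory, [["1000", "Rent", "12345678"]])
    first_run = import_records(shared_directory)
    second_run = import_records(shared_directory)

print(first_run, second_run)
```

The second import finds nothing to do, which shows how removing processed files avoids handling old data again. The batch nature of the style remains: Application B sees new data only when its import runs.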

2.9.2 Shared Database

Shared Database (Figure 2.2) is another integration style. It is a more complex and sophisticated approach than File Transfer. The idea of this approach is to use a central data store that all applications share and can access at any time. This approach overcomes the main drawback of the File Transfer style, i.e. the lack of timeliness (data is not available at the proper time because it is exchanged in batch mode). As explained before, files with data are created repeatedly at fixed intervals, so that solution is not applicable
in situations where data has to be propagated to all applications in real time. The Shared Database style does not have this disadvantage. Moreover, when each application changes shared data very frequently (a couple of times per second), the usage of File Transfer would be very inefficient and would lead to many problems. If the integrated applications use a shared database, then even frequent data changes are not an issue, because data changed by one system is instantly available to the others. The database engine is used to handle transaction issues in order to prevent data inconsistencies and, as far as possible, deadlocks. When an error in data appears, it is easier to roll back a transaction than to return the application to the state prior to the processing of a file in the File Transfer integration style. Moreover, small but frequent changes make it much easier to fix possible errors without losing too much of the processed information than one huge daily or weekly update.

Almost every programming language provides the means to work with a relational database using SQL queries. Also, there are a lot of tools available on the market that make working with databases simple and effective.

The above arguments can be considered advantages of using a Shared Database, but there are also disadvantages of this solution. When a single shared database is used, all of the integrated applications have to produce data compatible with the database schema. In order to achieve that, they must be modified. Designing the database schema is, in this case, the most difficult part of the whole integration. All participating departments, companies, etc. have to agree on one common schema. This can be a very challenging task, especially when each of the participants wants to keep some parts of their own schema, the one they are used to. After the schema is set, the existing data must be transformed, without loss of any information, and loaded into the new database.
Data transformation can also be a very complex process, because in some cases the existing data has to be heavily altered before it becomes compatible with the new schema. Moreover, each of the participants has to have very good knowledge of the data model used by his/her system. As experience shows, this is not always the case: some systems might have been created many years ago, some might lack documentation, and so on. To sum up, changes within applications can be made without affecting the integration solution as long as the output data of the application remains compatible with the database schema.

Apart from the advantages that come from overcoming the drawbacks of the previous approach, this integration style also has some further disadvantages that are worth mentioning. The usage of one shared database by multiple systems can cause serious performance issues or even lead to deadlocks. When two or more applications try to modify the same data simultaneously, each of them places locks on the data it accesses, which prevents the other applications from modifying that data at the same time; if the applications end up waiting for each other's locks, a deadlock occurs. Moreover, having one database used by all the applications introduces a single point of failure. If the database is not operating, none of the applications will be able to perform their work.
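The transactional behaviour that distinguishes this style from File Transfer can be sketched with an in-memory SQLite database. The schema, the account identifiers and the transfer function below are hypothetical and serve only as an illustration; a real shared database would be a server accessed by several applications.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
connection.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
connection.commit()

def transfer(db, source, target, amount):
    """Move money between accounts; either both updates are stored or neither is."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target))
            db.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = transfer(connection, "A", "B", 60)
rejected = transfer(connection, "A", "B", 60)  # would overdraw A, so it is rolled back
balances = dict(connection.execute("SELECT id, balance FROM account"))
print(accepted, rejected, balances)
```

The second transfer fails on its second statement, and the rollback also undoes the first one, so every application reading the shared table continues to see a consistent state.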


Figure 2.3: Remote Procedure Invocation style
(source: Enterprise Integration Patterns [7])

2.9.3 Remote Procedure Invocation

Remote Procedure Invocation (Figure 2.3) represents a still more sophisticated approach to the integration problem than the two previous styles. As in the case of Shared Database, the increased level of sophistication makes it possible to overcome the drawbacks of the previous approaches. Remote Procedure Invocation is a mechanism that allows one application to invoke a method in the context of another, remote, application. All of the required information is passed along with the invocation of the appropriate method, and the remote party returns the result after processing the invocation [8]. The result might be as simple as a Boolean value indicating whether the operation was successful, or as complex as a data structure containing information about a customer.

In this approach, when one application needs data owned by another application, it makes a direct call to that application. If data needs to be modified, this is also done by a direct call to the application that owns the data. Each application manages its own data. No data is duplicated in multiple systems, as it is in the case of File Transfer. Also, there is no need to change the integrated applications when the data managed by one of them changes. If a new way of processing data is needed, then a new remote method has to be implemented. However, this approach requires the integration team to agree in advance on the names of the methods available for remote invocation, the kind of data passed with an invocation, the information returned and so on. Prior to invoking a remote method, some knowledge about the other party is required:
• the number and types of the arguments
• the type of the result
• how error situations are handled: is an exception thrown, or is some negative value returned?

Figure 2.4: Messaging style
(source: Enterprise Integration Patterns [7])

There are many existing implementations of Remote Procedure Invocation, to name just a few: CORBA, DCOM, .NET Remoting and Java RMI. As mentioned in the earlier part of this thesis covering loose coupling (1.1), those implementations use the same syntax and semantics as local method calls. This makes it easier for developers to use those solutions, but the similarity can become an issue if the differences between remote and local calls are not understood. Without this understanding, the implemented solutions can be slow and unreliable [10].

Remote Procedure Invocation can also cause another problem: it tightens the coupling between the integrated applications. Although they do not share common data storage (as in the case of File Transfer or Shared Database), the methods called remotely cannot be changed without affecting the integrated applications. For example, the number of parameters or the return type of a remote method cannot be changed without modifying the systems that call it.
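The coupling described above can be illustrated with a small sketch. It is written in Python purely for illustration and does not use CORBA, DCOM, .NET Remoting or Java RMI; the names `CustomerService`, `RemoteStub` and `make_transport` are hypothetical, and the "network" is simulated by a function call that exchanges JSON text. The sketch assumes that the two parties have agreed to report errors as an `error` field in the response, which the stub turns into an exception.

```python
import json


class CustomerService:
    """Hypothetical remote application that owns the customer data."""

    def __init__(self):
        self._customers = {7: {"name": "Anna Nowak", "city": "Warsaw"}}

    def get_customer(self, customer_id):
        return self._customers[customer_id]


class RemoteStub:
    """Caller-side proxy: the call looks local, but it is marshalled and sent."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        def invoke(*args):
            # Marshal the method name and arguments into a text request.
            request = json.dumps({"method": method_name, "args": args})
            response = json.loads(self._transport(request))
            # Agreed error convention (an assumption of this sketch).
            if "error" in response:
                raise RuntimeError(response["error"])
            return response["result"]
        return invoke


def make_transport(service):
    """Stands in for the network; a real transport could be slow or fail."""

    def transport(request_text):
        request = json.loads(request_text)
        try:
            method = getattr(service, request["method"])
            return json.dumps({"result": method(*request["args"])})
        except Exception as exc:
            return json.dumps({"error": repr(exc)})

    return transport
```

A call such as `RemoteStub(make_transport(CustomerService())).get_customer(7)` reads exactly like a local call, yet a caller that passes a different number of arguments than the remote method expects only finds out at run time, which is the kind of coupling discussed above.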

2.9.4 Messaging

Messaging (Figure 2.4) is considered to be the integration style that makes the fewest assumptions about the other parties, and hence the most promising one for the integration task. At the same time, it is the most sophisticated technique that can be used to solve the integration problem. It can be said that this approach combines the features of the previous styles. Just like File Transfer, it allows the applications to be loosely coupled (sent messages can be transformed to comply with the format expected by the receiver, without the sender or the receiver being aware of the transformation), but it is free of File Transfer's weakness: a high frequency of changes does not cause the integrated applications to fall out of synchronisation, nor does it cause one of them to process stale data.


Messaging enables quicker data exchange and collaboration between the integrated applications. In contrast to the Shared Database approach, it does not couple the applications to one database. Shared Database also does not cope well with very frequent data changes, especially if the data is shared between applications placed in different locations, while Messaging is free of this problem. The use of Remote Procedure Invocation forces many assumptions to be made about the applications and, as a result, couples them tightly. What is more, the semantics and syntax of those invocations can be misleading, causing the developer to think about remote invocations in the same way as he/she thinks about local ones. That way of thinking may lead to slow and ineffective solutions.

Messaging provides the means to transfer data in a quick and efficient way (a large number of small data units), with the receiving application being notified automatically when more data is waiting to be processed. Messaging also provides a retry mechanism in order to assure the delivery of the sent data. Applications integrated using this technique do not need to use the same unified data structure and are not forced to make as many assumptions about each other as in the case of Remote Procedure Invocation. Messaging also offers asynchronous data transfer, which means that the sender does not have to wait for the results in order to continue its processing. Nor does it require both systems to be operational at the same time in order to pass data from the sender to the receiver. More about the asynchronous method of communication can be found in the earlier section "Asynchronicity" (2.8).

Chapter 3

Messaging based systems

Messaging was one of the integration techniques briefly described in the previous chapter. In this chapter that description is broadened and made more detailed. As mentioned before, messaging is the most sophisticated integration style: in exchange for higher complexity it provides a high degree of decoupling, asynchronous communication between the integrated applications, and other features that make it the most flexible of all the solutions described in the previous chapter. The previous chapter covered messaging in comparison to the three other integration techniques. This chapter describes the concept of messaging, the key terms connected with this topic and the mechanism by which messaging based solutions work. Before going deeper into the description of this technique, an understanding of the basic messaging terms and concepts is needed: channel, message, routing, transformation, endpoint, and synchronous and asynchronous communication.

3.1 Message

In order to transmit data, it must first be marshalled by the sender into a byte form and then unmarshalled by the receiver, so that the receiver has its own local copy of it. During the transmission the data is wrapped into a Message (Figure 3.1). Each Message forms an indivisible entity; it cannot be split into parts. It is the data record that can be transmitted and read by the messaging system. In order to communicate, the sender's application must transform the data being transmitted into one or more messages and then send those messages to the receiver. The receiver gathers these messages, extracts the data from them, merges it if the data has been split across more than one Message, and finally processes it. Messaging solutions guarantee delivery of the message to the receiver (the message can be transmitted repeatedly from the sender to the receiver until the transmission succeeds).

Figure 3.1: Design Pattern Message
(source: Enterprise Integration Patterns [7])

A message is the smallest indivisible portion of data exchanged between integrated applications. It consists of two parts:
• Header: contains information used by the messaging system, describing the data being transmitted, the sender, the receiver and so on
• Body: contains the data being transmitted; usually this part of the message is treated as a black box by the messaging system and passed between the sender and the receiver as it is

Moreover, the message might contain a special, separate section called Properties, which contains a list of key-value pairs defined by the sender of the message.

The messaging system itself does not differentiate between the types of messages being sent. It is the programmer who chooses among different types of messages, according to their purpose. Those types are as follows:
• Command Message: used to invoke a procedure on the receiver's machine
• Document Message: used to pass a set of data to the receiver's machine
• Event Message: used to notify the receiver about some event that has occurred on the sender's machine
• Request-Reply Message: used to send a message that requires a response from the receiver
• Message Sequence: used to send data using multiple messages

The concept of sending a stream of data divided into discrete parts is not used only in messaging systems. It is also applied in network protocols, where data is grouped into discrete units, for example datagrams/packets in the case of the Internet Protocol (IP) and segments in the case of the Transmission Control Protocol (TCP).
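The message structure and the Message Sequence idea described above can be sketched as follows. This is a minimal Python illustration that is not tied to any messaging product; the class `Message`, the functions `split_into_sequence` and `merge_sequence`, and the header keys `sequence_id`, `position` and `size` are assumptions introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """An indivisible unit of data: header, body and optional properties."""
    header: dict                                    # read by the messaging system
    body: bytes                                     # treated as a black box
    properties: dict = field(default_factory=dict)  # sender-defined key-value pairs


def split_into_sequence(data: bytes, chunk_size: int, sequence_id: str) -> list:
    """Sender side: wrap the data into one or more messages."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    return [
        Message(
            header={"sequence_id": sequence_id, "position": i, "size": len(chunks)},
            body=chunk,
        )
        for i, chunk in enumerate(chunks)
    ]


def merge_sequence(messages: list) -> bytes:
    """Receiver side: put the messages in order and join their bodies."""
    ordered = sorted(messages, key=lambda m: m.header["position"])
    if len(ordered) != ordered[0].header["size"]:
        raise ValueError("message sequence is incomplete")
    return b"".join(m.body for m in ordered)
```

For instance, `merge_sequence(split_into_sequence(b"customer-record-0042", 8, "seq-1"))` returns the original bytes even if the three messages arrive in a different order, because each header carries the position of its message in the sequence.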


3.2 Message Implementations

The concept of a Message is used in different implementations of integration solutions: Java JMS, .NET Messaging and SOAP. Here is some brief information about each of those solutions:

1. Java JMS
• A message is represented by the Message interface.
• A message consists of a header, properties and a body.
• There are different types of messages depending on the message body; the header remains the same for all types:
  – Text Message: the message body contains a String object, which might be the content of a text or XML file, or just a piece of text. This is the most common type of message. To get the message content, the method textMessage.getText() is provided.
  – Bytes Message: the message body contains a simple array of bytes. This is the simplest and the most universal message type. To get the message content, the method bytesMessage.readBytes(array) is provided; it copies the content of the message into the array of bytes passed as an argument.
  – Object Message: the message body contains a Java object that implements the java.io.Serializable interface, so that it can be marshalled and unmarshalled. The method objectMessage.getObject() returns the serializable object containing the message's data.
  – Stream Message: the message body contains a stream of Java primitives. In order to read data from the message body, methods such as readBoolean(), readChar(), readDouble(), etc. are provided.
  – Map Message: the message body contains a list of key-value pairs, just like java.util.Map with String objects used as keys. To get the value of a key, the method getTYPE(KEY) is provided, where TYPE is the value's type and KEY is the name of the key, e.g. getBoolean("isEnabled"), getInt("numberOfItems").

2. .NET Messaging
• A message is represented by the class Message.
• The Message class has the following properties:
  – Body: contains an Object that represents the message content
  – BodyStream: stores the content of the message as a stream
  – BodyType: specifies the type of data being sent in the message body (string, date, number, currency, etc.)

3. SOAP
• A message is represented as an XML document that contains an optional header and a required body.
• An XML document is an atomic data record that can be transmitted.
• SOAP messages can be transmitted using messaging systems (in that case the message contains the SOAP message as its body).

Figure 3.2: Design Pattern Channel
(source: Enterprise Integration Patterns [7])
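The SOAP message layout described above (an XML document with an optional header and a required body) can be sketched with Python's standard XML library. The function `build_envelope` and the `order` payload are hypothetical and serve only as an illustration; the namespace is the SOAP 1.1 envelope namespace.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_envelope(body_element, header_element=None):
    """Return a SOAP envelope as text; the Header is added only if given."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_element is not None:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.append(header_element)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")  # the Body is required
    body.append(body_element)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical payload, used here only to have something to wrap.
order = ET.Element("order")
order.set("id", "42")
envelope_text = build_envelope(order)
```

The resulting text could then be carried as the body of, for example, a text message in a messaging system, which corresponds to the last point in the list above.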

3.3 Message Channel

The Message Channel (Figure 3.2) is used to transmit Messages between applications. It is a logical address of the message destination in a messaging system. It can be imagined as a pipe connecting the sender and the receiver, inside which messages flow. Channels must be defined by the integration team and added to the messaging system. A newly deployed messaging system does not contain any channels; they must be created so that applications can communicate using them. Message channels offer a very useful piece of functionality: if the receiver is currently not available, the message channel stores the messages until the receiver is up again and fetches them. This feature eliminates the need for both the sender and the receiver to be on-line at the same time in order to communicate.

The sending application knows what kind of information it is sending and, based on this knowledge, it also knows which channel this information should be sent to. It does not have to know which application needs this information. It is sufficient to know that placing the information in a given channel assures that it will be delivered to the application that needs it.

The way the channels are implemented varies among different products, but in order to simplify the process of integrating applications, every messaging system provides an Application Programming Interface (API). The API contains methods for sending and receiving messages, which hide the details of the communication with the messaging system. The application does not have to know how the connection to the messaging system is set up, how it is reinitialized in case of a communication error, how the message is converted into a stream of bytes and so on.

Figure 3.3: Design Pattern Router
(source: Enterprise Integration Patterns [7])

There are two types of channels:
• Point-to-Point Channel: directly connects two applications. Data sent through this channel by the sender is available only to the receiver. Once the receiver fetches the message from the channel, the message is deleted and no longer available.
• Publish-Subscribe Channel: allows the sender to send messages through the channel to more than one receiver (subscriber). The sender sends data into the channel; then, independently of the sender, each of the subscribers either periodically checks whether any new messages are pending in the channel or is notified automatically about new messages by the messaging system. The subscriber then fetches them from the channel.

The above division is based on the way the messages are distributed from the sender to the receiver. Another division is based on the purpose of the message channel:
• Datatype Channel: used to avoid confusion when different data types are mixed within the same channel. Each type of data being sent is assigned a different Datatype Channel.
• Invalid Message Channel: used to send error messages to the sender and to provide feedback from the receiver in case of data errors or other failures.
• Dead Letter Channel (Dead Message Queue): used by the messaging system when a message cannot be delivered to the receiver.
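The difference between the two distribution models described above can be sketched with simple in-memory channels. This is only an illustration in Python under the assumption that everything runs in one process; the classes `PointToPointChannel` and `PublishSubscribeChannel` and their methods are hypothetical and do not correspond to the API of any particular messaging system.

```python
from collections import deque


class PointToPointChannel:
    """Each message is fetched by exactly one receiver and then removed."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        # The message is stored until the receiver is ready to fetch it.
        self._queue.append(message)

    def receive(self):
        return self._queue.popleft() if self._queue else None


class PublishSubscribeChannel:
    """Every subscriber gets its own copy of each message sent."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        self._subscribers[name] = deque()

    def send(self, message):
        for queue in self._subscribers.values():
            queue.append(message)

    def receive(self, name):
        queue = self._subscribers[name]
        return queue.popleft() if queue else None
```

In this sketch the subscribers poll the channel with `receive`; the notification variant mentioned above could be modelled by storing a callback per subscriber instead of a queue.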

3.4 Message Routing

In the simplest integration solution, systems are connected directly through Message Channels. Those connections are straight links from the sender to the receiver. A Message Channel decouples the sender from the receiver of the message. Because of this, it is possible to have more than one application sending messages using the same Message Channel.

Quite often it is necessary to perform some processing of the sent message before it is directed to its destination. Messages sent by a single sender may require different processing while being sent through the Message Channel. Different processing can be required depending on the message origin, business rules, message type or some other criteria. In order to assure this, each Filter component connected to the channel has to know those rules. However, if the rules change, then all of the components within the Message Channel also have to be changed so that they use the updated rules. This would make any change to an existing solution very costly in terms of both time and performance. Very often the components that would be used to determine the further processing of the message cannot be changed, because doing so would be too expensive, too time consuming or even impossible. Moreover, in order to determine the further processing of the message (e.g. to state whether the message is destined for this component or not, using business rules based on the message content), the component has to fetch the message from the Message Channel. But after the message has been consumed, it cannot simply be put back into the channel as it was before, because the messaging system does not allow that.

In order to solve the problem of redirecting a message depending on a set of conditions, without involving all the components participating in the message processing, a new type of component has been introduced into messaging solutions. This component is called a Message Router (Figure 3.3). The role of a router is to decide where a particular message should be delivered, based on a set of defined business rules. Other components using the messaging system are not aware of the router's existence, because it does not change the message content; it only redirects messages to the proper channel.
If the need to change the decision rules arises, then only the router component has to be changed; the other components remain unchanged. A router is a single point where the decision concerning the further path of a message is made. Therefore, in case of heavy traffic the routing component might become a system bottleneck, but the likelihood of such a situation can be significantly decreased by using several parallel routing components or by improving the hardware used to run the system.

The Message Router needs to know the full list of possible message recipients, along with the rules that govern the routing process. An alternative solution, which can be used when the list of final recipients changes frequently, is to let each recipient decide whether to fetch the message from the queue or not. This alternative solution can be built using Publish-Subscribe channels and Message Filters; it is called reactive filtering, while using a routing component is called proactive routing.

There are a few possible variants of a Message Router that can be used in an integration solution:
• Fixed Router: this is the simplest variant. The router has one input and one output channel defined. It does not perform routing as such, but is used to decouple systems or to pass messages between different integration solutions. Most often this type of router is combined with a Message Translator or a Message Adapter in order to pass messages between different integration solutions or different types of message channels.
• Content-Based Router: this type of router uses the properties of the message, such as the type of the message or the values of specified message fields, to determine the message destination. It is the most commonly used router type.
• Context-Based Router: this type of router uses information about the surrounding environment to determine the message destination. Such routers can be used to perform load balancing or to change the message destination if the original recipient is not responding. Context-Based Routers can be used to increase the flexibility and reliability of the system in case of unexpected errors.

Figure 3.4: Design Pattern Translator
(source: Enterprise Integration Patterns [7])

Routers can also be divided into two other groups: stateless and stateful. A stateless router considers only the message that it has just received and makes the routing decision based on that single, current, message. A stateful router, on the other hand, also takes previous messages into account when determining the destination of an incoming message. This feature might be used, for example, to remove duplicated messages.
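The distinction between a stateless Content-Based Router and a stateful router can be sketched as follows. The sketch is a Python illustration only; the class names `ContentBasedRouter` and `DeduplicatingRouter`, the channel names and the assumption that every message is a dictionary with `id` and `type` keys are all hypothetical.

```python
class ContentBasedRouter:
    """Stateless: the decision depends only on the current message."""

    def __init__(self, rules, default_channel):
        self.rules = rules                      # list of (predicate, channel name)
        self.default_channel = default_channel

    def route(self, message):
        for predicate, channel_name in self.rules:
            if predicate(message):
                return channel_name
        return self.default_channel


class DeduplicatingRouter:
    """Stateful: remembers the identifiers of messages already seen."""

    def __init__(self, inner_router, duplicate_channel="duplicates"):
        self.inner_router = inner_router
        self.duplicate_channel = duplicate_channel
        self._seen_ids = set()

    def route(self, message):
        if message["id"] in self._seen_ids:
            return self.duplicate_channel       # divert the repeated message
        self._seen_ids.add(message["id"])
        return self.inner_router.route(message)


# Routing rules live in one place, so changing them touches only the router.
router = ContentBasedRouter(
    rules=[
        (lambda m: m["type"] == "order", "orders"),
        (lambda m: m["type"] == "invoice", "invoices"),
    ],
    default_channel="unroutable",
)
dedup = DeduplicatingRouter(router)
```

Neither router modifies the message; each only returns the name of the channel to which the message should be forwarded, which is why the other components do not need to know that a router exists.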

3.5 Message Transformation

The concepts covered so far in this chapter concerned the way a message is sent from one system to another. What has been omitted is the fact that, in order to communicate, the content of the message has to be properly understood by the receiver. The message data has to be properly interpreted and used to perform the necessary operations; to ensure that, it must be delivered to the receiver's application in the correct format. In the ideal solution both the sender and the receiver would use the same data format, but this situation is very rare. In most cases both the sender and the receiver would have to be modified so that they use the same data format. This approach raises many problems. Which data format should be used, the one used by the sender or the one used by the receiver? What kind of selection criteria should be used to make that choice? Moreover, making internal changes to the integrated applications can be very difficult or, in some cases, even impossible. It may also cause changes to the internal business logic of the application, which is undesirable, because the integrated applications should be affected by the integration process as little as possible. Making such changes would also contradict the idea of loose coupling described earlier (1.1). After implementing that kind of change in both applications, they would no longer be loosely coupled. A change of data format in one of them would have to be reflected immediately in the other one, otherwise the integration solution would not work as intended.

The simplest way to ensure that the data format of the arriving message corresponds to the internal data format of the receiver's application is to use a separate component, which changes the message body to the appropriate format. This component is called a Message Translator or a Message Transformer (Figure 3.4). Using this component makes it possible to preserve the loose coupling between the applications. If the internal data format of any of the integrated applications changes, only the component performing the transformation has to be changed; the applications remain unaffected. This way they do not depend on each other, and changes made in one of them do not force changes in the others.

The transformation process itself can take place on many different levels of data representation. It may refer to the names of the data fields, the data representation in those fields, the data structure as a whole (different ways of representing the data) and so on. Hohpe and Woolf [7] divide data transformation into levels and organise them in a form similar to the ISO/OSI model. This division is presented in Table 3.1.
| Layer | Deals With | Transformation Needs (Example) | Tools/Techniques |
|---|---|---|---|
| Data Structures (Application Layer) | Entities, associations, cardinality | Condense many-to-many relationship into aggregation. | Structural mapping patterns, custom code |
| Data Types | Field names, data types, value domains, constraints, code values | Convert ZIP code from numeric to string. Concatenate First Name and Last Name fields to single Name field. Replace U.S. state name with two-character code. | EAI visual transformation editors, XSL, database lookups, custom code |
| Data Representation | Data formats (XML, name-value pairs, fixed-length data fields, EAI vendor formats, etc.), character sets (ASCII, UniCode, EBCDIC), encryption/compression | Parse data representation and render in a different format. Decrypt/encrypt as necessary. | XML parsers, EAI parser/renderer tools, custom APIs |
| Transport | Communications protocols: TCP/IP sockets, HTTP, SOAP, JMS, TIBCO RendezVous | Move data across protocols without affecting message content. | Channel Adapter, EAI adapters |

Table 3.1: Levels of data transformation

As presented in the above table, the levels of transformation are divided into four layers:
• Transport: transformations performed in the scope of the communication protocols (Transport Layer) enable data transfer between systems that use different communication protocols and ensure reliable message transfer between those systems.
• Data Representation: the Data Representation layer performs the transformation concerning the representation of the data. A transformation within the Transport layer operates on a stream of bytes, while a Data Representation transformation operates on the data representation (e.g. it changes an XML representation into a name-value representation).
• Data Types: the Data Types layer performs the conversion of the data contained in the message. The conversions include changing field names, changing the data types of those fields, combining data from multiple fields into one, splitting data from one field into two and so on. The goal of this transformation is to make the data comply with the data model of the receiver's application.
• Data Structures (Application Layer): finally, transformations within the Application Layer concern the entities used in the application data model and the relations between them (e.g. can a customer have multiple bank accounts? Can a bank account have multiple owners?).

As each transformation can be performed in a separate component, the transformations themselves can be chained. Chaining transformations makes it possible to perform complex changes to the transmitted data on different levels. It also enables the designers of an integration solution to combine different transformations into transformation chains in order to achieve the desired effect in the most efficient way.
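Chaining of transformations can be sketched as a composition of small functions, each working on one level from Table 3.1. The sketch below is a Python illustration; the function names, the field names (`first_name`, `last_name`, `zip`, `name`) and the use of a dictionary as the message body are assumptions made only for this example, while the conversions themselves follow the examples given in the table.

```python
def zip_to_string(record):
    """Data Types level: a numeric ZIP code becomes a 5-character string."""
    result = dict(record)
    result["zip"] = f"{record['zip']:05d}"
    return result


def join_name(record):
    """Data Types level: First Name and Last Name become one Name field."""
    result = {k: v for k, v in record.items() if k not in ("first_name", "last_name")}
    result["name"] = f"{record['first_name']} {record['last_name']}"
    return result


def to_name_value_pairs(record):
    """Data Representation level: render the record as name=value lines."""
    return "\n".join(f"{key}={record[key]}" for key in sorted(record))


def chain(*translators):
    """Combine separate translators into one transformation chain."""
    def translate(message_body):
        for translator in translators:
            message_body = translator(message_body)
        return message_body
    return translate


translate = chain(zip_to_string, join_name, to_name_value_pairs)
output = translate({"first_name": "Jan", "last_name": "Kowalski", "zip": 2138})
```

Because each step is a separate component, a change in the receiver's format only requires replacing or reordering one link of the chain, while the sender and the receiver stay untouched.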


Figure 3.5: Design Pattern Endpoint
(source: Enterprise Integration Patterns [7])

3.6 Message Endpoints

Up to this point the main messaging concepts concerning message transportation have been covered. They include the way data is sent between applications using the messaging system, the basic units of exchanged data, how the messaging system copes with sending them to the correct destination system and, finally, how the data is transformed so that it fits the internal data model of the destination system. The messaging system is, in most cases, a separate application responsible for dealing with all the issues mentioned above, but not only with them. The systems that are supposed to be integrated using the messaging application are also separate business entities. It must be taken into account that the integrated applications are in most cases complex systems, often several years old or more, and usually older than the messaging system itself. Quite often they do not have the means to communicate with the messaging system: to send messages to the message channels, to fetch incoming messages, to create messages and so on. They also cannot be modified to gain those abilities, because the modification might be too expensive or impossible for technical or other reasons.

Thus, in order to make the messaging solution work, one more component is needed. The component that connects the application with the messaging system is called a Message Endpoint (Figure 3.5). It is a custom-made component that performs all the operations mentioned above and enables the integrated application to cooperate effectively with the messaging system. It is responsible for creating messages, sending them to the message channels, fetching incoming messages, wrapping data from the application into the appropriate message format, and unwrapping incoming messages to extract the data and pass it to the application.
The Message Endpoint hides the messaging system's Application Programming Interface (API) from the integrated applications, which are not aware of all the operations performed when they request that data be sent to another application. Because the endpoint is written to work with a particular application and a particular messaging system, it has to be rewritten if it is to cooperate with an application or a messaging system other than the one it was designed for. One instance of the Endpoint can either send messages to a channel or receive messages from it; it cannot do both at the same time. This implies that an application can have several Message Endpoints attached to it.


Figure 3.6: Overview of a communication based on message design patterns
(source: Enterprise Integration Patterns [7])

Figure 3.6 summarises all of the terms explained so far. The application on the left side of the figure (Application A) wants to send a Message to the application on the right side (Application B). The steps of that communication look as follows:
1. Application A sends a Message using its Message Endpoint.
2. The Message is placed in the Message Channel.
3. The Message is directed to the Router, which decides where the message should be delivered (let us assume that the message is delivered as presented in the figure).
4. The Message is directed to the Translator, which changes the format of the message's content.
5. Finally, the Message is delivered to Application B, which fetches it using its Message Endpoint.
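The five steps above can be traced in a small end-to-end sketch. It is an in-memory Python illustration and not the integration platform described later in this thesis; the classes `Channel`, `SendingEndpoint` and `ReceivingEndpoint`, the functions `run_router` and `run_translator`, the JSON body format and the field names are all assumptions introduced for this example.

```python
import json
from collections import deque


class Channel:
    """Minimal in-memory Message Channel."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


class SendingEndpoint:
    """Steps 1-2: wraps application data into a message and sends it."""

    def __init__(self, channel, sender):
        self.channel = channel
        self.sender = sender

    def send(self, message_type, data):
        header = {"sender": self.sender, "type": message_type}
        self.channel.send({"header": header, "body": json.dumps(data)})


class ReceivingEndpoint:
    """Step 5: fetches a message and hands its data to the application."""

    def __init__(self, channel):
        self.channel = channel

    def receive(self):
        message = self.channel.receive()
        return None if message is None else message["body"]


def run_router(source, routes, default):
    """Step 3: forward each message to the channel chosen by its type."""
    while (message := source.receive()) is not None:
        routes.get(message["header"]["type"], default).send(message)


def run_translator(source, target):
    """Step 4: convert the JSON body into Application B's (assumed) fields."""
    while (message := source.receive()) is not None:
        data = json.loads(message["body"])
        translated = {"customer_name": data["name"], "total": data["amount"]}
        target.send({"header": message["header"], "body": translated})


outbound, orders, invalid, inbound_b = Channel(), Channel(), Channel(), Channel()
SendingEndpoint(outbound, "Application A").send("order", {"name": "Jan", "amount": 10})
run_router(outbound, {"order": orders}, invalid)
run_translator(orders, inbound_b)
received = ReceivingEndpoint(inbound_b).receive()
```

Application A never refers to Application B directly: it only knows its own endpoint, and the router and the translator can be changed without touching either application.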

3.7 Synchronous and asynchronous communication

Integration solutions can perform their activities in two possible ways. The term "ways" can be interpreted in this particular case as the overall mechanism by which the system operates. Those mechanisms are of two kinds, synchronous and asynchronous: two ways in which integrated systems can communicate with each other using the messaging system. Those concepts have already been mentioned in the part covering the integration criteria (2.8), but now we will take a more detailed look at them.

The reliability of communication is a very important issue in both the synchronous and the asynchronous approach. Some communication errors are beyond the scope of the messaging solution (e.g. physical media errors, hardware failures, etc.), but they might prevent the messaging system from delivering the message to the receiver. Of course, the sender can resend the message after some period of time to increase the chance that the receiver will finally receive it, but for some types of errors this does not assure a successful delivery. In the synchronous approach this drawback can also lead to a loss of application performance and processing speed, because the application has to wait until it receives a response instead of performing its usual activities. When one application accepts requests from several clients, those drops in performance and processing speed can become even greater and may lead to deadlocks.

The main advantage of the synchronous approach is its simplicity. The application does not have to use additional resources to monitor the whole processing, as it does in the asynchronous approach. This simplified approach is sufficient for systems that do not send sophisticated requests requiring a lot of processing from the message receiver. A short processing time of requests reduces the time the sender has to wait for an answer and, as a result, does not affect the overall performance of the senders in a noticeable way.

Asynchronous communication, in contrast to synchronous communication, is a much more complicated concept. The main idea behind it is that, after sending a request, the application does not wait for a response but continues its processing. It requires a different approach to design and implementation. An application using the asynchronous model cannot be designed as a sequence of method invocations. It has to be designed in such a way that the remote functionality is invoked without affecting the main application flow. This enforces a different kind of application design than in the case of synchronous communication. A possible scenario might look as follows:
1. The sender application (Application A) sends a message (containing a message identifier) addressed to the receiver application (Application B).
2. If the receipt of the message has been confirmed by the messaging system, Application A stores information about the sent message, along with the identifier assigned to it, in the database and switches to performing other operations.
3. Application B is notified by the messaging system about a new message and fetches it.
4. After processing the received message, Application B sends a new message with the response to Application A's request (the response includes the message identifier sent in the request).
5. Application A receives the new message with the response to its previously sent request.
6. Application A looks up in the database the request identified by the message identifier contained in the received message.
7. After gathering all the required information, Application A starts its processing.
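The scenario above can be sketched with two in-memory queues and a dictionary that stands in for Application A's database. This is a single-process Python illustration under simplifying assumptions (no real messaging system, no delivery confirmation, no concurrency); the function names, the `correlation_id` key and the upper-casing "processing" done by Application B are hypothetical.

```python
import uuid
from collections import deque

requests = deque()   # channel from Application A to Application B
replies = deque()    # channel from Application B back to Application A
pending = {}         # stands in for Application A's database of sent requests


def app_a_send(payload):
    """Steps 1-2: send the request, remember it and carry on with other work."""
    message_id = str(uuid.uuid4())
    requests.append({"id": message_id, "body": payload})
    pending[message_id] = payload
    return message_id


def app_b_process():
    """Steps 3-4: fetch a request and reply with the same identifier."""
    request = requests.popleft()
    result = request["body"].upper()          # placeholder for real processing
    replies.append({"correlation_id": request["id"], "body": result})


def app_a_handle_reply():
    """Steps 5-7: match the reply with the stored request and process it."""
    reply = replies.popleft()
    original = pending.pop(reply["correlation_id"])
    return original, reply["body"]
```

Application A can call `app_a_send` several times before any reply arrives; the identifier carried in each reply is what allows it to match responses to the requests stored earlier, regardless of how much other work it did in the meantime.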


Despite being more difficult to design and implement, this approach, as stated in the section "Asynchronicity" of the chapter "Integration Styles" (2.8), has many advantages over the synchronous one, to name just a few: reliability, efficiency, resistance to communication errors and resistance to high loads. Both of the presented approaches have their pros and cons. In some cases the synchronous solution is more advisable: when processing speed is a priority and the communication model is simple, with simple requests that can be processed quickly. In other cases, when application reliability, error resistance or a complex communication model has priority, the asynchronous model is superior. Although the asynchronous approach requires a quite different application architecture and consumes more time and resources, it might prove to be more applicable and result in a higher level of reliability and flexibility of the final solution.

Chapter 4

Design patterns in the application integration

A design pattern is a term that has been adopted into software engineering from architecture [1]; it describes a well-known method of solving a commonly occurring problem. In computer science a design pattern should not be perceived as a ready-to-use solution, such as source code, but only as a template that can be used in multiple situations. In this chapter the usage of design patterns in application integration will be discussed and the term design pattern itself will be explained in more detail.

Designing and implementing an application can be a very complex task. Applications may vary in the technologies used, working environment, performed task, complexity, and so on, but very often designers encounter the same problems to solve. Design patterns are a set of proven ways of solving those problems. They do not contain a ready solution that can be put into an application to solve a problem; rather, they are templates that can be used to solve the problem [4].

Design patterns also occur in messaging systems and application integration solutions. Some of the design patterns described in this chapter derive from the basic concepts of the messaging systems described earlier (3). Each of them is an answer to a particular problem, for example, how to connect two applications within the messaging system: by using the Message Channel. As said before, design patterns are only general templates, not ready-to-use solutions. The Message Channel pattern, for example, can be implemented in different ways: as a Datatype Channel, a Point-to-Point Channel, a Publish-Subscribe Channel and so on. Each of those channels performs a different task, but all of them are based on the same Message Channel pattern template.

The simplest usage of this pattern is the Point-to-Point Channel, which connects two different systems directly. If the aim is to deliver the message to more than one receiver at the same time, the Publish-Subscribe Channel can be used. This channel has one input (Publisher) channel and many outputs (Message Endpoints). Each output channel is connected to one system (subscriber). When a message is published, it is replicated and sent to each of the output channels, where it can be consumed by the corresponding subscriber. Each copy can be consumed only once, so after being consumed by every subscriber the message is removed from the channel and will not be processed again.

Another type of pattern that can be used is the Message Router. The concept of the Message Router has already been described, as it is one of the basic concepts connected with systems integration (3.4). The Message Router pattern is a response to the problem of delivering the message to the correct recipient. As in the case of the Message Channel, different variants of Message Routers can be built on the basis of one pattern. Those router variants are as follows:

• Content-Based Router: The Content-Based Router (Figure 4.1) reads the message content and, based on it and on the encoded routing rules, directs the message to the proper recipient.

Figure 4.1: Design Pattern Content-Based Router
(source: Enterprise Integration Patterns [7])

• Message Filter: The Message Filter (Figure 4.2) works in a similar way to the Content-Based Router. It reads the message content and checks whether it matches the encoded criteria. If it does, the message is sent further; if not, the message is discarded.

Figure 4.2: Design Pattern Message Filter
(source: Enterprise Integration Patterns [7])

• Dynamic Router: The Dynamic Router (Figure 4.3) is a more flexible variant of the Message Router. It allows the routing rules to be modified by sending control messages to a given port of the router. This makes it more flexible than a router with fixed routing rules and allows the routing rules to be changed dynamically when such a need arises. It can be useful when a new system is being connected to the messaging solution and all routers in the system have to be updated with new routing rules.

Figure 4.3: Design Pattern Dynamic Router
(source: Enterprise Integration Patterns [7])

• Recipients List: The Recipients List (Figure 4.4) extends the functionality of the Content-Based Router. It works in a similar way to the Publish-Subscribe Channel: it inspects the incoming message, determines the list of message recipients based on the message content, and then forwards the message to those recipients. The list of recipients may vary depending on the message content and can also be specified dynamically.

Figure 4.4: Design Pattern Recipients List
(source: Enterprise Integration Patterns [7])

• Splitter: The Splitter (Figure 4.5) is used when an incoming message contains multiple elements that cannot all be processed in the same way. In that case the message is split into separate elements and each of them is sent independently to an appropriate system to be processed. The Splitter produces one message for each element contained in the incoming message (e.g. if an incoming message contains order data with a list of ordered items, a new message is produced for each item from the list and published to an appropriate channel).

• Aggregator: The Aggregator (Figure 4.6) works in the opposite way to the Splitter described above. The Aggregator receives incoming messages and identifies the ones that are correlated with each other. When the complete set of correlated messages has been received, it aggregates them: it collects the information from each of the correlated messages and publishes a single new message containing all of the collected information.


Figure 4.5: Design Pattern Splitter
(source: Enterprise Integration Patterns [7])

Figure 4.6: Design Pattern Aggregator
(source: Enterprise Integration Patterns [7])

While designing an Aggregator the following things must be set:

1. conditions for correlation (using these conditions the messages will be classified as correlated with each other)
2. completeness condition (when the set of correlated messages is complete)
3. aggregation algorithm (how to process the information from the correlated messages and publish it as one message).

It is worth mentioning that, unlike a Content-Based Router, an Aggregator is stateful, i.e. it remembers each of the incoming messages until the complete set of correlated messages has been received and processed. A simple Content-Based Router is stateless: it only processes the incoming message without any regard to the messages processed earlier (it does not keep information about previously processed messages).

• Routing Slip: The Routing Slip (Figure 4.7) makes it possible to determine the whole processing path for every message. Each incoming message has a routing slip attached to it, specifying the sequence of processing steps for this particular message. Every processing component is wrapped in a special router that reads the routing slip attached to the incoming message and forwards the message to the next processing step from the routing slip. This way, a whole processing chain can be composed and managed from one location. Moreover, a Routing Slip for a new type of message can be defined if necessary.

Figure 4.7: Design Pattern Routing Slip
(source: Enterprise Integration Patterns [7])

• Process Manager: The Process Manager (Figure 4.8) works in a similar way to the Routing Slip, but more dynamically. It forwards the message to the first processing unit and then determines the next processing step based on the processing results from this unit and on the information about the previously executed steps that it stores. The processing path is not fixed, as in the case of the Routing Slip, but is constructed dynamically by the Process Manager.

Figure 4.8: Design Pattern Process Manager
(source: Enterprise Integration Patterns [7])

• Message Broker: The Message Broker (Figure 4.9) is a central component of the integration solution. It connects all integrated systems. Internally it uses the design patterns described before to route the messages effectively between the connected systems. If each pair of integrated systems that need to interact with each other were connected directly, the number of required channels would grow to an unmanageable number. The Message Broker significantly reduces the number of required channels and becomes a central component of the system, where all message routing operations are performed.

Figure 4.9: Design Pattern Message Broker
(source: Enterprise Integration Patterns [7])

As can be observed, many routing components can be created on the basis of the Message Router pattern. Each of them responds to a different kind of need and can be used to solve a particular problem. Those components vary from the simplest ones with fixed routing algorithms to more complicated ones that perform routing dynamically, based on the results of the application processing, and dynamically build a processing path for an incoming message.

Another pattern that is commonly used in integration solutions is the Message Translator. As mentioned before, the Message Translator pattern is used to reformat the data in such a way that it fits the internal data representation model of the other system. Such a need arises very often, as the systems being integrated usually have different internal data representation models. As with the previous patterns, the pattern itself is a base for different variants of Message Translators that can be used in various situations depending on the problem faced. The idea of the Message Translator has already been described in the section devoted to the main concepts of the messaging systems (3.5). Now let us concentrate on the different variants of translators based on the same Message Translator pattern:

• Envelope Wrapper: The Envelope Wrapper (Figure 4.10) wraps the sent data into an envelope in such a way that it fits the message format used by the given messaging system (it adds header and body sections, encryption, etc.). After the message arrives at its destination it is unwrapped by the unwrapper, which reverses any modifications made by the wrapper and passes the data, as it was initially sent by the sender application, for further processing.

Figure 4.10: Design Pattern Envelope Wrapper
(source: Enterprise Integration Patterns [7])

• Content Enricher: The Content Enricher (Figure 4.11) is used when the destination system requires more information than the sender can provide. The Content Enricher adds to the message additional information fetched from an external data source. After this step the message is forwarded to the next processing component.


Figure 4.11: Design Pattern Content Enricher
(source: Enterprise Integration Patterns [7])

Figure 4.12: Design Pattern Content Filter
(source: Enterprise Integration Patterns [7])

• Content Filter: The Content Filter (Figure 4.12) works in the opposite way to the Content Enricher. When an incoming message contains complex information and only a small part of that information is required by the message receiver, a Content Filter removes the unneeded data from the message, leaving only the data needed by the message receiver.

• Normalizer: The Normalizer (Figure 4.13) is a combination of a Message Router and multiple Message Translators. It is used when the integrated systems use different message formats and each of those formats requires a different type of translation in order to fit into the model used by the messaging system. Information can arrive as an XML document, as a plain text file containing comma-separated data fields, as an Excel file, and so on. Each of those formats requires different processing in order to transform it into the format appropriate to the messaging system. When those messages arrive at the Message Router, they are forwarded to the Message Translator responsible for dealing with the particular data format. The range of accepted incoming message formats can easily be widened by adding a new routing rule to the Message Router and connecting the Router to an additional Message Translator by a Message Channel. This way the integration solution can be adapted to a changing business environment and its functionality extended.

Figure 4.13: Design Pattern Normalizer
(source: Enterprise Integration Patterns [7])

The above examples of integration patterns show how a single template can be used to solve different kinds of challenges of the same nature, in this case connecting multiple computer systems. A single template can become the source for various types of components designed to solve different types of problems. Each of those patterns finds an application in designing an integration solution. Using them helps to overcome some of the most commonly encountered problems and makes the designed system more reliable, more flexible and easier to support.
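As an illustration of how two of the patterns from this chapter fit together, the following minimal sketch pairs a Splitter with a stateful Aggregator. It is only a sketch under stated assumptions: the message layout (`order_id`, `total_items`, `item`), the `check_stock` processing step and all function names are hypothetical and do not come from Enterprise Integration Patterns [7]. The three constructor arguments of `Aggregator` correspond to the three design decisions listed for the Aggregator: correlation, completeness condition and aggregation algorithm.

```python
from collections import defaultdict


def split_order(order):
    """Splitter: produce one message per item of the incoming order message."""
    items = order["items"]
    return [
        {"order_id": order["order_id"], "total_items": len(items), "item": item}
        for item in items
    ]


class Aggregator:
    """Stateful Aggregator: keeps partial sets of correlated messages."""

    def __init__(self, correlation_key, is_complete, aggregate):
        self.correlation_key = correlation_key  # 1. conditions for correlation
        self.is_complete = is_complete          # 2. completeness condition
        self.aggregate = aggregate              # 3. aggregation algorithm
        self.partial = defaultdict(list)        # state kept between messages

    def receive(self, message):
        key = self.correlation_key(message)
        group = self.partial[key]
        group.append(message)
        if not self.is_complete(group):
            return None                         # still waiting for more messages
        del self.partial[key]                   # the set is complete, forget it
        return self.aggregate(group)


def check_stock(message):
    # Hypothetical per-item processing performed between Splitter and Aggregator.
    return {**message, "in_stock": message["item"] != "unicorn"}


aggregator = Aggregator(
    correlation_key=lambda m: m["order_id"],
    is_complete=lambda group: len(group) == group[0]["total_items"],
    aggregate=lambda group: {
        "order_id": group[0]["order_id"],
        "all_in_stock": all(m["in_stock"] for m in group),
        "items": [m["item"] for m in group],
    },
)

order = {"order_id": 7, "items": ["pen", "ink", "paper"]}
outputs = [aggregator.receive(check_stock(part)) for part in split_order(order)]
```

Only the last call to `receive` returns a message; the first two return `None` because the completeness condition is not yet met, which is exactly the statefulness that distinguishes the Aggregator from a simple Content-Based Router.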

Chapter 5

Enterprise Service Bus

Having covered the basic theory of integrating computer systems (styles of integration, messaging based systems and the design patterns connected with them), let us move on to the practical usage of these concepts. This chapter describes the integration solution that is currently the most popular and most frequently used in the process of integration: the Enterprise Service Bus (ESB). First, a definition of this concept is provided. Then Message Oriented Middleware (MOM), another integration solution that serves as the basis of the ESB concept, is described. After that, the aims and capabilities of an ESB and finally the design patterns connected with this technology are presented.

5.1 Definition

The Enterprise Service Bus (ESB) is an integration solution that enables integrating systems in a loosely coupled way. It heavily uses open standards such as XML and Web Services. It is based on Message Oriented Middleware, which provides reliable communication using messages. It simplifies creating a computer systems architecture focused on providing business services (services that have a meaning to the business) rather than implementation services (services that have a meaning to the developers).

Currently there is no formal, industry-agreed definition of an Enterprise Service Bus. Many vendors provide products that they claim are ESB solutions, but there is no precise definition of what such a product should contain. One way of explaining what an Enterprise Service Bus is, is to focus on the capabilities that it is able to provide and the advantages of deploying it in a company. This approach is taken in this chapter. According to the Gartner Group, an ESB [14] consists of the following four things:

• Message Oriented Middleware (MOM)
• Web Services
• Intelligent Routing based on content
• XML data transformation

It is worth keeping in mind that the term ESB does not necessarily have to refer to a software product. It may also be considered as:

• a pattern
• an architectural component
• a hardware component (there are devices that have all the capabilities required to be called an ESB solution)

One of the following sections, "ESB components" (5.6), describes the meaning of an Enterprise Service Bus as an architectural component.

5.2 Message Oriented Middleware

Message Oriented Middleware (MOM) is the basis of an Enterprise Service Bus. The main task of a MOM is to provide:

• reliable transport
• an efficient method of communication using messages
• end-to-end reliability

In a typical communication scenario based on Remote Procedure Call (Remote Procedure Invocation), if one of the applications is not available the whole request cannot be performed. The invoking application has to decide what to do next: repeat the attempt in a couple of minutes or return an error stating that the request cannot be performed. In the case of message based communication, the Message Oriented Middleware is responsible for delivering the message to the proper application. The sender application just puts the message into the queue and leaves the responsibility of delivering it to the MOM [9]. If the receiver application is not running, the message is left in the queue until the receiver is up again and fetches it.

The left side of the figure (Figure 5.1) presents a typical RPC based communication scenario, while the right side presents message based communication. Application 1 communicates with Applications 2, 3 and 4. Let us assume that Application 4 is currently unavailable. There might be dozens of reasons for such a situation: the network connection might be broken, Application 4 may be experiencing technical problems, or it may simply be down for maintenance. Application 1 must get data from all three applications, but Application 4 is unavailable. Which action should be taken, repeating the attempt in a couple of minutes or returning an error, is a decision that must be made by Application 1.


Figure 5.1: Overview of a communication based on Remote Procedure Call and Message Oriented Middleware

Let us apply the same scenario to the right side of the figure. Application 5 communicates with Applications 6 and 7. Application 5 does not have to worry about the availability of those applications, because it does not communicate with them directly; it uses their message queues, which are always available. If Application 6 or 7 is currently not working, the messages addressed to it will not be dropped, but will be stored in the queue until the application is up again. Message Oriented Middleware guarantees that eventually every message will reach its destination application. Application 5 does not have to be concerned about it.
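The store-and-forward behaviour described for the right side of Figure 5.1 can be sketched in a few lines. This is an in-memory illustration only, with hypothetical names (`MessageOrientedMiddleware`, `Receiver`, `deliver_pending`); a real MOM would additionally persist the queues and use acknowledgements, which the sketch deliberately leaves out.

```python
from collections import deque


class MessageOrientedMiddleware:
    """Toy MOM that keeps one queue per destination application."""

    def __init__(self):
        self.queues = {}

    def put(self, destination, message):
        # The sender only enqueues; it never talks to the receiver directly.
        self.queues.setdefault(destination, deque()).append(message)

    def deliver_pending(self, destination, receiver):
        """Deliver queued messages while the receiver is available."""
        queue = self.queues.get(destination, deque())
        delivered = 0
        while queue and receiver.available:
            receiver.handle(queue.popleft())
            delivered += 1
        return delivered


class Receiver:
    def __init__(self):
        self.available = False  # the application starts out unavailable
        self.received = []

    def handle(self, message):
        self.received.append(message)


mom = MessageOrientedMiddleware()
app6 = Receiver()
mom.put("app6", "request-1")  # Application 5 sends and moves on
mom.put("app6", "request-2")
delivered_while_down = mom.deliver_pending("app6", app6)
app6.available = True         # Application 6 comes back up
delivered_after_restart = mom.deliver_pending("app6", app6)
```

While Application 6 is down nothing is delivered and nothing is lost; once it is available again, both messages arrive in the order in which they were sent.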

5.3 Tightly coupled interfaces

To better illustrate the overwhelming number of connections in a scenario with tightly coupled interfaces, implemented for example by Remote Procedure Invocation, let us consider the following situation. When every system provides one interface and is connected with every other system, the total number of connections can be computed using the formula [3] $n(n-1)/2$, where $n$ is the number of systems. If there are 5 systems ($n = 5$), the total number of connections is 10. That number of connections is still manageable, but with 10 systems there are 45 connections and with 100 systems there are 4950 connections. Of course, not every system has to be connected with every other one, but the formula shows that this kind of software infrastructure scales very poorly.
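The figures quoted above can be checked with a short calculation. The function `hub_connections` is an added assumption for comparison only: it supposes that, with a central bus or broker, each system needs exactly one connection, to the bus.

```python
def point_to_point_connections(n):
    """Connections needed when each of n systems is linked to every other one."""
    return n * (n - 1) // 2


def hub_connections(n):
    """Connections needed when each of n systems is linked only to a central bus
    (assumption: exactly one connection per system)."""
    return n


for n in (5, 10, 100):
    print(n, point_to_point_connections(n), hub_connections(n))
# prints:
# 5 10 5
# 10 45 10
# 100 4950 100
```

The point-to-point count grows quadratically with the number of systems, while the hub count grows linearly.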

5.4 ESB aims

One of the aims of introducing an ESB is to decouple the client (an application needing access to a service) from the service provider. The left side of the figure (Figure 5.2) shows the typical model of interconnected systems. Component1, in order to connect to Component4, has to know the following:

• Component4 IP address
• Component4 port number
• Component4 protocol and a client of that protocol (EJB)
• Component4 method name and its signature (types of arguments, return type, throws declaration, etc.)

Figure 5.2: Tightly coupled system architecture compared to an Enterprise Service Bus architecture

The same applies to all of the connections in the above figure. Before one component is able to connect to another, it has to know a lot about the other party and also assume that this information will not change. The introduction of the ESB, presented on the right side of Figure 5.2, relieves the client of the need to know who is providing the service, because the ESB is responsible for creating a communication channel between the client and the service providers. That way, the application does not need to contain the integration code, because it is no longer responsible for creating TCP/IP connections, reconnecting in case of a communication error, knowing the connection information (URL, port number) of the service provider and so on. Thus, introducing an ESB simplifies the design of the client.

Furthermore, in a tightly coupled system, any change of the connection data in one of the service providers also requires a change in all of the clients that use this service provider. This is no longer true in the case of an ESB, because it is the ESB that is responsible for storing that information. Let us imagine what would happen if the IP address of Component4 were changed one day. In the tightly coupled scenario, all of the applications connected directly to Component4 (i.e. Component1 and Component5) would have to be updated. In the ESB scenario, only the configuration held by the Enterprise Service Bus would have to be updated.
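The Component4 example can be sketched as follows. The code is a simplification under stated assumptions: `EnterpriseServiceBus`, `Client`, the logical service name `"component4"` and the IP addresses are all hypothetical, and instead of opening a network connection the bus calls a local handler function directly.

```python
class EnterpriseServiceBus:
    """Toy bus that holds the connection data of every service provider."""

    def __init__(self):
        self.endpoints = {}  # logical service name -> (host, port, handler)

    def register(self, service_name, host, port, handler):
        self.endpoints[service_name] = (host, port, handler)

    def invoke(self, service_name, request):
        host, port, handler = self.endpoints[service_name]
        # A real bus would connect to host:port here; the sketch calls the handler.
        return handler(request)


class Client:
    """A client knows only the bus and a logical service name."""

    def __init__(self, bus):
        self.bus = bus

    def run(self):
        return self.bus.invoke("component4", "ping")


def component4(request):
    return f"Component4 handled {request}"


bus = EnterpriseServiceBus()
bus.register("component4", "10.0.0.4", 9000, component4)   # hypothetical address
client1, client5 = Client(bus), Client(bus)
before = client1.run()

# Component4 moves to a new address: only the configuration held by the bus changes.
bus.register("component4", "10.0.0.99", 9000, component4)
after = client5.run()
```

Neither `Client` instance contains an address, a port or a protocol detail, so the address change requires no modification on the client side.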


To sum up, an ESB provides service location transparency, as the client no longer has to know the exact location of the service provider; this is the responsibility of the ESB.

An ESB also enables sharing of services. A service, once implemented, might be used in many projects or by clients from various departments. For example, a service providing information about employees might be used by systems from the HR, financial or IT department.

An ESB also enforces the separation of the business service from the implementation service. Companies are run by business rules and they should be seen from the point of view of the business services they provide. Usually it is the developers who, by creating EJB remote interfaces, Web Service interfaces and remote procedure call methods, define the way in which the company is perceived. An Enterprise Service Bus makes it possible to change that. Because it is able to transform a business request into an invocation of particular implementation services, it enables the business, not the developers, to define the way the company is seen by its business partners and customers. The responsibility of the business is to know how to run a company (what the company's input and output are and what services to provide), while the responsibility of an ESB is to know what the IP address and TCP port of the application server are, what protocol a particular system uses, what the signature of a particular method is, etc. If one day a particular implementation service is replaced by a new one, the way the company does business will not change, so no modifications are forced on the other side (business partner or customer); only the ESB has to be updated. This makes it possible to decouple the business model, i.e. the way the company works, from the implementation.

5.5 ESB capabilities

Describing an ESB by its capabilities is common in the existing literature; another example of such a list can be found in the IBM Redbook on the subject of the Enterprise Service Bus [12]. The capabilities considered here are the following:

1. Message Transformation: the ability to transform the structure and format of a business request into those expected by a service provider. An ESB must be able to change the message format to the format accepted by the service provider.

2. Message Routing: the ability to send a request to a particular service provider based on some criteria. An ESB must be able to decide to which service providers a particular message should be delivered.

3. Security: the ability to protect an ESB from unauthorised access. An ESB has to ensure that only selected clients have access to particular services.

4. Transaction Management: the ability to provide a single unit of work for a business service request by providing a framework for the coordination of multiple resources across multiple services. There is no ESB product that provides transaction management, because an ESB connects multiple different IT systems and it is impossible for it to roll back a distributed transaction after it has been committed by one of the systems. As an example of this problem, a possible scenario might look as follows:

a) The ESB receives a message; according to a set of rules it knows that it has to connect first with System A and then with System B.
b) The ESB communicates with System A.
c) System A sends some messages into a Message Channel and informs the ESB that it has successfully completed the task.
d) The ESB communicates with System B, but System B is unable to perform the request because of internal errors and returns an error.
e) The current situation is that System A has successfully completed the transaction and System B has failed. The ESB cannot roll back the operations (sending messages into a Message Channel) performed by System A; it can only notify System A that this transaction should be rolled back, and how to handle that is left to System A.

To sum up, no ESB provides transaction management, because it is impossible for an ESB to roll back operations performed by one of the integrated systems. An ESB can only provide a framework for transaction management.

5. Message Enhancement: the ability to modify and add information as required by the service provider. It might be necessary for the ESB to add some information to the message before it is delivered to the service provider. That information might come from a database or some other system.

6. Protocol Transformation: the ability to accept messages sent using different protocols, e.g. IIOP, SOAP/HTTP, CORBA. This capability consists of two aspects:

• logical: an ESB must understand the protocol (its semantics and syntax)
• physical: an ESB must have a component suited for operating using that protocol, e.g. an HTTP server, a CORBA client, etc.

7. Service Mapping: the ability to translate a business service into the corresponding service implementation and provide binding and location information. It is a mapping performed by an ESB between an abstract business service and an implementation service (IP address, port, name of the method, etc.).

8. Message Processing: the ability to monitor the state of a received request. For the client sending a message to the ESB, the most important thing is that a sent message is never lost. In order to achieve that, the ESB has to monitor the state of the message: whether it has already been processed by the service provider, whether the processing was successful, whether the service provider is available, etc.

9. Process Choreography: the ability to manage complex business processes that require the coordination of multiple business services to fulfil a single business service request. This functionality enables the client to perceive a business request as one single request, while in fact its execution may trigger the execution of multiple business services. It is usually implemented using the Business Process Execution Language (BPEL), a language for modelling business processes [9].

10. Service Orchestration: the ability to manage the coordination of multiple implementation services. The difference between the previously described process choreography and service orchestration is the type of service being managed: in the case of service orchestration it is an implementation service, in the case of process choreography it is a business service.

One of the reasons for not having a formal definition of the ESB is that every application has different requirements for the capabilities provided by an ESB. For example, in some integration projects a transaction manager or a BPEL engine might not be required, but there might be a need for message routing and message transformation functionality. There are products on the market that do not support transactions and BPEL, but do support message routing and transformation. They are presented as ESB solutions and, most importantly, they would fulfil the requirements of such an integration project. As has been shown, the term ESB has a very broad meaning and is used by software vendors to describe products providing different functionality.
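The transaction scenario a) to e) described under capability 4 can be made concrete with a small sketch. It assumes, purely for illustration, that each integrated system exposes an `execute` and a `compensate` operation; these names, like `SystemA`, `SystemB` and `process`, are hypothetical. The sketch shows the limitation stated above: the bus does not undo System A's work itself, it only notifies System A and leaves the handling to it.

```python
class SystemA:
    def __init__(self):
        self.sent = []         # messages already sent into a Message Channel
        self.compensated = []  # requests System A was asked to roll back

    def execute(self, request):
        self.sent.append(request)  # step c): the work is done and committed
        return True

    def compensate(self, request):
        # How to undo the work is entirely System A's own decision.
        self.compensated.append(request)


class SystemB:
    def execute(self, request):
        raise RuntimeError("internal error")  # step d): System B fails

    def compensate(self, request):
        pass  # never reached in this scenario, present for symmetry


def process(request, systems):
    """Toy ESB logic: call the systems in order, notify completed ones on failure."""
    completed = []
    for system in systems:
        try:
            system.execute(request)
        except Exception:
            # Step e): the bus cannot roll back; it can only send notifications.
            for done in reversed(completed):
                done.compensate(request)
            return False
        completed.append(system)
    return True


system_a = SystemA()
succeeded = process("order-1", [system_a, SystemB()])
```

After the run, the message sent by System A is still recorded in `system_a.sent`; the only effect of the failure on System A is the compensation request it received.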

5.6 ESB components

The term ESB does not have to refer only to a software product; it can also refer to an architectural component. This section is devoted to this meaning of the term.

While designing an integration solution, an ESB should not be perceived from the perspective of one particular piece of software and its capabilities, but rather as a component whose functionality might be provided by multiple products available on the market. This approach to integration does not tie the company deploying an Enterprise Service Bus to one concrete solution, but enables the company to have products from various vendors cooperating. One of the conceptual models of such an ESB architecture is presented in Figure 5.3. It consists of the following logical components:

1. Mediator: The Mediator is the most important component of an ESB. The crucial functionality provided by this component is routing, communication and protocol transformation. A product lacking these cannot be considered an ESB solution. The Mediator is used as the entry point of the ESB: messages sent to the ESB are received and processed by this component. It might also be responsible for message transformation and enhancement. In order to enable reliable and secure processing of requests it must support security, error handling and transaction management.

2. Service registry: The Service registry is a component that provides the service mapping functionality.

3. Choreographer: The role of the Choreographer is to enable process choreography, i.e. the coordination of business processes. This component is actually a client of an ESB. It has the knowledge about the sequence of business services that must be called in order to perform one sophisticated business request. If the Mediator decides (according to its rules) that a particular request needs to be choreographed, it forwards the request to this component. The Choreographer, after looking up its configuration, invokes the proper service providers by sending messages to the Mediator, just like any other ESB client.

4. Rules engine: The Rules engine is an additional component, which may not be required in some integration projects. This component enables rule-based routing. Its functionality includes message routing, message transformation, security and transaction management.

Figure 5.3: Conceptual overview of Enterprise Service Bus components

5.7 Open source ESB products

Among the dozens of ESB products available on the market, two open source solutions with different characteristics are described here:

• Mule is a lightweight messaging middleware framework. It introduces the concept of Universal Message Objects (UMO): plain old Java objects (POJOs) that are responsible for communicating with the service providers and for performing transformation and routing of the messages. They are deployed in a Mule container, a framework enabling communication between endpoints.

• ServiceMix is a JBI-compliant ESB. It consists of the Normalized Message Router with a set of components responsible for protocol transformations and ESB capabilities (rules engine, etc.).

5.8 ESB integration patterns

5.8.1 VETO pattern

Figure 5.4: ESB Design Pattern VETO

VETO (Figure 5.4) stands for Validate, Enrich, Transform, Operate. It is a widely used integration pattern in ESB solutions. The VETO pattern [3] ensures that data exchanged in an ESB will be consistent and valid. Each component in the VETO pattern might be implemented as a separate service, which might be configured and modified independently of any other component.

Validate

The aim of the validate step is to ensure that messages received by the service provider have proper syntax and semantics. This step should be performed independently, not inside the service provider, because that would limit the re-usability of the validation and complicate any further modifications of it. Moreover, implementing validation as a separate component ensures that every message that reaches the service is in a proper format, which simplifies the design of the service provider and lets the Operate step focus on business logic. The simplest way of validating an incoming message is to check whether it is a well-formed XML document and conforms to the XML schema or WSDL, but there are also other possibilities, such as validation scripts.

Enrich

The aim of the enrich step is to add to the message content some additional data needed by the service provider, for example information about the customer who has placed the order. That information might be fetched from a database or might be the result of invoking another service.

Transform

The aim of the transform step is to change the message format to the one accepted by the service provider. This step might transform the message into the internal message format of the service provider, relieving the Operate step of this task and therefore increasing its efficiency.

Operate

The aim of the operate step is to invoke the target service or to interact in some way with the target application.
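The four VETO steps can be illustrated with a minimal Java sketch. It is not taken from any ESB product: the class name VetoPipeline, the field names customerId, productId and customerName, and the semicolon-separated "internal format" are all assumptions made for illustration, and a message is modelled as a plain field map rather than an XML document.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative VETO chain; every name here is hypothetical. */
final class VetoPipeline {

    /** Validate: reject messages that lack the fields the service relies on. */
    static Map<String, String> validate(Map<String, String> msg) {
        if (!msg.containsKey("customerId") || !msg.containsKey("productId")) {
            throw new IllegalArgumentException("invalid order message: " + msg);
        }
        return msg;
    }

    /** Enrich: add customer data; the map stands in for a database or another service. */
    static Map<String, String> enrich(Map<String, String> msg, Map<String, String> customerNames) {
        Map<String, String> out = new HashMap<>(msg);
        out.put("customerName", customerNames.getOrDefault(msg.get("customerId"), "unknown"));
        return out;
    }

    /** Transform: convert to the (assumed) internal format of the service provider. */
    static String transform(Map<String, String> msg) {
        return msg.get("customerName") + ";" + msg.get("productId");
    }

    /** Operate: invoke the target service; here it only acknowledges the request. */
    static String operate(String internalMsg) {
        return "ACCEPTED[" + internalMsg + "]";
    }

    /** Runs the four steps in order: Validate, Enrich, Transform, Operate. */
    static String process(Map<String, String> msg, Map<String, String> customerNames) {
        return operate(transform(enrich(validate(msg), customerNames)));
    }
}
```

Because each step is a separate method with its own input and output, any of them could be replaced or deployed as a separate service without touching the others, which is the property the pattern aims for.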

5.8.2 VETOR pattern

The VETOR pattern [3] is a VETO pattern extended with a new component, the Router, placed right before the Operate step. The aim of this step is to decide whether a message should be delivered to the service or not. The router might be implemented as a part of the transform component or, preferably, as a separate service.

5.8.3 Two-step XRef pattern

In an ESB two types of transformations take place: structure and content transformations. The aim of a structure transformation is to change the format of a message, while the aim of a content transformation is to enrich a message with additional data, usually fetched from a database. This process is usually performed in one step, in which the output of each operation is used as the input to the following one:

• XSLT transformation
• XPath query
• JDBC query
• SQL statement

The concept of the Two-step XRef pattern [3] (Figure 5.5) is to create two separate components, each responsible for only one type of operation:

• XML parsing: XSLT transformation and XPath query
• Database lookup: JDBC query and SQL statement

This approach has several advantages over the previous model:

• better code re-usability: components for XML parsing and database lookup might be used in multiple different projects
• easier and quicker development: both components might be developed simultaneously by different teams
• loose coupling: problems with the database do not affect the operation of the XML parsing component

Figure 5.5: Comparison of the internals of a typical transformer and Two-step XRef transformer
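The separation proposed by the Two-step XRef pattern can be sketched in Java as two independent methods. This is only an illustration under stated assumptions: the class name TwoStepXRef, the element path /order/customerId and the use of a Function in place of a real JDBC/SQL query are hypothetical. The XPath call is the standard javax.xml.xpath API shipped with the JDK.

```java
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Illustrative Two-step XRef transformer; all names are hypothetical. */
final class TwoStepXRef {

    /** Step 1, XML parsing only: extracts the lookup key with an XPath query. */
    static String extractKey(String xml, String xpathExpression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(xpathExpression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("cannot evaluate " + xpathExpression, e);
        }
    }

    /** Step 2, lookup only: the function stands in for a JDBC query with an SQL statement. */
    static String lookup(String key, Function<String, String> crossReference) {
        String value = crossReference.apply(key);
        if (value == null) {
            throw new IllegalStateException("no cross-reference entry for key " + key);
        }
        return value;
    }

    /** Combines both steps; each of them can be developed and tested separately. */
    static String resolveCustomer(String orderXml, Function<String, String> crossReference) {
        return lookup(extractKey(orderXml, "/order/customerId"), crossReference);
    }
}
```

Since the lookup is passed in as a function, a database outage affects only the second step, and the XML parsing step can be reused with a different data source.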


5.8.4 Forward Cache Integration pattern

After introducing an ESB into a company, the need quite often arises for a single application that gathers information from all of the integrated systems and presents it in a coherent form. Portal applications perform such a role. They pull data from multiple sources, such as other systems and databases, and present it in a unified way through web pages. They are very useful, because the user no longer has to look for data in many systems, which might provide various user interfaces (web pages, command line, etc.), might require different credentials and so on. All the information is available in one place.

Despite these advantages, portal applications also introduce new challenges. In order to present data on a web page the portal must be able to get it within a couple of seconds. For many integrated systems it might not be possible to fulfil that requirement. Some applications were designed as terminal applications used by a single user and are not able to handle the throughput required by the portal application. Moreover, systems accessed by the portal application must be available 24/7. Systems that were designed as desktop programs running on a personal computer might not fulfil this requirement, as they might require periodic restarts and are not resistant to hardware problems of the computer on which they are running. Because of geographical separation, data from the integrated systems is available only while there is a network connection between them and the portal system. When the network connection is broken, the portal application is not able to present any data to the user, so a properly working network infrastructure becomes crucial. It is therefore very important to keep in mind that systems which are about to be integrated will face many new challenges and problems that their authors might not have considered.
One of the ESB services that might help with solving some of those problems is the Cache Service. The task of this service is to store the results of service invocations returned by service providers. Combined with the Forward Cache Integration pattern [3], it enables the portal application to obtain data that has already been presented once directly from the Cache Service, even if the system that supplied the data is temporarily unavailable. There are two possible ways to implement this solution:

• using the publish-and-subscribe model: constantly inform the Cache Service about changes in the data
• using message routing: duplicate every response from the integrated system to the Cache Service

Publish-and-subscribe model

In this scenario every change of the data held by an integrated system causes a message with the set of changes to be sent to a message topic. The Cache Service is a subscriber of that topic. This solution is suitable only for small software infrastructures whose systems do not change their data frequently. If multiple integrated systems were constantly changing their data, most of the traffic would be consumed by update messages, leaving the ESB incapable of handling regular messages.

Message routing

This scenario assumes the use of one of the main ESB components, the router. Every response, before getting back to the portal application, is also sent to the Cache Service. In that way the Cache Service holds a copy of every piece of information that has been presented by the portal application, and if an integrated application becomes inaccessible, that information can be supplied by the Cache Service.
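The message routing variant can be sketched in Java as a small router that copies every response into a cache and falls back to that copy when the integrated system cannot be reached. The class name ForwardCacheRouter is hypothetical, the backend system is modelled as a Function, and the convention that an unavailable backend throws IllegalStateException is an assumption of this sketch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative router that duplicates responses to a cache; names are hypothetical. */
final class ForwardCacheRouter {

    /** Plays the role of the Cache Service: last known response per request key. */
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** The integrated system; assumed to return non-null and to throw IllegalStateException when down. */
    private final Function<String, String> backend;

    ForwardCacheRouter(Function<String, String> backend) {
        this.backend = backend;
    }

    /** Returns the live response if possible, otherwise the cached copy (if any). */
    Optional<String> request(String key) {
        try {
            String response = backend.apply(key);
            cache.put(key, response);          // duplicate the response to the cache
            return Optional.of(response);      // and pass it on to the portal
        } catch (IllegalStateException unavailable) {
            return Optional.ofNullable(cache.get(key));
        }
    }
}
```

As in the description above, only data that has been requested at least once can be served while the backend is down; a key that was never requested yields an empty result.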

Chapter 6

Case study: Messaging systems work principles

Having covered the basic concepts required to understand application integration, we now gather them together in the form of a case study that shows how to use them when solving a business problem. The subject of this case study is an imaginary integration task at an international company producing toys. It has been created by the authors of this thesis in order to illustrate the theoretical background presented so far from a more practical point of view.

The company has multiple systems at distant geographical locations. The company headquarters are located in Europe, along with the storage facilities. The factories producing toys are located in Asia. The Logistics System is divided into two parts: one responsible for delivering products from the factories to the storage facilities and a second one responsible for delivering ordered products to the final customers. The company also has an Internet shop, established in order to widen the potential number of customers and make its products available worldwide.

The computer systems infrastructure comprises the following:

• Internet Shop: responsible for the interaction with the user and for placing (and confirming) orders
• Orders Fulfilment System: responsible for fulfilling the orders
• Storage System: responsible for providing information about product supplies
• Pricing System: manages the prices of all products available for purchase
• Loyalty System: responsible for storing information about discounts for those customers who purchase most frequently and/or purchase large quantities of products
• Payment Check System: responsible for checking the status of the payments for ordered products
• Supply System: responsible for sending orders to the factory when the supplies of a given product run low
• Logistics System: responsible for delivering packages with ordered products to the customers
• Postponed Orders Fulfilment System: responsible for fulfilling orders that could not be finalised earlier because of a lack of products in storage

The task at hand is to connect all those systems into one business entity by using a messaging system. The high-level view of the systems after the integration is presented in Figure 6.1. Each of the systems has to have a number of endpoints attached, so that it is capable of sending and receiving messages. Message Channels between the systems must also be set up so that the communication can take place. The connections between the systems must be determined beforehand (there is no point in connecting two systems with a Message Channel if no communication between them will ever occur). During this process the possible locations of additional components, such as Message Routers or Message Transformers, should also be determined. For every router a set of rules must be defined by which the router determines the destination of an incoming message. If there is a Message Transformer, the rules for message transformation must also be defined.

It must also be decided whether the solution will use a synchronous or an asynchronous communication model. The system being the subject of this study should be resistant to communication failures and as flexible as possible. For example, we do not want the Supply System to stop working and accepting requests from the Storage System just because it has not received an acknowledgement of an order from the factory. In this case the most suitable model of communication is the asynchronous one.
The use of this model also means that much more effort must be put into the design and implementation of the solution, but it ensures that the final solution will operate in the desired way. A detailed description of the design and implementation of this case study, covering all possible issues, could easily fill the whole volume of this thesis. Because our goal here is only to give a brief taste of the integration task and show the practical usage of the concepts described previously, we will give only a brief description of the integration solution without focusing on the technical details.

Figure 6.1: Case study systems

First, let us take a closer look at the simplest scenario, involving an order placed through the Internet Shop and paid by credit card, in which there is no need to order the products from the factory and wait for the fulfilment of the order because the products are available in the desired quantity in storage. This scenario is presented in the sequence diagram in Figure 6.2.

The user visits the web site of the Internet Shop, logs in using his/her user name and password, browses through the list of available products, selects the one that he/she is interested in and places an order. After the order has been placed, the Internet Shop sends a request to the Storage System to check whether the selected products are currently available. As mentioned before, in our scenario we assume that the ordered products are available. Otherwise, the Storage System would send a request to order them to the Supply System. The Supply System, once the amount of products needed reaches a specified quantity, would send a request to the factory to have those goods produced and delivered to the storage in Europe, and would then forward the order to the Postponed Orders Fulfilment System.

Figure 6.2: Case study processing sequence

When the Storage System verifies that the requested products are available, it sends the response back to the Internet Shop application. The Internet Shop also sends a request to the Pricing System to check the prices of the products. The Pricing System sends a request to the Loyalty System, which checks whether the user placing the order has any discounts. After receiving the response, the Pricing System calculates the prices and sends the response back to the Internet Shop. Having gathered all the data, the Internet Shop application displays information about the order along with the calculated prices.

When the user confirms the order, it is passed to the Orders Fulfilment System, which manages the fulfilment of placed orders. Depending on the selected payment method, the system either sends a message to the Payment Check System to check whether the payment has been made and the order can be processed further (in the case of a money transfer or payment by credit card), or sends a request directly to the Storage System to prepare the package containing the ordered products (in the case of payment upon delivery). The same request is sent to the Storage System once a confirmation is received from the Payment Check System for the former payment methods.

When the Orders Fulfilment System receives a message from the Storage System that the package is ready for shipment, it sends a message to the user (using the user's e-mail address) that the order has been completed and is ready for shipment. A message is also sent to the Logistics System to add the package to the list of packages to be picked up and delivered on the next day. When the package is sent out to the client, a message is sent to the Orders Fulfilment System to notify it that the package has been sent. Upon receiving this message the system sends a notification e-mail to the user, informing him/her that the package has been sent and stating the approximate delivery time. Sending this notification e-mail concludes the processing of the order.

As can be seen, even in the simplest scenario the interaction between the involved systems is quite complex. Many systems are involved in exchanging different types of information. It is worth keeping in mind that the data being sent is subject to many changes and much preprocessing before it can be consumed by the next system in the processing chain (different data formats, internal data models of the applications, and so on).

Designing the integration solution for this system would require the use of all of the components explained in the previous chapters, i.e. Message Routers, Message Translators and Message Endpoints. The Message Routers could be used to determine the destination of a message when one system can send messages to different receivers. The Message Translators could be used to translate the data contained in those messages so that it fits the internal data model of the receiving system. The schema of the systems integrated by the messaging system and incorporating the elements mentioned above is presented in Figure 6.3. Although the diagram in Figure 6.3 may look simple and straightforward, it is in fact an example of a badly designed, tightly coupled software infrastructure. One of the disadvantages of this solution is the unmanageable number of message channels (depicted in the figure as arrows).
The major disadvantage, however, is that an application, in order to communicate with other applications, must know a lot of details about them, such as message channel addresses and message formats. Moreover, every time a message format changes in one application, all applications communicating with that application also have to be updated. For example, if the Storage System changes the format of the date field, applications such as the Logistics System and the Orders Fulfilment System would also have to change the format of the messages that they send to the Storage System.

Figure 6.3: Case study: Message Channel scenario

The solution to these problems might be the use of integration patterns: the Message Router (3.4) and the Message Transformer (3.5). Figure 6.4 presents a new architecture utilising those concepts. Although the number of message channels has increased, this solution enables greater decoupling of the applications. The knowledge about the format of messages accepted by a system is no longer hard-coded inside each application, but is delegated to a new, intermediary component, the Message Transformer (depicted in the figure as the letter T). Likewise, decisions about the routing of messages are not taken by each application, but by the Message Router (depicted in the figure as the letter R). This approach introduces a greater degree of loose coupling: the message format of each application might be changed independently of the others.

Figure 6.4: Case study: Message Channel with Router and Translator scenario

Each system shown in Figure 6.4 has been enriched with endpoints that enable communication between the system and the messaging system. The number of endpoints corresponds to the number of channels from which the given system can receive messages or to which it can send them.

Message Routers are used to direct messages to their destinations based on given business rules (e.g. the name of the destination system placed in the message header). The router connected to the Internet Shop channel decides whether to send a message to the Storage System (to check the availability of a product), to the Pricing System (to get the prices for a given product) or to the Orders Fulfilment System, which passes it (a message containing information about a placed order) on for further fulfilment. The router connected to the Orders Fulfilment System channel routes messages to the Storage System (a request to create the package for shipment), to the Payment Check System (to check whether the payment for the ordered products has been made) or to the Internet Shop (notification messages about the various stages of order fulfilment).

Message Transformers are used to transform messages into the format readable by the recipient system (e.g. messages sent from the Storage System to the Internet Shop need to be transformed by the Transformer component in order to be read correctly by the Internet Shop). Let us assume that the Orders Fulfilment System stores information in such a way that the receiver's name and surname are combined in the field receiver and the address is stored in the field package destination. The role of the transformer would then be to extract the receiver's name and surname from the incoming message and put them in the field receiver, and to extract the destination address and put it in the field package destination. Only after those transformations can the message be sent on to the Orders Fulfilment System. After those changes it is assured that the data will be read and interpreted correctly by the system and will not cause any errors during processing.

The problem of the unmanageable number of message channels is resolved by the introduction of an Enterprise Service Bus (Figure 6.5). In this scenario, each of the systems communicates only with the ESB, which is responsible for message transformation and routing. This approach simplifies the development and management of applications, because there is only one message channel between each application and the ESB. Moreover, an Enterprise Service Bus provides a wide range of adapters (components for accessing the ESB), which removes the need for an application to know the details of communication with the ESB. An Enterprise Service Bus incorporates implementations of multiple integration design patterns, such as the Message Channel (3.3), the Message Router (3.4), the Message Transformer (3.5), the Message Endpoint (3.6), etc., but the internals of those implementations are hidden from the applications using the ESB. This simplifies their design and makes the process of integration easier.

Figure 6.5: Case study: Enterprise Service Bus scenario

The example described above should give a good overview of the practical application of the concepts presented in the previous chapters. Using the most basic components, such as a Message Router and a Message Translator, a complex integration solution might be designed.
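The routing and transformation rules used in this case study can be sketched in a few lines of Java. The sketch is illustrative only: the header name destination, the channel names (storage.in and so on), the source fields name, surname and address, and the target field name packageDestination are assumptions, since the thesis does not define concrete message formats.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative router and transformer for the case study; all names are hypothetical. */
final class CaseStudyRouting {

    /** Message Router: chooses the channel from the destination name in the header. */
    static String route(Map<String, String> headers) {
        String destination = headers.get("destination");
        if (destination == null) {
            return "invalid.messages";
        }
        switch (destination) {
            case "StorageSystem":
                return "storage.in";
            case "PricingSystem":
                return "pricing.in";
            case "OrdersFulfilmentSystem":
                return "orders.in";
            default:
                return "invalid.messages"; // unknown receivers go to a separate channel
        }
    }

    /** Message Transformer: maps an Internet Shop message to the Orders Fulfilment System fields. */
    static Map<String, String> toOrdersFulfilmentFormat(Map<String, String> shopMessage) {
        Map<String, String> out = new HashMap<>();
        // Name and surname are combined in the single field "receiver".
        out.put("receiver", shopMessage.get("name") + " " + shopMessage.get("surname"));
        // The address is stored under the receiver-side field name.
        out.put("packageDestination", shopMessage.get("address"));
        return out;
    }
}
```

Neither method is referenced by the sending application: a change of the receiver's field names affects only the transformer, and a new receiver affects only the router.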

Chapter 7

Implementation

This chapter covers the details of the implementation of the ESB platform named pESB, which was developed by the authors as an integral part of this Master's thesis. First, the basic concept behind the product, its technology and its architecture are described. Next, its internal operation is presented and illustrated with a sequence diagram. Having covered the essential information, we introduce a real-life example, along with source code snippets, which familiarises the reader with the process of configuring the ESB. Finally, the problems that occurred during the implementation are presented.

7.1 The origin of the name

pESB is an acronym for Polish-Japanese Institute of Information Technology Enterprise Service Bus. Because the authors of pESB belong to a group of busy people that do not want to waste time typing long names, the name pESB will be used throughout this chapter.

7.2 Concept

Being aware of our limited time and resources, we decided to create an integration solution that would be, first of all, lightweight and easy to use. We knew that we would not be able to create a tool that could compete with integration products that have been on the market for years. Thus, we decided to create an integration product that provides simple functionality but has an architecture that enables easy scalability and further development. The main concept of this approach to an ESB solution was to base the implementation on standards and technologies already available on the market, such as Java Enterprise Edition (EJB, JMS), XML, Web Services, etc. This approach differs from that of other existing ESB solutions. pESB takes advantage of the services provided by the application server, such as security management, transaction management, pool and resource management, etc. It does not come with its own message queue implementation. In our opinion this should be treated as an advantage, because users are not forced to use any particular solution: a company thinking about integration solutions usually already has an infrastructure that can be used. Furthermore, our solution does not reinvent the wheel. The products providing those features have been available on the market for a long time, and their authors have certainly had to overcome many problems and solve many issues that appeared while customers were using their products.

pESB was designed as a multi-tier application (Figure 7.1). The basis of this solution is the Message Bus. Its main responsibility is to provide a reliable method of communication between multiple points using data channels. The next layer is the application server, which provides many facilities that an application running in a container can take advantage of:

• Transaction management: the container is responsible for starting transactions, committing them, and rolling them back automatically in case of an error
• Security management: authentication and authorisation are done by the container
• Pool management: the container is also responsible for managing pools of Enterprise Java Beans instances and pools of connections to resources (allocating new ones, removing unused ones)
• Resource management: the container provides an application with connectivity to resources such as databases or JMS queues
• Multi-threading: the container is responsible for starting threads and monitoring their work

pESB, as an application running in the container, can take advantage of all the facilities mentioned above.

7.3 Technology

pESB is an enterprise application using a set of state-of-the-art Java technologies:

• Java 6: runtime Java source code compilation
• EJB 3.0: no XML descriptors, the whole configuration written in annotations

pESB has been implemented on the GlassFish V2 application server, but since it uses no XML descriptors specific to that server, porting pESB to any other EJB 3.0-compliant application server should not require much effort.


Figure 7.1: Integration architecture layers

7.4 Classification

According to the classification presented in the Forrester Wave report on Enterprise Service Bus products available on the market [13], customer requirements can be divided into two segments:

• “keep it simple”
• “I want it all now”

For the first group of customers the most important thing is that the solution be as simple as possible in order to enable low-cost integration. It should also have a plug-in architecture to enable easy customisation to the customer's needs. Products for the second group, on the other hand, should feature a wide range of additional services such as business process management, process simulation, monitoring or optimisation. pESB is aimed at customers from the first segment.


7.5 Architecture

The main focus in the development of pESB has been put not on the optimisation of particular methods or functionalities but on an architecture that enables scalability, distribution, reliability and security of the solution. One of the ways to achieve that goal was to prefer asynchronous over synchronous processing. The internal communication between the components of pESB is done using asynchronous messages (Figure 7.2). A message processed by one component is put into the queue of the next component, and every message is persisted. The combination of those two factors provides scalability and reliability. Message queues allow multiple consumers to run on different computers, so the number of consumers may change dynamically: consumers can be added and removed on the fly. pESB can therefore be adjusted to growing message volumes simply by adding new application servers.

Figure 7.2: pESB architecture

The overall process of communication is depicted in Figure 7.3. The client communicates with the ESB using an Agent, which is a software component providing an API to access pESB. From the point of view of the clients (depicted in the figure as System1 and System2) the whole complexity of message-based communication, asynchronous processing, XML documents, etc. is hidden. Besides providing an interface for receiving and fetching messages, pESB also provides a means of dynamic configuration (the Config interface) through a Java API.

Figure 7.3: pESB communication architecture

7.6 Processing sequence

In the presented processing sequence (Figure 7.4) there are three participants:

• a Sender system, named System1
• the pESB
• a Receiver system, named System2

Figure 7.4: pESB processing sequence

System1 communicates with System2. Both systems use agents to communicate with the integration platform. The agent is a software component responsible for all communication with pESB. The details of that communication, such as whether it is done using EJB or SOAP/HTTP, are hidden from the clients (System1 and System2). Moreover, to simplify the design of the System1 and System2 clients, they operate on Data Transfer Objects (DTO), not on XML documents. The following actions take place when System1 sends a message to pESB and System2 is set as the receiver of that message:

1. System1 invokes the send method of the Agent, passing as an argument the Data Transfer Object that it would like to send to the ESB.
2. System1's Agent receives the DTO and creates an XML message out of the DTO content, then invokes a method of the remote interface of pESB using EJB and passes that XML document as an argument.
3. The ReceiverBean in pESB receives the XML document, puts it into an input queue and returns information about the status of that operation to the Agent.
4. After some time...
5. The Message-Driven Bean named TransformerBean in pESB fetches the message from the queue and asks the Transform and routing engine to perform transformation and routing of that message. Then:
   • If the message has been fully processed by pESB and is about to be routed to the destination system, the bean sends it to the output queue.
   • If not, the bean sends it to the input queue once again.
6. After some time...
7. System2 wants to fetch new messages, so it invokes the receive method of the Agent.
8. System2's Agent invokes a method of the remote interface of pESB using EJB.
9. The FetcherBean in pESB checks whether there are any new messages waiting in the output queue for System2. If there is a new message, the bean fetches it and returns it to System2's Agent.
10. System2's Agent creates a new Data Transfer Object using the content of the XML document and returns that new object to System2.

It is worth mentioning that all the operations performed inside the Transform Routing Unit are performed in a transaction. This means that in case of an error no message will be lost.
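The queue-based flow in the steps above can be imitated in plain Java without an application server. The following sketch is not pESB code: the class MiniBus and its methods are hypothetical stand-ins for the ReceiverBean, TransformerBean and FetcherBean, in-memory queues replace persistent JMS queues, the agents' DTO-to-XML conversion is left out, and a message is assumed to need only a single transformation pass.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.UnaryOperator;

/** Illustrative in-memory imitation of the processing sequence; names are hypothetical. */
final class MiniBus {

    /** Input queue entries: {receiver system name, XML text}. */
    private final Queue<String[]> inputQueue = new ArrayDeque<>();

    /** One output queue per receiving system. */
    private final Map<String, Queue<String>> outputQueues = new HashMap<>();

    /** Stands in for the transformation and routing engine. */
    private final UnaryOperator<String> transformation;

    MiniBus(UnaryOperator<String> transformation) {
        this.transformation = transformation;
    }

    /** Step 3: accept an XML message, enqueue it and report the status. */
    boolean receive(String receiver, String xml) {
        return inputQueue.offer(new String[] {receiver, xml});
    }

    /** Step 5: take one message, transform it and move it to the receiver's output queue. */
    boolean processOne() {
        String[] entry = inputQueue.poll();
        if (entry == null) {
            return false; // nothing to process
        }
        outputQueues.computeIfAbsent(entry[0], k -> new ArrayDeque<String>())
                    .add(transformation.apply(entry[1]));
        return true;
    }

    /** Step 9: return the next message waiting for the given system, or null. */
    String fetch(String system) {
        Queue<String> queue = outputQueues.get(system);
        return queue == null ? null : queue.poll();
    }
}
```

The sender returns as soon as receive has enqueued the message, and the receiver sees it only after processOne has run, which mirrors the "After some time..." steps of the sequence.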

7.7 Configuration

All configuration is done at runtime: no restart or reload is required after changing the configuration, and no XML file editing is required. The whole configuration can be changed using a Java API. Furthermore, even Java code provided by the user is compiled on the fly, and the functionality it provides is available right away.

Each component in pESB, whether a router or a transformer, can be configured in multiple ways. For example, a transformer can be configured using either an XSLT file or Java code; it is up to the user which method to choose. In the case of Java code, the source code provided by the user is compiled on the fly during the configuration process. This approach has significant advantages over dynamic execution of the code. The compilation process can detect many errors which, in the case of dynamic execution, would be found only at runtime. Moreover, the user can develop a component in an IDE of his/her choice (Eclipse, NetBeans, IDEA, etc.), which helps to avoid common programming mistakes such as misspelled variable or method names, wrong types and so on.

Configuring pESB through a Java API provides, we believe, the greatest flexibility, because it does not dictate how the configuration data is stored. It is possible to keep the configuration in an XML file, an LDAP directory, a relational database, etc. The only thing that must be done is to create an import tool that reads the data from the particular medium (XML, LDAP, database), parses it and invokes the proper methods of the Java API. This approach also makes it possible to place additional logic in the import tool itself, to be executed before the configuration Java API is called. Furthermore, it is easier to modify an import tool (for XML or a database) than the system itself (pESB).
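The XSLT option mentioned above can be illustrated with the standard javax.xml.transform API of the JDK. The sketch below is not the pESB transformer: the class name XsltMessageTransformer is hypothetical, and stylesheets and messages are passed as strings for brevity. It does show one property discussed in this section, namely that a faulty stylesheet is rejected when the transformer is configured rather than when the first message arrives.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustrative XSLT-configured message transformer; the class name is hypothetical. */
final class XsltMessageTransformer {

    /** Compiled stylesheet; Templates objects can be reused for many messages. */
    private final Templates templates;

    /** Compiles the stylesheet once, so errors surface at configuration time. */
    XsltMessageTransformer(String stylesheet) {
        try {
            templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(stylesheet)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("invalid stylesheet", e);
        }
    }

    /** Applies the compiled stylesheet to one XML message. */
    String transform(String xml) {
        try {
            Transformer transformer = templates.newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("cannot transform message", e);
        }
    }
}
```

A new stylesheet can be put into effect at runtime simply by constructing a new XsltMessageTransformer and swapping it in, without restarting anything.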

7.7.1 Configuration model

The model of pESB configuration (Figure 7.5) incorporates two components:
• System
• Transform Routing Unit (TRU)


The System represents a system taking part in the communication. Every System has an input, responsible for receiving messages from the application and forwarding them to the pESB, and an output, responsible for receiving messages from the pESB and forwarding them to the application. The Transform Routing Unit represents a special unit responsible for transformation and/or routing. Every TRU has an input and an output, each of which can be either another TRU or a System. This approach gives great flexibility to the person responsible for configuring pESB for particular integration needs: a message from the output of one TRU can be sent to the input of another TRU. Moreover, this unit can be considered an independent component, with an input, an output and business logic responsible for transformation and routing, which enables reuse of that component in different integration projects or by multiple systems.

Figure 7.5: pESB configuration model
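The configuration model can also be written down as a small object model. The sketch below is hypothetical: the class names (ConfigurationModelSketch, Node, SystemNode, TruNode) and the helpers connect and path do not come from pESB and only mirror the description above, in which a TRU's input and output may each be another TRU or a System.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory version of the configuration model; not the pESB classes. */
class ConfigurationModelSketch {

    /** Anything a link can point at: a System or a Transform Routing Unit. */
    static abstract class Node {
        final String name;
        Node(String name) { this.name = name; }
    }

    /** A system taking part in the communication. */
    static final class SystemNode extends Node {
        TruNode input;  // TRU that receives what the application sends into pESB
        TruNode output; // TRU whose messages are delivered to the application
        SystemNode(String name) { super(name); }
    }

    /** A Transform Routing Unit; its input and output may be a TRU or a System. */
    static final class TruNode extends Node {
        Node input;
        Node output;
        TruNode(String name) { super(name); }
    }

    /** Links a TRU to the node that follows it and records the reverse link too. */
    static void connect(TruNode from, Node to) {
        from.output = to;
        if (to instanceof TruNode) ((TruNode) to).input = from;
        if (to instanceof SystemNode) ((SystemNode) to).output = from;
    }

    /** Lists the names of the nodes a message passes through, starting at the source system. */
    static List<String> path(SystemNode source) {
        List<String> names = new ArrayList<>();
        Set<Node> seen = new HashSet<>(); // guards against accidental cycles
        names.add(source.name);
        Node current = source.input;
        while (current instanceof TruNode && seen.add(current)) {
            names.add(current.name);
            current = ((TruNode) current).output;
        }
        if (current instanceof SystemNode) names.add(current.name);
        return names;
    }
}
```

Chaining two TRUs between two Systems and calling path on the source System lists the units in the order a message would visit them.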

7.7.2 Configuration example

To illustrate the process of configuring pESB, let us present an example configuration. The example is based on the case study described in the chapter “Case Study” (6). Because the case study is complex and involves multiple systems, only an excerpt of the whole integration solution is presented in this section. The presented integration solution involves only the following systems:
1. Internet Shop: responsible for the interaction with the user and for placing (and confirming) orders
2. Orders Fulfilment System (Order System): responsible for fulfilling the orders
3. Storage System: responsible for providing information about product supplies

One of the requirements of the proposed solution is that it must make an application independent of other applications’ message formats. A change in the message format of one application must not force a change in the message formats of the applications communicating with it. For example, a change of the Internet Shop message format should not force a change in the Orders Fulfilment System message format. Moreover, the business logic that decides where a particular message should be forwarded must be separated from the application. This logic should be provided by a component that can be changed independently of any application. The above requirements can be fulfilled by the pESB using the following configuration (Figure 7.6):
1. Systems, the input and output of the messages:
   • InternetShop
   • OrderSystem
   • StorageSystem
2. Transform Routing Units, responsible for transformation of the message format:
   • InternetShop
   • OrderSystem
   • StorageSystem
3. A Transform Routing Unit named Router, responsible for routing the messages based on business rules
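The division of work in this configuration can be sketched in a few lines. Everything below is an assumption made for illustration: messages are modelled as key/value maps instead of XML, the field names (productId, product, item, sku, type) are invented, and the routing rule is a placeholder for real business rules. The point is only that each translator knows one application format plus a shared canonical format, while the Router sees the canonical format alone.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of translator TRUs around one central router TRU. */
class NormalizerSketch {

    /** Builds a translator that renames one field and leaves the others untouched. */
    static UnaryOperator<Map<String, String>> rename(String from, String to) {
        return message -> {
            Map<String, String> copy = new HashMap<>(message);
            if (copy.containsKey(from)) {
                copy.put(to, copy.remove(from));
            }
            return copy;
        };
    }

    /** Translator for the Internet Shop: shop format to canonical format. */
    static final UnaryOperator<Map<String, String>> SHOP_TO_CANONICAL = rename("productId", "product");

    /** Translators for the receiving systems: canonical format to system format. */
    static final Map<String, UnaryOperator<Map<String, String>>> CANONICAL_TO_SYSTEM = new HashMap<>();
    static {
        CANONICAL_TO_SYSTEM.put("OrderSystem", rename("product", "item"));
        CANONICAL_TO_SYSTEM.put("StorageSystem", rename("product", "sku"));
    }

    /** Router: decides the destination by looking only at the canonical message. */
    static String route(Map<String, String> canonical) {
        return "order".equals(canonical.get("type")) ? "OrderSystem" : "StorageSystem";
    }

    /** Sends a shop message through translator, router and destination translator. */
    static Map.Entry<String, Map<String, String>> send(Map<String, String> shopMessage) {
        Map<String, String> canonical = SHOP_TO_CANONICAL.apply(shopMessage);
        String destination = route(canonical);
        Map<String, String> delivered = CANONICAL_TO_SYSTEM.get(destination).apply(canonical);
        return new AbstractMap.SimpleEntry<>(destination, delivered);
    }
}
```

If the Internet Shop renamed its productId field, only SHOP_TO_CANONICAL would change in this sketch; the router and the other translators would stay as they are.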

7.7.3 More on transformers and routers

The component configuration is designed as a plug-in architecture, which means that it is possible to extend the basic set of configuration methods with new ones. Transformer and router are actually names of interfaces; thus, in order to create a new method of configuration, one must simply create a new class that implements the interface. pESB comes with two implementations of the transformer interface (Java code and an XSLT file) and one implementation of the router interface (Java code). This feature lets the user decide which technology will be used to configure a particular component.
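A plug-in arrangement of this kind might look as follows. The actual pESB interfaces are not reproduced in this text, so MessageTransformer and its single method are assumptions, as are the class names; only the javax.xml.transform calls are the standard Java XSLT API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical stand-in for the transformer interface; the real signature may differ. */
interface MessageTransformer {
    String transform(String xml);
}

/** Plug-in 1: a transformer configured with an XSLT stylesheet. */
class XsltMessageTransformer implements MessageTransformer {
    private final Templates templates;

    XsltMessageTransformer(String xslt) {
        try {
            // Parsing the stylesheet up front reports stylesheet errors at configuration time.
            templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("invalid stylesheet", e);
        }
    }

    @Override
    public String transform(String xml) {
        StringWriter out = new StringWriter();
        try {
            templates.newTransformer()
                    .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
        return out.toString();
    }
}

/** Plug-in 2: a transformer written directly in Java. */
class UpperCaseTransformer implements MessageTransformer {
    @Override
    public String transform(String xml) {
        return xml.toUpperCase();
    }
}

class TransformerPluginDemo {
    /** Sample stylesheet that extracts the id attribute of an order element as text. */
    static final String ORDER_ID_XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/order\"><xsl:value-of select=\"@id\"/></xsl:template>"
            + "</xsl:stylesheet>";

    public static void main(String[] args) {
        MessageTransformer[] plugins = {
            new XsltMessageTransformer(ORDER_ID_XSLT), new UpperCaseTransformer()
        };
        for (MessageTransformer plugin : plugins) {
            System.out.println(plugin.transform("<order id=\"7\"/>"));
        }
    }
}
```

Code that uses a transformer depends only on the interface, so a third configuration method could be added as one more implementing class.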


Figure 7.6: pESB configuration example

7.7.4 Performing the configuration

The process of configuring pESB consists of two steps:
• creating Transform Routing Units (with Routers and Transformers) and Systems
• creating connections between Systems and TRUs

The following snippet of code creates a Transform Routing Unit object with an XSLT Transformer:
    String xsltCode = "<?xml ...";
    TransformerDTO transformerDTO = new TransformerDTO(TransformerDTO.TYPE_XSLT);
    transformerDTO.setParameter(TransformerDTO.CODE, xsltCode);
    // routerDTO is assumed to have been created beforehand; its construction is not shown here
    TransformRoutingUnitDTO truDTO =
        new TransformRoutingUnitDTO("OrderSystem", transformerDTO, routerDTO);

Once the Transform Routing Unit object has been created (variable truDTO), it must be registered with pESB:
    // bean is the EJB remote interface of the pESB configuration bean
    int result1 = bean.registerTransformRoutingUnit(truDTO);

The variable result1 contains the status of the registration: 0 in case of success and -1 in case of an error. After creating the Transform Routing Units, the Systems must be created:
    SystemDTO systemDTO = new SystemDTO("OrderSystem");
    int result2 = bean.registerSystem(systemDTO);


The next step in the process of configuration is creating links between Transform Routing Units and Systems. The following snippet creates the appropriate connections:
    int result3 = bean.setTransformRoutingUnitInput(truDTO, systemDTO);
    int result4 = bean.setTransformRoutingUnitOutput(truDTO, systemDTO);
    int result5 = bean.setSystemOutput(systemDTO, truDTO);
    int result6 = bean.setSystemInput(systemDTO, truDTO);

It is worth mentioning that the links should be created on both sides, which in the context of the above example means that the OrderSystem output and input must be connected with the TRU, and the TRU input and output must be connected with OrderSystem.
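Section 7.7 mentions that an import tool can read the configuration from any medium and call the configuration API. A minimal sketch of such a tool, under several assumptions, is shown below. ConfigurationApi is a simplified, hypothetical stand-in for the pESB configuration bean (it takes names instead of DTOs), the property keys systems, trus and links are invented for this example, and only the convention of returning 0 for success and -1 for an error is taken from the text above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical, simplified stand-in for the configuration API (0 = success, -1 = error). */
interface ConfigurationApi {
    int registerSystem(String name);
    int registerTransformRoutingUnit(String name);
    int connect(String from, String to);
}

/** Import tool that reads a properties document and replays it against the API. */
class PropertiesImportTool {

    static void importFrom(String text, ConfigurationApi api) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable configuration", e);
        }
        // Step 1: create the Systems and the Transform Routing Units.
        for (String name : list(p, "systems")) {
            check(api.registerSystem(name), "system " + name);
        }
        for (String name : list(p, "trus")) {
            check(api.registerTransformRoutingUnit(name), "TRU " + name);
        }
        // Step 2: create the connections, written as "from>to".
        for (String link : list(p, "links")) {
            String[] ends = link.split(">");
            if (ends.length != 2) {
                throw new IllegalArgumentException("malformed link: " + link);
            }
            check(api.connect(ends[0].trim(), ends[1].trim()), "link " + link);
        }
    }

    /** Splits a comma-separated property value, ignoring empty entries. */
    static List<String> list(Properties p, String key) {
        List<String> values = new ArrayList<>();
        for (String value : p.getProperty(key, "").split(",")) {
            if (!value.trim().isEmpty()) values.add(value.trim());
        }
        return values;
    }

    /** Turns the integer status convention into an exception. */
    static void check(int status, String what) {
        if (status != 0) throw new IllegalStateException("failed to configure " + what);
    }
}

/** Test double that records the calls it receives. */
class RecordingApi implements ConfigurationApi {
    final List<String> calls = new ArrayList<>();
    public int registerSystem(String name) { calls.add("system " + name); return 0; }
    public int registerTransformRoutingUnit(String name) { calls.add("tru " + name); return 0; }
    public int connect(String from, String to) { calls.add("link " + from + ">" + to); return 0; }
}
```

Switching to another storage medium would mean replacing only the parsing part of importFrom; the calls made against the API would remain the same.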

7.8 Integration design patterns supported by pESB

This implementation of the Enterprise Service Bus utilises the following integration design patterns:
• Message (3.1): a data record used to exchange data between an application and pESB, and also between internal components of pESB
• Message Channel (3.3): in order to provide reliability (the guarantee that no message will ever be lost), all the communication between internal components of the pESB is performed using JMS Queues
• Message Router (3.4) and Message Translator (3.5): this functionality is provided by a user-defined Router and Transformer, which are part of a Transform Routing Unit (7.7.1). Because pESB does not put any restrictions on the operations that might be performed on the received message inside those components, many variants of those two integration design patterns can be created, such as the following:
  – Fixed Router, Content-Based Router, Context-Based Router (3.4)
  – Message Filter, Recipients List (4)
  – Content Enricher, Content Filter (4)

Despite the simplicity of this integration solution, it is also possible to create other, more complex integration design patterns, such as the following:
• Normalizer (4): this design pattern has actually been used to solve the case study integration problem (7.7.2); the solution consists of multiple Transform Routing Units performing the role of Message Translators and one central Transform Routing Unit performing the role of a Message Router
• Message Broker (4): pESB itself might be perceived as a Message Broker pattern. pESB is a central component of the system where all message routing operations are performed. It also reduces the number of message channels, because an application needs only one message channel, to the pESB, in order to communicate with other systems.
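The way several routing patterns can fall out of one unrestricted router component can be sketched as follows. The contract assumed here, a router that returns the list of outputs a message should be forwarded to, is hypothetical and is not taken from pESB; RouterSketch, RouterVariants and the matching rules are likewise invented for the illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical router contract: the names of the outputs that should receive the message. */
interface RouterSketch {
    List<String> route(String message);
}

class RouterVariants {

    /** Fixed Router: always the same destination. */
    static RouterSketch fixed(String destination) {
        return message -> Collections.singletonList(destination);
    }

    /** Content-Based Router: the destination depends on what the message contains. */
    static RouterSketch contentBased() {
        return message -> Collections.singletonList(
                message.contains("<order") ? "OrderSystem" : "StorageSystem");
    }

    /** Message Filter: forwards matching messages and silently drops the rest. */
    static RouterSketch filter(String keyword, String destination) {
        return message -> message.contains(keyword)
                ? Collections.<String>singletonList(destination)
                : Collections.<String>emptyList();
    }

    /** Recipients List: every listed destination receives a copy. */
    static RouterSketch recipientList(String... destinations) {
        return message -> Arrays.asList(destinations);
    }
}
```

Under this assumption each pattern is just a different function from message to destinations, which is why no dedicated component per pattern is needed.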

7.9 Problems encountered during the implementation

During the implementation phase we realised that implementing the reception of messages in the push mode is a very complex task, mainly because such a method requires a distributed transaction. After notifying the client that there is a pending message, pESB would have to wait until the client fetches that message. This would have to be done in a transaction, because the client may be experiencing problems and may not be able to fetch the message (or there might be a communication timeout or some other problem). In such situations the message should be returned to the queue. Bearing in mind that pESB is an application running in an EJB container, where transactions are handled automatically by the transaction manager, this could be a challenging task.

Moreover, the notification mechanism would require pESB to connect to the client. In order to facilitate this, the client would have to keep a port open on which pESB could reach it. That in turn would put additional requirements on the client, not only on its design but also on its configuration (e.g. it would not be possible to use the client behind a firewall). The solution to the above might seem to be initialising the connection to the client from the pESB side, but that would complicate pESB and would require it to have a mechanism for reinitialising the connection after it has been broken because of problems experienced by the client. Again, pESB is an application running in an EJB container, which hinders manual (performed by the application) management and sharing of connections, because resource management is done by the application server and not directly by the application; this makes that solution very difficult to implement.

Despite the many advantages of the push mode approach, we have decided to implement the pull mode.
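The pull mode chosen above can be sketched from the client's point of view. The interface FetcherSketch is a hypothetical stand-in for the remote call that reaches the FetcherBean, and the polling parameters are arbitrary. Because the client starts every exchange in this sketch, it needs no open port of its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the remote fetch call; returns null when nothing is waiting. */
interface FetcherSketch {
    String fetch(String systemName);
}

class PullClientSketch {

    /**
     * Asks for pending messages up to maxPolls times. After an empty answer the
     * client pauses before asking again; after a message it asks again at once.
     * Returns the messages received, in order.
     */
    static List<String> poll(FetcherSketch fetcher, String systemName, int maxPolls, long pauseMillis) {
        List<String> received = new ArrayList<>();
        for (int i = 0; i < maxPolls; i++) {
            String message = fetcher.fetch(systemName);
            if (message != null) {
                received.add(message);
                continue;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                break;
            }
        }
        return received;
    }
}
```

The price of this simplicity is latency: a message waits in the queue until the next poll, so the pause length trades responsiveness against load on the server.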

Chapter 8

Summary

The integration problem has been well known since the 1970s. Since that time many technologies and integration styles have been in use; some of them have been described in chapter 2. The latest approach to the integration problem is an Enterprise Service Bus, based on the concept of messages and asynchronous communication. Our software project, the implementation of an ESB, tries to fill a gap in the market of such products. The main focus in its development was put on the usage of well-known, well-tested, already existing products and technologies available on the market, such as Java, a Java Enterprise Edition (Java EE) application server performing the role of a container (hosting platform) for an Enterprise Java Bean (EJB) application, Java Message Service (JMS) and XML. We believe that those technologies, combined with a carefully considered architecture, will enable our product to be a reliable, efficient, secure and, most importantly in the era of constantly growing IT systems, scalable integration solution.

The domain of this paper, integration, usually concerns large IT systems, quite often systems that are crucial for the operation of the company; introducing a solution that is not of high quality is therefore not an option. A high quality solution is one that is reliable, secure, easily scalable, distributed and has high performance. Only such solutions can serve as an answer to the integration problem. For the reasons mentioned above, there is no place for poorly tested, buggy, unreliable software products. This level of quality, as is well known in IT, is very difficult to achieve. To help with this challenging task, the integration design patterns might come in handy.

As has already been mentioned in the introduction to this thesis, due to the time and resource limitations our effort has been directed so that the created application would fit into an empty niche.
This niche creates a space for a lightweight solution suitable for solving the most common integration problems. For obvious reasons, it was not possible to create, in such a short time, fully functional software that would provide functionality comparable to the solutions delivered by large software vendors. Nevertheless, aiming for this niche made it possible to create fully functional software and, at the same time, to gain knowledge about the topic of application integration.

The main assumptions that have been made for the application have been outlined in the first chapter of this paper. Now that it is finished and working, we can say that those assumptions have been fulfilled. Ease of use and simplicity have been our main goals and they have been achieved. We not only managed to describe the topic that we decided to undertake from the theoretical point of view, but we also used this theoretical knowledge while creating our application. We also managed to apply the design patterns presented in this paper in a practical way, in order to obtain the desired results.

Due to the shortage of time our application lacks a graphical user interface that would simplify its usage. Creating such a user interface would significantly speed up the process of designing an integration solution. This feature might be one of the possible directions for the further development of this application. Another is to enrich its functionality by adding a larger number of design patterns for the user to choose from. While adding those new patterns, the overall simplicity of the solution should be kept in mind, as it is one of the most important features of this application and must not be lost during its further development.

The application that has been created during our work can be seen as a base system, which can be further extended in various ways to improve its functionality and ease of use. Sample ways in which it can be extended have already been pointed out in the previous paragraph. Those are of course only the extensions that in our opinion would contribute most significantly to the development of our system; other ways of developing our application are also possible. By gradually developing it and adding new features, a very powerful integration tool can be created, which can be used to solve not only the basic integration problems but also more sophisticated ones.
Of course, it will not be able to compete with solutions provided by the large IT companies, but over time it can become an interesting alternative to complicated and expensive solutions.

Our work showed that the knowledge of integration design patterns is essential, not only for the designers and developers of integration solutions, but also for the people involved in creating IT products. Nowadays, even the simplest applications, created to simplify everyday routines, may one day be integrated into a large computer software infrastructure. Awareness of the problems covered in this thesis makes it possible to design and implement solutions flexible enough to one day become a part of some other, larger system.

Bibliography

[1] Christopher Alexander. A Pattern Language: Towns, Buildings, Construction. Oxford University Press, 1977. [cited at p. 43]
[2] Christoph Bussler. B2B Integration. Springer, 2003. [cited at p. 20]
[3] David Chappell. Enterprise Service Bus. O’Reilly Media, Inc., 2004. [cited at p. 53, 59, 60, 61, 62]
[4] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional, 1995. [cited at p. 43]

[5] Thomas Erl. Service-Oriented Architecture (SOA): Concepts, Technology, and Design. Prentice Hall PTR, 2005. [cited at p. 21]
[6] Martin Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley Professional, 2002. [cited at p. 11]
[7] Gregor Hohpe, Bobby Woolf. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley Professional, 2003. [cited at p. 7, 17, 23, 24, 26, 27, 30, 32, 33, 35, 36, 38, 39, 44, 45, 46, 47, 48, 49, 50]

[8] William Grosso. Java RMI. O’Reilly Media, Inc., 2001. [cited at p. 11, 26]
[9] Stany Blanvalet, Jeremy Bolie, Michael Cardella, Matjaz Juric. BPEL Cookbook: Best Practices for SOA-based integration and composite applications development. Packt Publishing, 2006. [cited at p. 52, 57]
[10] Jim Waldo, Geoff Wyant, Ann Wollrath, Sam Kendall. A Note on Distributed Computing. Sun Microsystems Laboratories, Inc., 1994. [cited at p. 27]
[11] Doug Kaye. Loosely Coupled: The Missing Pieces of Web Services. RDS Press, 2003. [cited at p. 9]
[12] Martin Keen, Amit Acharya, Susan Bishop, Alan Hopkins, Sven Milinski, Chris Nott, Rick Robinson, Jonathan Adams, Paul Verschueren. Patterns: Implementing an SOA Using an Enterprise Service Bus. IBM Corp., 2004. [cited at p. 55]
[13] Ken Vollmer, Mike Gilpin. The Forrester Wave: Enterprise Service Bus, Q4 2005. Forrester Research, Inc., 2005. [cited at p. 75]


[14] Roy W. Schulte. Predicts 2003: Enterprise Service Buses Emerge. Gartner, Inc., 2002. [cited at p. 51]
[15] Scott McLean, James Naftel, Kim Williams. Microsoft .NET Remoting. Microsoft Press, 2002. [cited at p. 11]
[16] Wing Lam, Venky Shankararaman. Enterprise Architecture and Integration: Methods, Implementation and Technologies. IGI Global, 2007. [cited at p. 7]

Appendices


List of Figures

2.1 File Transfer style
2.2 Shared Database style
2.3 Remote Procedure Invocation style
2.4 Messaging style
3.1 Design Pattern Message
3.2 Design Pattern Channel
3.3 Design Pattern Router
3.4 Design Pattern Translator
3.5 Design Pattern Endpoint
3.6 Overview of a communication based on message design patterns
4.1 Design Pattern Content-Based Router
4.2 Design Pattern Message Filter
4.3 Design Pattern Dynamic Router
4.4 Design Pattern Recipients List
4.5 Design Pattern Splitter
4.6 Design Pattern Aggregator
4.7 Design Pattern Routing Slip
4.8 Design Pattern Process Manager
4.9 Design Pattern Message Broker
4.10 Design Pattern Envelope Wrapper
4.11 Design Pattern Content Enricher
4.12 Design Pattern Content Filter
4.13 Design Pattern Normalizer
5.1 Overview of a communication based on RPC and MOM
5.2 Tightly coupled system architecture compared to ESB
5.3 Conceptual overview of an Enterprise Service Bus components
5.4 ESB Design Pattern VETO
5.5 Comparison of a typical transformer and Two-step XRef
6.1 Case study systems
6.2 Case study processing sequence
6.3 Case study: Message Channel scenario
6.4 Case study: Message Channel with Router and Translator scenario
6.5 Case study: Enterprise Service Bus scenario
7.1 Integration architecture layers
7.2 pESB architecture
7.3 pESB communication architecture
7.4 pESB processing sequence
7.5 pESB configuration model
7.6 pESB configuration example

List of Tables

2.1 Trade-offs between synchronous and asynchronous model
3.1 Levels of data transformation

Index
Aggregator, 45
asynchronous communication, 20
big-endian system, 13
command message, 30
Content Enricher, 48
Content Filter, 49
Content-Based Router, 35, 44
Context-Based Router, 35
COTS, 7
data timeliness issue, 20
Datatype Channels, 33
Dead Letter Channel, 33
Dead Message Queue, 33
design pattern, 43
document message, 30
Dynamic Router, 44
Enterprise Service Bus, 51
Envelope Wrapper, 48
event message, 30
file transfer integration style, 22
Fixed Router, 34
Forward Cache Integration pattern, 62
integration style: Remote Procedure Invocation, 26
Invalid Message Channel, 33
legacy application, 7
loose coupling, 9
message body, 30
Message Broker, 47
message channel, 32
Message Endpoint, 38
Message Filter, 44
message header, 30
Message Oriented Middleware, 52
message properties, 30
Message Router, 34
message sequence, 30
Message Transformer, 36
Message Translator, 36
messaging integration style, 27
Normalizer, 49
Point-to-Point Channel, 33
Process Manager, 47
Publish-Subscribe Channel, 33
Recipients List, 45
response-reply message, 30
Routing Slip, 46
shared database integration style, 24
single point of failure, 25
small-endian system, 13
Splitter, 45
synchronous communication, 20
tight coupling, 10
Transaction Management, 56
Two-step XRef pattern, 61
VETO pattern, 59
VETOR pattern, 60
