Integrating Applications with the Cloud on Windows Azure
Foreword(s)
Preface
Chapter 1: "Introduction"
Chapter 2: "The Challenges of Cloud Integration"
Chapter 3: "The Trey Research Scenario"
Chapter 4: "Deploying Functionality and Data to the Cloud"
Chapter 5: "Authenticating Users and Authorizing Requests"
Chapter 6: "Implementing Cross-boundary Communication"
Chapter 7: "Implementing Business Logic and Message Routing"
Chapter 8: "Replicating, Distributing, and Synchronizing Data"
Chapter 9: "Maximizing Scalability, Availability, and Performance"
Chapter 10: "Monitoring and Managing Hybrid Applications"

This is a preview of a forthcoming guide that is still under development. We would appreciate your feedback and comments, including technical feedback and suggestions on how we might improve the guide. Please note that some sections of the guide are not yet complete. Feedback and comments can be sent to

Modern computing frameworks and technologies such as the Microsoft .NET Framework, ASP.NET, Windows Communication Foundation, and Windows Identity Foundation make building enterprise applications easier than ever before. In addition, the opportunity to build applications that you deploy to, and run on, a cloud platform such as Windows Azure can reduce up-front infrastructure costs, as well as ongoing management and maintenance requirements. However, most applications are not simple; they may consist of many separate features that are implemented as services, components, third-party plug-ins, and other systems or resources. Integrating these separate features when all of the components are hosted locally in your data center is not a trivial task, and it can become even more of a challenge when you move your application to a cloud-based runtime environment. For example, a typical application may use web and worker roles running in Windows Azure, store its data in SQL Azure, and connect to third-party services for tasks such as authenticating users and arranging deliveries. However, it is not uncommon for an application to also make use of services exposed by partner organizations, or services and components that reside inside the corporate network and, for a variety of reasons, cannot be migrated to the cloud.
Applications such as this are often referred to as hybrid applications, and the issues you encounter when building them, or when migrating parts of existing on-premises applications to the cloud, prompt questions such as "How can I integrate the various parts across network boundaries and domains so that all of the parts can work together to implement the complete application?" and "How do I maximize performance and availability when some parts of the application are located in the cloud?"

This guide, "Integrating Applications with the Cloud on Windows Azure", focuses on the common issues you will encounter when building applications that run partly in the cloud and partly on-premises, or when you decide to migrate all or parts of an existing on-premises application to the cloud. It concentrates on Windows Azure and SQL Azure, and shows how you can take advantage of the many features of the platform to simplify and speed up development of these kinds of applications.

Windows Azure contains a set of infrastructure components called the Windows Azure .NET Libraries, which implement a range of services specifically designed to help you build hybrid applications. These services, such as Service Bus, Caching, Traffic Manager, and Azure Connect, are the main topic of this guide. The guide demonstrates the scenarios where these services are useful, and shows how you can apply them in your own applications.

This guide is based on the experiences of a fictitious corporation named Trey Research, which evolved its existing on-premises application to take advantage of Windows Azure. The guide does not cover the individual migration tasks, but instead focuses on the way that Trey Research uses the services exposed by the Windows Azure .NET Libraries, and other Windows Azure and SQL Azure services, to manage interoperability, process control, performance, management, data synchronization, and security.

Who This Book Is For
This book is the third volume in a series of guides about Windows Azure. Volume 1, Moving Applications to the Cloud on Windows Azure, provides an introduction to Windows Azure, discusses the cost model and application life cycle management for cloud-based applications, and describes how to migrate an existing ASP.NET application to the cloud. Volume 2, Developing Applications for the Cloud on Windows Azure, discusses the design considerations and implementation details of applications that are designed from the beginning to run in the cloud. It also extends many of the areas covered in Volume 1 to provide information about more advanced techniques that you can apply in Windows Azure applications. This third volume in the series demonstrates how you can use the powerful infrastructure services that are part of Windows Azure to simplify development effort; integrate the component parts and services of a hybrid application across cloud, on-premises, and third-party locations; and maximize security, performance, scalability, and availability.

This guide is intended for any architect, developer, or information technology (IT) professional who designs, builds, or operates applications and services that run on or interact with the cloud. Although applications do not need to be based on the Microsoft® Windows® operating system to work in Windows Azure, this book is written for people who work with Windows-based systems. You should be familiar with the Microsoft .NET Framework, the Microsoft Visual Studio® development system, ASP.NET MVC, and the Microsoft Visual C#® development language.

Why This Book Is Pertinent Now
Software designers, developers, project managers, and administrators are increasingly recognizing the benefits of locating IT services in the cloud to reduce infrastructure and ongoing data center runtime costs, maximize availability, simplify management, and take advantage of a predictable pricing model. However, it is common for an application to contain some components or features that cannot be located in the cloud, such as third-party services or sensitive data that must be maintained onsite under specialist control. Applications such as this require additional design and development effort to manage the complexities of communication and integration between components and services. To prevent these complexities from impeding the move to cloud hosting, Windows Azure is adding a range of framework services that help to integrate the cloud and on-premises application components and services. This guide explains how these services can be applied to typical scenarios, and how to use them in applications you are building or migrating now.

How This Book Is Structured
This is the road map of the guide.

Chapter 1, "Introduction" provides an introduction to hybrid applications, followed by an overview of Windows Azure with links that will help you to find more information about the technologies and development methodologies for building cloud-based applications on Windows Azure.

Chapter 2, "The Challenges of Cloud Integration" provides background information on integrating applications with the cloud, where some parts of the application may be hosted in the cloud while others remain on-premises in your own datacenter, or are exposed by partner organizations. The chapter discusses the challenges that may arise in this type of scenario, which the remainder of the guide covers in more detail.

Chapter 3, "The Trey Research Scenario" describes a fictitious company named Trey Research that has faced the challenges of integration by moving parts of its online sales application to the cloud. The techniques that Trey Research adopted are described in more detail throughout the remainder of the guide.

Chapter 4, "Deploying Functionality and Data to the Cloud" describes how you can evolve your application by moving certain parts to the cloud to take advantage of the elasticity, scalability, reliability, and cost savings that cloud platforms can provide.

Chapter 5, "Authenticating Users and Authorizing Requests" is a brief chapter that discusses the challenges of authenticating users in the cloud. It includes references to other guides in this series that cover claims-based authentication and authorization in depth.

Chapter 6, "Implementing Cross-boundary Communication" describes how you can securely access services and applications across the on-premises/cloud boundary without needing to open firewall ports or configure address translation, and how you can send messages across the on-premises/cloud boundary using a mechanism that is reliable and provides opportunities for load leveling and disconnected scenarios.

Chapter 7, "Implementing Business Logic and Message Routing" describes how you can implement logic that directs messages to the appropriate recipients, and decouple subscribers in applications that are located partly on-premises and partly in the cloud. You should read Chapter 6 before moving on to this one, as it contains information that this chapter assumes you are familiar with.

Chapter 8, "Replicating, Distributing, and Synchronizing Data" describes how you can synchronize data across instances and across the on-premises/cloud boundary so that you can take advantage of cloud-hosted data services such as SQL Azure, and still be able to extract useful reporting and management information from your data.

Chapter 9, "Maximizing Scalability, Availability, and Performance" describes how you can maximize performance in applications that are located partly on-premises and partly in the cloud by providing multiple instances closer to customers, and by caching data that must cross the on-premises/cloud boundary.

Chapter 10, "Monitoring and Managing Hybrid Applications" describes some of the techniques you can apply for monitoring and managing your hybrid application, such as capturing diagnostic information and managing configuration remotely.

The information in this guide about Windows Azure, SQL Azure, and the services they expose is up to date at the time of writing. However, Windows Azure is constantly evolving and adding new capabilities and features. For the latest information, see "What's New in Windows Azure" and the Windows Azure home page.

What You Need to Use the Code
These are the system requirements for running the scenarios:

- Microsoft Windows 7 with Service Pack 1, or Windows Server 2008 R2 with Service Pack 1 (32-bit or 64-bit editions)
- Microsoft Visual Studio 2010 Ultimate, Premium, or Professional edition with Service Pack 1 installed
- Windows Azure SDK for .NET
- Microsoft SQL Server 2008 or SQL Server Express 2008
- Microsoft Internet Information Server (IIS) 7.0
- Microsoft .NET Framework version 4.0
- Microsoft ASP.NET MVC Framework version 3
- Windows Identity Foundation
- Microsoft Enterprise Library 5.0 (required assemblies are included in the source code download)
- Windows Azure Cmdlets (install the Windows Azure Cmdlets as a PowerShell snap-in; this is required for scripts that use the Azure Management API)
- Sample database (scripts are included in the Database folder of the source code)

The sample code contains a dependency checker utility you can use to check for prerequisites and install any that are required. The dependency checker will install the sample databases.

Who's Who

This book uses a sample application that illustrates integrating applications with the cloud. A panel of experts comments on the development efforts. The panel includes a cloud specialist, a software architect, a software developer, and an IT professional. The delivery of the sample application can be considered from each of these points of view. The following table lists these experts.

Bharath is a cloud specialist. He checks that a cloud-based solution will work for a company and provide tangible benefits. He is a cautious person, for good reasons. "Implementing a single-tenant application for the cloud is easy. Realizing the benefits that a cloud-based solution can offer to multi-tenant applications is not always so straight-forward."

Jana is a software architect. She plans the overall structure of an application. Her perspective is both practical and strategic. In other words, she considers the technical approaches that are needed today and the direction a company needs to consider for the future. "It's not easy to balance the needs of the company, the users, the IT organization, the developers, and the technical platforms we rely on."

Markus is a senior software developer. He is analytical, detail-oriented, and methodical. He's focused on the task at hand, which is building a great cloud-based application. He knows that he's the person who's ultimately responsible for the code. "For the most part, a lot of what we know about software development can be applied to the cloud. But, there are always special considerations that are very important."

Poe is an IT professional who's an expert in deploying and running applications in the cloud. Poe has a keen interest in practical solutions; after all, he's the one who gets paged at 03:00 when there's a problem. "Running applications in the cloud that are accessed by thousands of users involves some big challenges. I want to make sure our cloud apps perform well, are reliable, and are secure. The reputation of Tailspin depends on how users perceive the applications running in the cloud."

If you have a particular area of interest, look for notes provided by the specialists whose interests align with yours.

Acknowledgements

The IT industry has been evolving continuously, and with the advent of cloud computing, the rate of evolution is accelerating significantly. General acceptance and use of cloud computing by organizations has also been evolving. Back in January 2010, when we started work on the first guide in this series, most of the people I talked to were interested in the cloud, but weren't actually working on real projects. This is no longer the case. Two years later, as we write this guide, I'm often impressed by the amount of knowledge and experience that customers have gained. There's no doubt in my mind that industry as a whole is heading for the cloud. However, transition to the cloud is not going to happen overnight. Most organizations still have a lot of IT assets running in on-premises datacenters. These will eventually be migrated to the cloud, but a shift to the next paradigm always takes time. At the moment we are in the middle of a transition between running everything on-premises and hosting everything in the cloud. "Hybrid" is a term that represents an application that positions its architecture somewhere along this continuum. In other words, hybrid applications are those that span the on-premises and cloud divide, and which bring with them a unique set of challenges that must be addressed. It is to address these challenges that my team and I have worked hard to provide you with this guide.

In 2010, Windows Azure offered only a basic set of features such as compute, storage, and database. Now, as we write this guide, we have available many more advanced features that are useful in a variety of scenarios, such as Service Bus, Caching, Traffic Manager, Azure Connect, VM Role, ACS, SQL Azure Data Sync, and more. Windows Azure is evolving, and will continue to evolve at a rapid pace. The goal of this guide is to map Windows Azure features with the specific challenges encountered in the hybrid application scenario. Our guide uses a case study of a fictitious organization to explain the challenges that you may encounter in a hybrid application, and describes solutions using the features of Windows Azure that help you to integrate on-premises and the cloud.

As we worked with the Windows Azure integration features, we often needed to clarify and validate our guidelines for using them. We were very fortunate to have the full support of product groups and other divisions within Microsoft. First and foremost, I want to thank the following subject matter experts: Clemens Vasters, Kashif Alam, Mark Scurrel, Tina Stewart, and Corey Sanders. We relied on their knowledge and expertise in their respective technology areas to shape this guide. The following people were also instrumental in providing technical expertise during the development of this guide: Steve Marx, Jason Chen, Arun Rajappa, Hanz Zhang, and Matias Woloski. We relied on their expertise to validate the scenario as well as shaping the solution architecture. Many of the suggestions raised by these reviewers, and the insightful feedback they provided, have been incorporated into this guide.

I also want to extend my thanks to the project team, including Eugenio Pace and Enrique Saggese. Scott Densmore, Vijaya Alaparthi, Alejandro Jezierski (Southworks), Jorge Rowies (Southworks), Ravindra Mahendravarman (Infosys Ltd.), and Ravindran Paramasivam (Infosys Ltd.) served as the development and test team. By applying their expertise with Windows Azure, they developed the sample code. As the technical writers, John Sharp (Content Master) and Alex Homer brought to the project considerable writing skill, expertise in software engineering, and many hours of patient effort.

Many thanks also go out to the community at our CodePlex website. I'm always grateful for the feedback we receive from this very diverse group of readers. I also want to thank RoAnn Corbisier and Tina Burden for helping us publish this guide.

Masashi Narumoto
Senior Program Manager – patterns & practices
Microsoft Corporation
Redmond, January 2012

1 Introduction

This guide focuses on the ways that you can use the services exposed by Windows Azure, and some other useful frameworks and components, to help you integrate applications with Windows Azure and to build hybrid applications. A hybrid application is one that uses a range of components, resources, and services that may be separated across datacenter, organizational, network, or trust boundaries. Some of these components, resources, and services may be hosted in the cloud, though this is not mandatory. The services exposed by the Windows Azure .NET Libraries are useful both for integrating on-premises applications with the cloud and in applications that run entirely in the cloud. You can also use many of the services exposed by the Windows Azure .NET Libraries when the entire application is located within Windows Azure and has no on-premises components. However, in this guide, we will be focusing on applications that do have components running in Windows Azure.

About Cloud and On-Premises Integration

Hybrid applications make use of resources and services that are located in different physical or virtual locations, such as on-premises, hosted by partner organizations, or hosted in the cloud. Applications that run across the cloud and on-premises boundary may use web and worker roles running in one or more Windows Azure data centers, SQL Azure databases in the same or different data centers, third-party remote services built using Windows or non-Windows technologies, and on-premises resources such as databases and web services. Integrating and communicating between these resources and services is not a trivial task, especially when there are firewalls and routers between them. Often these types of applications must be able to implement the appropriate process control and workflow capabilities so that they can make appropriate use of the distributed local and remote resources and services. In addition, applications should be designed and deployed in such a way as to be scalable to meet varying loads, robust so that they are available at all times, secure so that you have full control over access by customers and by administrators, and easy to manage and monitor.

To help you meet these challenges, Windows Azure provides a range of services as part of the Windows Azure .NET Libraries. This is a comprehensive package of cloud-based services that includes management and development tools to make it easier to build all kinds of integrated and hybrid applications.

Scenarios for this Guide

This guide explores a series of scenarios and challenges that are common in all applications that run partly on-premises and partly in the cloud. It describes how you can apply the services exposed by the Windows Azure .NET Libraries to build comprehensive solutions that work well in these scenarios, as well as providing good practice advice in applying that solution. This approach allows you to gauge whether a specific service or solution meets the requirements of your own scenario. The guide also demonstrates some other useful techniques and frameworks, including some of the features of the Microsoft Enterprise Library 5.0 Integration Pack for Windows Azure.

Before we look in detail at these scenarios and the challenges you may face when designing and building hybrid applications, the remaining sections of this chapter provide more background information about Windows Azure, and provide an overall description of the Trey Research "Orders" application that is used as a focus for this guide. If you are already familiar with developing Windows Azure applications, you may wish to skip the remainder of this chapter and move on to the next chapters that describe the techniques for application integration.

About Windows Azure

Developers can use the cloud to deploy and run applications and to store data. On-premises applications can still use cloud-based resources. For example, an application located on an on-premises server, a rich client that runs on a desktop computer, or one that runs on a mobile device can use storage that is located in the cloud. Windows Azure can help you achieve portability and scalability for your applications, and reduce your running costs and TCO. At the time of this writing, Windows Azure includes three main components: Windows Azure, SQL Azure, and the Windows Azure .NET Libraries.

Windows Azure abstracts hardware resources through virtualization. Each application that is deployed to Windows Azure runs on one or more Virtual Machines (VMs). These deployed applications behave as though they were on a dedicated computer, although they might share physical resources such as disk space, network I/O, or CPU cores with other VMs on the same physical host. A key benefit of an abstraction layer above the physical hardware is portability and scalability. Virtualizing a service allows it to be moved to any number of physical hosts in the data center. By combining virtualization technologies, commodity hardware, multi-tenancy, and aggregation of demand, Microsoft can achieve economies of scale. These generate higher data center utilization (that is, more useful work-per-dollar hardware cost) and, subsequently, savings that are passed along to you.

Virtualization also allows you to have both vertical scalability and horizontal scalability. Vertical scalability means that, as demand increases, you can increase the number of resources, such as CPU cores or memory, on a specific VM. Horizontal scalability means that you can add more instances of VMs that are copies of existing services. All these instances are load balanced at the network level so that incoming requests are distributed among them.
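The difference between the two scaling modes, and the effect of network-level load balancing, can be illustrated with a short sketch. This is plain Python rather than any Windows Azure API; the instance names and request count are invented for illustration:

```python
from collections import Counter
from itertools import cycle

# Horizontal scaling: several identical VM instances sit behind a load
# balancer that distributes incoming requests among them.
instances = ["WebRole_IN_0", "WebRole_IN_1", "WebRole_IN_2"]

# A simple round-robin balancer: each request goes to the next instance in turn.
balancer = cycle(instances)
requests_handled = Counter(next(balancer) for _ in range(90))

# With 90 requests and 3 instances, each instance handles 30 requests.
print(dict(requests_handled))
```

Vertical scaling, by contrast, would keep a single instance and enlarge its VM (more CPU cores or memory); horizontal scaling adds more copies of the same role, which is the model the network-level load balancer described above supports.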

Windows Azure provides a Microsoft Windows® Server-based computing environment for applications and persistent storage for both structured and unstructured data. SQL Azure is essentially SQL Server® provided as a service in the cloud. The Windows Azure .NET Libraries provide a range of services that help you to connect users and on-premises applications to cloud-hosted applications, manage authentication, and implement data management and related features such as caching, as well as asynchronous messaging. In addition, Windows Azure includes a range of services that can simplify development, increase reliability, and make it easier to manage your cloud-hosted applications. For more information about all of the Windows Azure services and features, see the "Windows Azure Tour" and the Windows Azure home page.

Windows Azure also includes a range of management services that allow you to control all these resources, either through a web-based user interface (a web portal) or programmatically. In most cases there is a REST-based API that can be used to define how your services will work. Most management tasks that can be performed through the web portal can also be achieved using the API. In addition, there is a comprehensive set of tools and software development kits (SDKs) that allow you to develop, test, and deploy your applications. For example, you can develop and test your applications in a simulated local environment, provided by the Compute Emulator and the Storage Emulator. Most tools are also integrated into development environments such as Microsoft Visual Studio®. Finally, there are third-party management tools available.

Windows Azure Services and Features

The range of services and features available with Windows Azure, SQL Azure, and the Windows Azure .NET Libraries targets specific requirements for your applications. When you subscribe to Windows Azure you can choose which of the features you require, and you pay only for the features you use. You can add and remove features from your subscription whenever you wish. The billing mechanism for each service depends on the type of features the service provides. For more information on the billing model, see "Windows Azure Subscription and Billing Model" later in this chapter.

This series of guides, the accompanying sample applications, and the associated Hands-on-Labs demonstrate many of the features and services available in Windows Azure and SQL Azure. The services and features available change as Windows Azure continues to evolve. The following four sections of this chapter briefly describe the main services and features available at the time of writing, subdivided into the categories of Execution Environment, Data Management, Networking Services, and Other Services. For specific development and usage guidance on each feature or service, see the resources referenced in the following sections.

The information in this guide about Windows Azure, SQL Azure, and the services they expose is up to date at the time of writing. However, Windows Azure is constantly evolving and adding new capabilities and features. For the latest information, see "What's New in Windows Azure" and the Windows Azure home page.

To use any of these features and services you must have a subscription to Windows Azure. A valid Windows Live ID is required when signing up for a Windows Azure subscription. For more information, see "Windows Azure Purchase Options" and the Microsoft Windows Azure home page. Some of the features and services listed here were still pre-release or beta versions at the time of writing.

Execution Environment

The Windows Azure execution environment provides a platform for applications and services hosted within one or more roles. The types of roles you can implement in Windows Azure are:

- Azure Compute (Web and Worker Roles). A Windows Azure application consists of one or more hosted roles running within the Azure data centers. Typically there will be at least one Web role that is exposed for access by users of the application. The application may contain additional roles, including Worker roles that are typically used to perform background processing and support tasks for Web roles. For more detailed information see "Overview of Creating a Hosted Service for Windows Azure" and "Building an Application that Runs in a Hosted Service".

- Virtual Machine (VM role). This role allows you to host your own custom instance of the Windows Server 2008 R2 Enterprise or Windows Server 2008 R2 Standard operating system within a Windows Azure data center. For more detailed information see "Creating Applications by Using a VM Role in Windows Azure".

Most of the examples in this and the associated guides "Moving Applications to the Cloud" and "Developing Applications for the Cloud" (see http://wag.codeplex.com), and the examples in the Hands-on-Labs, use a Web role to perform the required processing. The use of a Worker role is also described and demonstrated in many places throughout the guides and examples. This includes Chapter 4 of this guide, Chapter 6 of the guide "Moving Applications to the Cloud", Chapter 5 of the guide "Developing Applications for the Cloud" and the associated sample application, and Labs 3 and 4 in the Hands-on-Labs for that guide. Chapter 4 of this guide discusses the use of the Windows Azure VM Role.

Data Management

Windows Azure, SQL Azure, and the associated services provide opportunities for storing and managing data in a range of ways. The following data management services and features are available:

- Azure Storage. This provides four core services for persistent and durable data storage in the cloud. The services support a REST interface that can be accessed from within Azure-hosted or on-premises (remote) applications. For information about the REST API, see "Windows Azure Storage Services REST API Reference". The four storage services are:

  - The Binary Large Object (BLOB) Service provides a series of containers aimed at storing text or binary data. It provides both Block BLOB containers for streaming data and Page BLOB containers for random read/write operations. For more detailed information see "Understanding Block Blobs and Page Blobs" and "Blob Service REST API".

  - The Table Service provides a table-structured storage mechanism based on the familiar rows and columns format, and supports queries for managing the data. It is primarily aimed at scenarios where large volumes of data must be stored, while being easy to access and update. For more detailed information see "Table Service Concepts" and "Table Service REST API".

  - The Queue Service provides a mechanism for persistent messaging between role instances, such as between a Web role and a Worker role. For more detailed information see "Queue Service Concepts" and "Queue Service REST API".

  - Windows Azure Drives provide a mechanism for applications to mount a single-volume NTFS VHD as a Page BLOB, and upload and download VHDs via the BLOB service. For more detailed information see "Windows Azure Drive" (PDF).

- SQL Azure Database. This is a highly available and scalable cloud database service built on SQL Server technologies, and supports the familiar T-SQL based relational database model. It can be used with applications hosted in Windows Azure, and with other applications running on-premises or hosted elsewhere. It supports federation capabilities to make multi-tenant data storage and distributed storage easier to implement and manage. For more detailed information see "SQL Azure Database".

- Data Synchronization. SQL Azure Data Sync is a cloud-based data synchronization service built on Microsoft Sync Framework technologies. It provides bi-directional data synchronization and data management capabilities, allowing data to be easily shared between multiple SQL Azure databases and between on-premises and SQL Azure databases. For more detailed information see "Microsoft Sync Framework Developer Center".

- Caching. This service provides a distributed, in-memory, low latency and high throughput application cache service that requires no installation or management, and dynamically increases and decreases the cache size automatically as required. It can be used to cache application data, ASP.NET session state information, and ASP.NET page output. For more detailed information see "Windows Azure Caching Service".

Chapter 4 of this guide describes how you can deploy data services to the cloud, including using SQL Azure, and Chapter 8 provides guidance on replicating, distributing, and synchronizing data. Chapter 9 discusses the use of the Windows Azure Caching service. Chapters 3 and 5 of the guide "Developing Applications for the Cloud" (see http://wag.codeplex.com) explain the concepts and implementation of multi-tenant architectures for data storage.
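A common use of the Queue Service is to decouple a Web role from a Worker role: the Web role enqueues a work item and returns quickly, and the Worker role's run loop dequeues and processes it later. The following sketch models that hand-off with an in-memory queue. It is plain Python, not the Azure storage API, and the message format and role functions are invented for illustration:

```python
from collections import deque

work_queue = deque()  # stand-in for a Windows Azure queue

def web_role_handle_request(order_id):
    # The Web role does minimal synchronous work, then hands off to the queue.
    work_queue.append({"task": "process_order", "order_id": order_id})
    return "202 Accepted"

def worker_role_run_once():
    # A real Worker role's Run() method loops forever; this is a single pass.
    if not work_queue:
        return "idle"
    item = work_queue.popleft()
    return "processed order {0}".format(item["order_id"])

print(web_role_handle_request(42))  # 202 Accepted
print(worker_role_run_once())       # processed order 42
```

Because the queue persists messages independently of either role, this pattern also provides load leveling: bursts of requests accumulate in the queue, and the Worker role drains them at its own pace.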

Chapter 5 of the guide "Developing Applications for the Cloud" and the associated sample application describe the use of Queue storage, and the use of Table storage (including data paging) and BLOB storage in multi-tenant applications. Chapter 6 of the guide "Developing Applications for the Cloud", the associated example applications, and Lab 4 in the Hands-on-Labs for that guide, demonstrate use of the Caching service. Chapter 5 of the guide "Moving Applications to the Cloud" and the associated sample application, and Lab 5 in the Hands-on-Labs for that guide, also show how you can use Table storage. Chapter 6 of the guide "Moving Applications to the Cloud" and the associated sample application, and Labs 3 and 4 in the Hands-on-Labs for that guide, also show how you can use BLOB and Queue storage. Chapter 2 of the guide "Moving Applications to the Cloud" (see http://wag.codeplex.com), the associated example application, and Lab 2 in the Hands-on-Labs for that guide, use SQL Azure for data storage.

Networking Services

Windows Azure provides several networking services that you can take advantage of to maximize performance and improve manageability of your hosted applications. These services include the following:

 Content Delivery Network (CDN). The CDN allows you to cache publicly available static data for applications at strategic locations that are closer (in network delivery terms) to end users. The CDN uses a number of data centers at many locations around the world, which store the data in BLOB storage that has anonymous access. These do not need to be locations where the application is actually running. For more detailed information see "Delivering High-Bandwidth Content with the Windows Azure CDN" on MSDN.
 Traffic Manager. This is a service that allows you to set up request redirection and load balancing based on three different methods. Typically you will use Traffic Manager to maximize performance by redirecting requests from users to the instance in the closest data center using the Performance method. Alternative load balancing methods available are Failover and Round Robin. For more detailed information see "Windows Azure Traffic Manager" on MSDN.
 Azure Connect. This service allows you to configure roles of an application running in Windows Azure and computers on your on-premises network so that they appear to be on the same network. It uses a software agent running on the on-premises computer to establish an IPsec-protected connection to the Windows Azure roles in the cloud, and provides the capability to administer, monitor, manage, and debug the roles directly. For more detailed information see "Connecting Local Computers to Windows Azure Roles" on MSDN.
 Access Control. This is a standards-based service for identity and access control that makes use of a range of identity providers (IdPs) that can authenticate users. ACS acts as a Security Token Service (STS), or token issuer, and makes it easier to take advantage of federation authentication techniques where user identity is validated in a realm or domain other than that in which the application resides. An example is controlling user access based on an identity verified by an identity provider such as Windows Live ID or Google. For more detailed information see "Access Control Service 2.0" on MSDN and the "Claims Based Identity & Access Control Guide".
 Service Bus. This provides a secure messaging and data flow capability for distributed and hybrid applications, such as communication between Windows Azure hosted applications and on-premises applications and services, without requiring complex firewall and security infrastructures. It can use a range of communication and messaging protocols and patterns to provide delivery assurance and reliable messaging; can scale to accommodate varying loads; and can be integrated with on-premises BizTalk Server artifacts. For more detailed information see "Service Bus" on MSDN.
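To make the three Traffic Manager load-balancing methods concrete, the following sketch simulates how each one might choose a datacenter for a request. This is an illustration only, not Traffic Manager code; the datacenter names, latencies, and health states are hypothetical examples:

```python
# Illustrative simulation of Traffic Manager's three methods:
# Performance (closest datacenter), Failover (priority order),
# and Round Robin (rotate through datacenters).
import itertools

def choose_performance(latencies):
    """Performance method: pick the datacenter with the lowest latency."""
    return min(latencies, key=latencies.get)

def choose_failover(ordered, healthy):
    """Failover method: pick the first healthy datacenter in priority order."""
    for dc in ordered:
        if healthy.get(dc):
            return dc
    return None

def make_round_robin(datacenters):
    """Round Robin method: return a chooser that rotates through datacenters."""
    cycle = itertools.cycle(datacenters)
    return lambda: next(cycle)

if __name__ == "__main__":
    latencies = {"North Europe": 120, "East US": 45, "East Asia": 210}
    print(choose_performance(latencies))  # the lowest-latency datacenter
    print(choose_failover(["East US", "North Europe"],
                          {"East US": False, "North Europe": True}))
```

The real service makes these decisions at the DNS level, returning the address of the chosen hosted service to the client.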

Chapter 6 of this guide explores connection technologies such as Azure Connect and Service Bus Queues, and Chapter 7 describes some of the ways you can use Service Bus Topics and Rules. Chapter 5 briefly discusses the Access Control service, and Chapter 9 discusses Traffic Manager and the Content Delivery Network. Chapter 3 of the guide "Developing Applications for the Cloud" (see http://wag.codeplex.com) and the associated example application demonstrate how you can use the Content Delivery Network (CDN). Detailed guidance on using ACS can be found in the associated guide "Claims Based Identity & Access Control Guide", and in Lab 3 in the Hands-on-Labs for that guide.

Other Services

Windows Azure provides the following additional services:

 SQL Azure Reporting (Business Intelligence Reporting). This service allows you to develop and deploy business operational reports, generated from data stored in a SQL Azure database, to the cloud. It is built upon the same technologies as SQL Server Reporting Services, and lets you use familiar tools to generate reports. Reports can be easily accessed through the Windows Azure Management Portal, through a web browser, or directly from within your Windows Azure and on-premises applications. For more detailed information see "SQL Azure Reporting" on MSDN.
 Marketplace. This is an online facility where developers can share, find, buy, and sell building block components, training, service templates, premium data sets, and finished services and applications needed to build Windows Azure applications. For more detailed information see "Windows Azure Marketplace Data Market" and "Windows Azure Marketplace" on MSDN.

Chapter 4 of this guide discusses the SQL Azure Reporting Service and the Business Intelligence features, and their use in combination with data from the Data Market section of Windows Azure Marketplace.

Developing Windows Azure Applications
This section describes the development tools and resources that you will find useful when building applications for Windows Azure and SQL Azure. Typically, in Windows Azure, you will use Visual Studio 2010 with the Windows Azure SDK (which includes the Windows Azure Tools for Microsoft Visual Studio). The tools provide everything you need to create Windows Azure applications, including local compute and storage emulators that run on the development computer. This means that you can write, test, and debug applications before deploying them to the cloud. The tools also include features to help you deploy applications to Windows Azure and manage them after deployment.

You can build and test Windows Azure applications using the compute and storage emulators on your development computer. You can download the Windows Azure Tools for Microsoft Visual Studio, and development tools for other platforms and languages such as iOS, Eclipse, Java, and PHP, from the Windows Azure website. The site also offers a useful selection of videos, Quick Start examples, and Hands-On-Labs that cover a range of topics to help you get started building Windows Azure applications. The MSDN "Developing Applications for Windows Azure" section includes specific examples and guidance for creating hosted services, using the Windows Azure SDK for .NET Tools to package and deploy applications, and a useful Quick Start example. The Windows Azure Training Kit, which you can download from the Microsoft Download Center, contains hands-on labs to get you started quickly. To understand how a Windows Azure role operates, and the execution lifecycle, see "Real World: Startup Lifecycle of a Windows Azure Role" on MSDN. For a list of useful resources for developing and deploying databases in SQL Azure, see "Development (SQL Azure Database)" on MSDN.

Managing, Monitoring, and Debugging Windows Azure Applications
This section describes the management tools and resources that you will find useful when deploying, managing, monitoring, and debugging applications in Windows Azure and SQL Azure. All storage and management subsystems in Windows Azure use REST-based interfaces. They are not dependent on any .NET Framework or Microsoft Windows® operating system technology; any technology that can issue HTTP or HTTPS requests can access Windows Azure's facilities. To learn about the Windows Azure managed and native library APIs, and the storage services REST API, see "API References for Windows Azure" on MSDN.

The REST-based service management API can be used as an alternative to the Windows Azure web management portal. The API includes features to work with storage accounts, hosted services, certificates, affinity groups, locations, and subscription information. For more information, see "Windows Azure Service Management REST API Reference" on MSDN.

You can also use the Windows Azure PowerShell Cmdlets to browse, configure, and manage Windows Azure Compute and Storage services directly from PowerShell. For example, you can use these tools to script the deployment and upgrade of Windows Azure applications, change the configuration for a role, and set and manage your diagnostic configuration and diagnostic data. For more information, see "Windows Azure PowerShell Cmdlets".

In addition to these components, Windows Azure provides diagnostics services for activities such as monitoring an application's health. You can use the Windows Azure Management Pack and Operations Manager 2007 R2 to discover Windows Azure applications, get the status of each role instance, collect and monitor performance information, collect and monitor Windows Azure events, and collect and monitor the .NET Framework trace messages from each role instance. You can also use frameworks such as the Enterprise Library Integration Pack for Windows Azure to perform tasks such as automatically scaling deployed instances, collecting logging and auditing information, and implementing reliable connectivity. For more information about the monitoring features of Windows Azure, see "Monitoring Windows Azure Applications" on MSDN. For more information about the Enterprise Library Integration Pack for Windows Azure, see "Microsoft Enterprise Library".
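Because the service management API is plain REST, any HTTP client can call it. The sketch below (an illustration, not SDK code) builds the request for the "List Hosted Services" operation; the subscription ID is a placeholder, the `x-ms-version` value is one of the dated API versions the service accepts, and a real request must also present the subscription's management certificate for authentication:

```python
# Illustrative sketch: constructing (not sending) a Service Management
# API request. Real calls require a management certificate; the
# subscription ID below is a placeholder.

MANAGEMENT_BASE = "https://management.core.windows.net"

def list_hosted_services_request(subscription_id, api_version="2011-10-01"):
    """Build the URL and headers for a 'List Hosted Services' call."""
    url = "{0}/{1}/services/hostedservices".format(MANAGEMENT_BASE,
                                                   subscription_id)
    headers = {"x-ms-version": api_version}
    return url, headers

if __name__ == "__main__":
    url, headers = list_hosted_services_request(
        "00000000-0000-0000-0000-000000000000")
    print(url)
    print(headers["x-ms-version"])
```

The response is an XML document listing the hosted services in the subscription; other operations in the reference follow the same URL and header pattern.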

Windows Azure includes components that allow you to monitor and debug cloud-hosted services. For information about using the Windows Azure built-in trace objects to configure diagnostics and instrumentation without using Operations Manager, and about downloading the results, see "Collecting Logging Data by Using Windows Azure Diagnostics" on MSDN. For information about debugging Windows Azure applications, see "Troubleshooting and Debugging in Windows Azure" and "Debugging Applications in Windows Azure". Chapter 7, "Application Life Cycle Management for Windows Azure Applications," in the guide "Moving Applications to the Cloud" (see http://wag.codeplex.com) contains information about managing Windows Azure applications.

Managing SQL Azure Databases
Applications access SQL Azure databases in exactly the same way as a locally installed SQL Server, using the managed ADO.NET data access classes, native ODBC, PHP, or JDBC data access technologies. SQL Azure databases can be managed through the web portal, SQL Server Management Studio, the Visual Studio 2010 database tools, and a range of other tools for activities such as moving and migrating data, as well as command line tools for deployment and administration. A database manager is also available to make it easier to work with SQL Azure databases. For more information see "The Database Manager for SQL Azure".
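Because access works the same way as for an on-premises SQL Server, the main differences show up in the connection string: the server address is on `database.windows.net`, the login takes the `user@server` form, and connections should be encrypted. The helper below is an illustrative sketch (the server, database, and user names are hypothetical), building an ADO.NET/ODBC-style connection string:

```python
# Illustrative sketch: building a connection string for a SQL Azure
# database. Names and password are hypothetical examples; in real code,
# keep credentials out of source.

def sql_azure_connection_string(server, database, user, password):
    """Build an ADO.NET-style connection string for a SQL Azure database."""
    return (
        "Server=tcp:{0}.database.windows.net,1433;"
        "Database={1};"
        "User ID={2}@{0};"
        "Password={3};"
        "Encrypt=True;"
        "Trusted_Connection=False;"
    ).format(server, database, user, password)

if __name__ == "__main__":
    print(sql_azure_connection_string("myserver", "OrdersDb",
                                      "admin1", "secret"))
```

The same values plug into an ODBC or JDBC connection string with only syntactic changes.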

SQL Azure supports a management API as well as management through the web portal. For information about the SQL Azure management API, see "Management REST API Reference (SQL Azure)" on MSDN.

Upgrading Windows Azure Applications
After you deploy an application to Windows Azure, you will need to update it as you change the role services in response to new requirements or code improvements, or to fix bugs. You can simply redeploy a service by suspending and then deleting it, and then deploying the new version. However, you can avoid application downtime by performing staged deployments (uploading a new package and swapping it with the existing production version), or by performing an in-place upgrade (uploading a new package and applying it to the running instances of the service). For information about how you can perform service upgrades by uploading a new package and swapping it with the existing production version, see "How to Deploy a Service Upgrade to Production by Swapping VIPs in Windows Azure". For information about how you can perform in-place upgrades, including details of how services are deployed into upgrade and fault domains and how this affects your upgrade options, see "How to Perform In-Place Upgrades on a Hosted Service in Windows Azure". If you need only to change the configuration information for a service without deploying new code, you can use the web portal or the management API to edit the service configuration file or to upload a new configuration file.
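As a sketch of such a configuration-only change, increasing the number of running instances of a role means editing only the `Instances` element of the service configuration (.cscfg) file and uploading it; the service, role, and setting names below are hypothetical examples:

```xml
<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="OrdersService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="OrdersWebRole">
    <!-- Changed from 2 to 4 to handle increased load; no code redeployment needed. -->
    <Instances count="4" />
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString"
               value="UseDevelopmentStorage=true" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

Uploading the edited file through the portal or the management API applies the new instance count to the running service.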

Windows Azure Subscription and Billing Model
This section describes the billing model for Windows Azure and SQL Azure subscriptions and usage. To use Windows Azure you first create a billing account, either by signing up for Microsoft Online Services or through the Windows Azure portal. The Microsoft Online Services customer portal manages subscriptions to all Microsoft services. Windows Azure is one of these, but there are others such as Business Productivity Online, Office Live Meeting, and Windows Intune. Every billing account has a single account owner who is identified with a Windows Live® ID. The account owner can create and manage subscriptions, view billing information and usage data, and specify the service administrator for each subscription. A Windows Azure subscription is just one of these subscriptions.

The account owner and the service administrator for a subscription can be (and in many cases should be) different Live IDs. Administrators manage the individual hosted services for a Windows Azure subscription using the Windows Azure portal. A Windows Azure subscription can include one or more of the following:

 Hosted services, consisting of hosted roles and the instances within each role. Roles and instances may be stopped, in production, or in staging mode.
 Storage accounts, consisting of Table, BLOB, and Queue storage instances.
 Content Delivery Network instances.
 SQL Azure databases.
 SQL Azure Reporting Services instances.
 Access Control, Service Bus, and Cache service instances.
 Azure Connect and Traffic Manager instances.

Figure 1 illustrates the Windows Azure billing configuration for a standard subscription.

Figure 1 Windows Azure billing configuration for a standard subscription

Estimating Your Costs
Windows Azure charges for how you consume services such as compute time, storage, and bandwidth. Compute time charges are calculated from an hourly rate that depends on the instance size. Storage charges are based on the number of gigabytes stored and the number of transactions. Prices for data transfer vary according to the region you are in, and generally apply to transfers between the Microsoft data centers and your premises, but not to transfers within the same data center. To estimate the likely costs of a Windows Azure subscription, see the "Pricing Overview" and the pricing calculator on the Windows Azure website.
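A rough estimate therefore combines three multiplications. The sketch below illustrates the arithmetic only; the rates are hypothetical placeholders, not real Windows Azure prices, so always confirm figures with the official pricing calculator:

```python
# Illustrative back-of-the-envelope monthly cost estimate.
# All rates are hypothetical examples, NOT real Windows Azure prices.

HOURS_PER_MONTH = 730  # average hours in a month

def estimate_monthly_cost(instances, rate_per_hour, storage_gb,
                          rate_per_gb, transactions, rate_per_10k_txn):
    """Sum compute, storage, and transaction charges for one month."""
    compute = instances * rate_per_hour * HOURS_PER_MONTH
    storage = storage_gb * rate_per_gb
    txn = (transactions / 10000.0) * rate_per_10k_txn
    return round(compute + storage + txn, 2)

if __name__ == "__main__":
    # Two instances at $0.12/hour, 100 GB at $0.14/GB,
    # one million transactions at $0.01 per 10,000.
    print(estimate_monthly_cost(2, 0.12, 100, 0.14, 1000000, 0.01))
```

Data transfer charges would be added on top, using the per-region rates described above.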

You are billed for role resources that are used by a deployed service, even if the roles on those services are not running. If you don't want to get charged for a service, delete the deployments associated with the service.

Chapter 4 of the guide "Moving Applications to the Cloud" (see http://wag.codeplex.com) provides additional information about estimating the costs of hosting applications in Windows Azure.

More Information
There is a great deal of information available about Windows Azure in the form of documentation, training videos, and white papers. Here are some web sites you can visit to learn more:

 The website for this series of guides provides links to online resources, sample code, Hands-on-Labs, feedback, and more.
 The portal to information about Microsoft Windows Azure has links to white papers, tools such as the Windows Azure SDK for .NET, and many other resources. You can also sign up for a Windows Azure account there.
 Ryan Dunn and Steve Marx have a series of Channel 9 discussions about Azure in the Cloud Cover show.
 Find answers to your questions on the Windows Azure Forum.
 Steve Marx is a Windows Azure technical strategist. His blog is a great source of news and information on Windows Azure.
 Ryan Dunn is the Windows Azure technical evangelist, and writes about Windows Azure on his blog.
 Eugenio Pace, a principal program manager in the Microsoft patterns & practices group, is creating a series of guides on Windows Azure, to which this documentation belongs. To learn more about the series, see his blog.
 Masashi Narumoto is a program manager in the Microsoft patterns & practices group, working on guidance for Windows Azure, and writes about it on his blog.
 Scott Densmore, lead developer in the Microsoft patterns & practices group, writes about developing applications for Windows Azure on his blog.
 Jim Nakashima is a program manager in the group that builds tools for Windows Azure. His blog is full of technical details and tips.
 Comprehensive guidance and examples on the Windows Azure Access Control Service are available in the patterns & practices book "A Guide to Claims-based Identity and Access Control", which is also available online.
 Code and documentation for the patterns & practices Windows Azure Guidance project are available on the CodePlex Windows Azure Guidance site at http://wag.codeplex.com.

2 The Challenges of Cloud Integration

It's easy to think of the cloud as somewhere you can put your applications without requiring any infrastructure of your own other than an Internet connection and a hosting account, in much the same way as you might decide to host your ASP.NET or PHP website at a web hosting company. Many companies already do just this. But what happens if you cannot locate all of the resources for your application in the cloud? It may be that your application accesses data held in your own datacenter where legal or contractual issues limit the physical location of that data, or the data is so sensitive that you must apply special security policies. It could be that your application makes use of services exposed by other organizations, which may or may not run in the cloud. Perhaps there are vital management tools that integrate with your application, but these tools run on desktop machines within your own organization.

In fact there are many reasons why companies and individuals may find themselves in the situation where some parts of an application are prime targets for cloud hosting, while other parts stubbornly defy all justification for cloud hosting. In this situation, to take advantage of the benefits of the cloud, you can implement a hybrid application by running some parts in the cloud, while other parts run on-premises or in the datacenters of your existing partners.

Why the Cloud?

The cloud can help to minimize running costs by reducing the need for on-premises infrastructure, providing reliability and global reach, and simplifying administration. It is often the ideal solution for applications where some form of elasticity or scalability is required. Applications that are self-contained, so that all of the resources and components can be hosted remotely, are typical candidates for the cloud.

Self-contained applications are often easy to locate in the cloud, but complex applications may contain parts that are not suitable for deployment to the cloud. For example, consider the simple example in Figure 1, where a company runs a website that customers use to place orders. These orders are fulfilled by several partner companies that hold stock in warehouses and deliver the goods direct to customers, while a desktop application monitors sales and allows staff to manage the product list and prices, place orders with suppliers, and generate management reports.

Figure 1 A simple all-on-premises application

Business is flourishing, and the increasing load on the website means that the company must consider how it will cope with this demand. One option is to invest more cash in the datacenter to extend the server farm and database clusters, add power and cooling capacity, and implement additional resources for backup and disaster recovery. In addition, the company may need to purchase extra bandwidth for Internet connectivity. When estimating the costs of increasing your own datacenter capacity, it's easy to forget about additional implications such as connectivity bandwidth, power and cooling requirements, and disaster recovery.

An obvious solution to growing customer demand and increased website traffic would be to locate the website in the cloud, while maintaining all of the other parts of the application on-premises. However, as you can see in the figure, it is only the public website (the Orders application) and its associated business logic that needs to scale. While there will be some additional load on the monitoring and management applications as more customers place orders, these are not likely to require significant additional processing resources. Meanwhile, order fulfillment is contracted out to partner organizations, and it is up to these organizations to manage their own scaling and availability issues.

Once this decision is taken, it is also possible to consider other capabilities of cloud hosting that can provide additional benefits. For example, locating the website in more than one cloud datacenter and using a location-based routing mechanism can provide additional capacity, minimize client access delays, and provide failover capability. Cloud hosting in more than one datacenter allows you to locate your applications closer to the intended users while providing facilities for maximizing performance, minimizing response times, and supporting automatic failover, backup, and disaster recovery. The built-in features of most cloud-hosting environments can also make it easy to implement federated authentication, provide connectivity with remote services, perform data synchronization and caching, and support business reporting requirements. Figure 2 shows the application after moving the website to the cloud.

Figure 2 Moving the website to the cloud

As shown in the figures, the website elements of complex applications are often the most likely candidates for cloud deployment because these are the parts that usually must cope with variable demand and require the most bandwidth. However, there are many other opportunities for cloud-hosting other components of your applications where the cloud can offer benefits. For example, long-running or highly intensive processes such as report generation or data manipulation may be prime targets for cloud hosting, especially if they run only occasionally. Cloud hosting is a good way to minimize the costs of providing infrastructure for highly intensive tasks that run only occasionally; instead of investing in additional capacity in your own datacenter that may be idle for long periods, you can purchase computing time in the cloud as and when required. The cloud offers an elastic runtime environment where you pay only for what you use.

Evolution and Integration

Cloud hosting is still a relatively recent paradigm, and many companies are building applications from scratch with the cloud as the target deployment platform. However, a significant number of companies are looking at the cloud as a way of achieving the benefits of cloud hosting for existing applications.

Typically, existing corporate applications are complicated. There may be a monolithic application that was purpose built to run on specific hardware, although these days it is increasingly common for applications to be based on a service-oriented architecture where separate applications communicate with each other over local networks. Monolithic systems may require considerable refactoring to run as hybrid applications; it is much easier to evolve applications that already use a service-oriented architecture to the cloud.

Evolving Applications to the Cloud

While monolithic applications offer little opportunity for cloud hosting without considerable additional effort, service-oriented applications are candidates for a cloud-based hybrid approach. As an example, Figure 3 shows a typical suite of on-premises corporate applications that communicate with each other to provide a comprehensive business environment.

Figure 3 A corporate suite of on-premises applications

The company may make a decision that it will move the Shop application to the cloud to take advantage of the inherent benefits it offers, as shown in Figure 4.

Figure 4 Moving the Shop application to the cloud

After moving the Shop application to the cloud, and recognizing the benefits that accrue, it is likely that the company will consider other opportunities for cloud-hosting parts of the application suite. For example, the company may decide it needs to expose the Travel application to external partners and remote employees. There may also be other factors that make this approach worth considering. Instead of opening access to the corporate network for the Travel application, the company may decide to move it to the cloud as well, as shown in Figure 5. Although the application was not originally created as a cloud-hosted or hybrid application, it is evolving to become one.

Figure 5 Moving the Travel application to the cloud

This process may continue as the company realizes the many different benefits of cloud hosting by moving other parts of the application suite to the cloud over time. It may be that, in time, all of the application parts will move to the cloud, although this is unlikely in complex corporate scenarios where legal or other limitations force some parts to remain on-premises. Hybrid applications represent a continuum between running everything on-premises and everything in the cloud. Organizations building hybrid solutions are most likely to position their architectures somewhere along this continuum.

Integrating Applications with the Cloud

The section "Why the Cloud?" earlier in this chapter illustrates how an application can evolve into a hybrid application where some parts run in the cloud, while other parts run on-premises within the corporate network or at the premises of a partner organization. Conceptually, this evolution seems both simple and obvious, but there are several challenges to achieving it in the real world.

One of the most immediate concerns when evolving applications to the cloud is how you will expose internal services and data stores to your cloud-based applications and services. If you look back at Figure 5, you can see that there are connections between the cloud-hosted applications and the on-premises applications. If all of these are just service calls, then it seems as though there is no problem; you can simply call the services across the cloud/on-premises boundary. However, the on-premises services running on your own internal network that the cloud-hosted applications must access are likely to be behind firewalls and routers. You don't want these services to be freely available over the Internet, or targets for Denial of Service and other malicious attacks. In addition, you don't want unauthenticated users to be able to access the management features of the cloud-hosted applications, or to allow public access to the interfaces that the on-premises applications use to access these applications. You must also consider how the cloud-hosted applications will access data stored on your corporate domain, and how logic processes that include both cloud-hosted and on-premises applications will work across the boundary.

Enterprise Application Integration

You may consider the challenges encountered in hybrid applications to be similar to those addressed by Enterprise Application Integration (EAI). EAI tackles the fundamental issues encountered when attempting to make disparate systems work together as a whole. Most EAI strategies are based on layer models that delineate the different challenges based on the functional areas where they occur. A typical layer model separates the requirements for integration between applications and components at the UI, Process, Data, and Network levels, and provides mechanisms for integration at each level, so that users see a unified interface, processes can interact with all the applications, data is shared across applications, and the applications can communicate with each other using a protocol and format that all of them understand.

It's true that there are several corresponding challenges when attempting to integrate existing applications into a cohesive system. However, the challenges encountered in building hybrid applications and evolving existing applications to the cloud are more likely to arise from separating applications and components that already work together across a new boundary (the cloud/on-premises boundary), rather than from trying to integrate disparate applications and components. While EAI tackles integrating disparate systems, hybrid application challenges tend to focus on integration of systems that already work together but now have a new boundary that processes must cross. In this case, the layered approach fails to identify the challenges with the appropriate relative significance.

Hybrid Application Integration Challenges

While some of the issues in hybrid application integration are similar to those espoused by EAI methodologies, when you move a part of an existing application from on-premises to the cloud, the immediate concerns are centered on issues such as communication and connectivity. How will cloud-based applications call on-premises services, or send messages to on-premises applications? How will cloud-based applications access data in on-premises data stores? How can you ensure that all instances of the application running in cloud datacenters have data that is up-to-date? In addition, moving parts of an application to the cloud prompts questions on performance, availability, management, authentication, and security. When parts of your application are now running in a remote location, and are accessible only over the Internet, can they still work successfully as part of the overall application?

It is often helpful to divide the challenges presented by hybrid applications into distinct categories that focus attention on the fundamental areas of concern. This helps you to identify them more accurately, and discover the solutions available that can help you to resolve these issues. The areas of concern typically consist of the following:

- Deploying functionality and data to the cloud. It is likely that you will need to modify the code in your existing on-premises applications to some extent before it and the data it uses can be deployed to the cloud. At minimum you will need to modify the configuration, and you may also need to refactor it so that it runs in the appropriate combination of Windows Azure Web and Worker Roles. You must also consider how you will deploy data to the cloud, and handle applications that, for a variety of reasons, may not be suitable for deploying to Windows Azure Web and Worker Roles.

- Authenticating users and authorizing requests. Most applications will need to authenticate and authorize visitors or partners at some stage of the process. Traditionally, authentication was carried out against a local application-specific store of user details, but increasingly users expect applications to allow them to use more universal credentials; for example, existing accounts with social network identity providers such as Windows Live, Google, Facebook, and Open ID. Alternatively, the application may need to authenticate using accounts defined within the corporate domain to allow single sign on or to support federated identity with partners.

- Cross-boundary communication and service access. Many operations performed in hybrid applications must cross the boundary between on-premises applications, partner organizations, and applications hosted in Windows Azure. Services calls and messages must be able to pass through firewalls and Network Address Translation (NAT) routers without compromising on-premises security. The communication mechanisms must work well over the Internet and compensate for lower bandwidth, higher latency, and less reliable connectivity. They must also protect the contents of messages, authenticate senders, and protect the services and endpoints from Denial of Service (DoS) attacks.

- Business logic and message routing. Many hybrid applications must process business rules or workflows that contain conditional tests, and which result in different actions based on the results of the rules. For example, an application may need to update a Stock database, send the order to the appropriate transport and warehouse partner, perform auditing operations on the content of the order (such as checking the customer's credit limit), and store the order in another database for accounting purposes. These operations may involve services and resources located both in the cloud and on-premises.

- Data synchronization. Hybrid applications that run partly on-premises and partly in the cloud, run in the cloud and use on-premises data, or run wholly in the cloud but in more than one datacenter, must synchronize and replicate data between locations and across network boundaries. This may involve rules that control the rows and columns that are synchronized, and may also need to perform translations on the data.

- Scalability, performance, and availability. While cloud platforms provide scalability and reliability, the division of parts of the application across the cloud/on-premises boundary may cause performance issues. There may be increased requirements for caching data at appropriate locations, deploying additional instances of the cloud-based parts of the application to handle varying load and to protect against transient network problems, and providing instances that are closer to the user to minimize response times.

- Monitoring and management. Companies must be able to effectively manage their remote cloud-hosted applications, monitor the day-to-day operation of these applications, and have access to logging and auditing data.

Companies also need to obtain relevant and timely business information from their applications to ensure that they are meeting requirements such as Service Level Agreements (SLAs) at the moment, and to plan for the future. They must also be able to configure and administer the applications, just as they would if the applications were running in an on-premises datacenter.

This guide explores these challenges, and provides advice and solutions for applications that run on Windows Azure. You will see these challenges, and the Azure technologies Trey Research used to resolve them, described in detail in the chapters that follow. These solutions can take advantage of the many services and technologies exposed by Windows Azure, together with a range of built-in features that can help to resolve challenges you may encounter when evolving an existing application to the cloud or when building new hybrid applications that run partially on-premises and partially in the cloud. Cloud services provide a range of opportunities for Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) deployment of applications, which are specifically designed to simplify the integration challenges for hybrid applications that run across the cloud/on-premises boundary. These opportunities are described in Chapter 1, "Introduction," which also lists some of the Windows Azure technologies that you can take advantage of as part of the overall Windows Azure environment.

The information in this guide about Windows Azure, SQL Azure, and the services they expose is up to date at the time of writing. However, Windows Azure is constantly evolving and adding new capabilities and features. For the latest information about Windows Azure, see "What's New in Windows Azure" at http://msdn. and the Windows Azure home page at http://www.

Summary

This chapter introduces you to hybrid applications that take advantage of the benefits available from hosting in the cloud, and discusses how the challenges these applications present relate to the existing models and methodologies for Enterprise Application Integration (EAI). The challenges you are likely to face can be summarized in the following categories:

- Deploying functionality to the cloud
- Authentication
- Cross-boundary communication and service access
- Business logic and message routing
- Data synchronization and reporting
- Scalability, performance, and availability
- Monitoring and management

Most of these challenges are related to integration between applications.

3 The Trey Research Scenario

This chapter introduces a fictitious company named Trey Research. It describes at a high level how Trey Research has adapted their existing on-premises infrastructure to take advantage of the benefits of cloud hosting on Windows Azure. You'll see how Trey Research adapted its application to work seamlessly across on-premises and cloud locations, using services exposed by Windows Azure, the Windows Azure .NET Libraries, and SQL Azure. Trey Research's implementation makes use of a range of facilities for secure communication between on-premises applications, cloud-based services, and partner organizations to provide the comprehensive solution that Trey Research required.

The requirement to integrate on-premises applications with cloud-based services is becoming more common as companies realize the benefits of cloud-based computing. As with any company that finds itself in this situation, there are many issues to consider and challenges to be met. The chapters that follow this one show in more detail the ways that Trey Research tackled the challenges, at the same time as providing general guidance focused on the specific scenarios in each chapter.

The Trey Research Company

Trey Research is a medium-sized organization of 600 employees, and its main business is manufacturing specialist bespoke hardware and electronic components for sale to research organizations, laboratories, and equipment manufacturers. It sells these products over the Internet through its Orders application. As an Internet-focused organization, Trey Research aims to minimize all non-central activities and concentrate on providing the best online service and environment without being distracted by physical issues such as transport and delivery. For this reason, Trey Research has partnered with external companies that provide this function. Trey Research simply needs to advise the transport partner when an order is received into manufacturing, and provide a date for collection from Trey Research's factory. The transport partner advises Trey Research when delivery to the customer has been made.

The Orders application is just one of the many applications that Trey Research uses to run its business. Other on-premises applications are used to manage invoicing, production planning, supplier orders, raw materials, and more. For this reason, this guide is concerned only with the Orders application and how it integrates with other on-premises systems such as the main management and monitoring applications.

Trey Research's Strategy

Trey Research was an early adopter of Windows Azure and cloud-based computing, and has confirmed this as the platform for new applications and for extended functionality in existing applications. Although they are aware of the need to maintain the quality and availability of existing services to support an already large customer base, the managers at Trey Research are willing to invest in the development of new services and the modification of existing services to extend their usefulness and to improve the profitability of the company. This includes planning ahead for issues such as increased demand for their services, and handling additional complexity such as adding external partners.

The developers at Trey Research are knowledgeable about various Microsoft products and technologies, including the .NET Framework, ASP.NET, SQL Server®, and the Microsoft Visual Studio® development system. The developers are also familiar with Windows Azure, and aim to use any of the available features of Windows Azure that can help to simplify their development tasks. They are well placed to exploit new technologies and the business opportunities offered by the cloud.

The Orders Application

Trey Research's Orders application allows visitors to place orders for products by accessing a cloud-based application. The Orders application has evolved over time to take advantage of the benefits of cloud-based deployment in multiple datacenters in different geographical locations, providing better reporting and business information capabilities, whilst maintaining some essential services and applications within the on-premises corporate network. In addition, some vital functions connected with the application are not located in the cloud; the transport and delivery functions are performed by separate organizations affiliated to Trey Research. These organizations may use cloud-hosted services, but this has no impact on Trey Research's own application design and implementation. This is a common scenario for many organizations, especially connecting cloud-based applications and on-premises applications to perform a variety of tasks, and means that solutions must be found to a range of issues.

The Original On-premises Orders Application

When Trey Research originally created the Orders application it ran entirely within their own datacenter, with the exception of the partner services for transport and delivery. Trey Research built the application using the latest available technologies: Visual Studio 2010, ASP.NET MVC 3.0, and .NET Framework 4. The application was created as two separate components: the Orders application itself (the website and the associated business logic), and the suite of management and reporting applications. It was designed in this way because Trey Research realized early on that these were two separate functional areas. In Trey Research's case, the public Orders web application would need to be able to scale to accommodate the expected growth in demand over time, whereas the management and reporting applications would not need to scale.

Trey Research proposed to manage this scaling as demand increased by adding additional servers to an on-premises web farm in their datacenter.

The Orders application uses ASP.NET Forms authentication to identify customers and looks up their details in the Customers database using the unique user ID. Figure 1 shows the application running on-premises.

Figure 1
High-level overview of the Trey Research Orders application running on-premises

As you can see in Figure 1, the Orders application accesses several databases. It obtains lists of the products that Trey Research offers from the Products database, and stores customer orders in the Orders database. The Audit Log database holds a range of information including runtime and diagnostic information, and details of unusual orders such as those over a specific total value. SQL Server Reporting Services allow managers to obtain business information from the Orders database.

The Orders application sends a message to the appropriate transport partner when an order is received. This message indicates the anticipated delivery date and packaging information for the order (such as the weight and number of packages). The transport partner sends a message back to the Orders application after the delivery is completed so that the Orders database can be updated.

Due to the nature of the products Trey Research manufactures, there are legal requirements that apply to the sale of some of them, particularly for export to other countries and regions. These requirements include keeping detailed records of the sales of certain electronic components that may be part of Trey Research's products, and hardware items that could be used in ways not originally intended. Analyzing the order contents is a complex and strictly controlled process accomplished by a legal compliance application from a third party supplier, and it runs on a separate specially configured server.

Finally, Trey Research uses separate applications to monitor the Orders application, manage the data it uses, and perform general administrative tasks. These monitoring and management applications interact with Trey Research's corporate systems for tasks such as invoicing and managing raw materials stock, but these interactions are not relevant to the topics and scenarios of this guide.

How the Orders Application Evolved to the Cloud

With the availability of affordable and reliable cloud hosting services, Trey Research decided to investigate the possibility of moving the application to Windows Azure. In fact this is exactly the scenario that you saw in the previous chapter when discussing the justification for evolving applications to the cloud. The public section of the application could easily be deployed to the cloud as it was already effectively a separate application, and it is the part that will be required to scale over time to meet elastic demand.

There are other advantages to cloud hosting in Windows Azure that served to make a strong case for moving the public parts of the Orders application to the cloud. These include the ability to deploy it in multiple datacenters in different geographical locations to provide better response times and to maximize availability for customers, lower running costs, reduced requirements for on-premises infrastructure, the capability to scale up and down at short notice to meet peaks in demand, and the availability of third party components and frameworks to simplify development. By using Windows Azure Traffic Manager, Trey Research can ensure that requests to the application are automatically routed to the instance that will provide the best user experience, and cope with failed instances by rerouting requests to other instances. This allows Trey Research to take full advantage of the cloud in terms of reliability, availability, and security.

In addition, Windows Azure provides a range of services that are useful to Trey Research in the Orders application. They can take advantage of the built-in distributed data caching feature for sessions and transient data used by the public website, the mechanism for easily implementing federated authentication, the connectivity features for secure communication and service access across the cloud/on-premises boundary, the capabilities for data synchronization, and a comprehensive cloud-based reporting system. Taking advantage of available components, frameworks, and features designed and optimized for the cloud simplifies both the design and development of cloud-based applications.

However, it soon became clear that the management and reporting part of the application (which does not need to scale) and the database containing sensitive auditing information should remain on premises. This allows Trey Research to more closely control the aspects of the application that require additional security and which, for legal and logistical reasons, they felt would be better kept within their own datacenter.

Figure 2 shows a high-level view of the architecture Trey Research implemented for their hybrid application.

Figure 2
High-level overview of the Trey Research Orders application running in the cloud

Although Figure 2 may seem complex, the Orders application works in much the same way as when it ran entirely on-premises. You will see more details about each part of the application in subsequent chapters of this guide. Here is a brief summary of the features you see in Figure 2:

- Visitor requests all go to Windows Azure Traffic Manager, which redirects the visitor to the closest datacenter based on response time and availability.

- Instead of using ASP.NET Forms authentication, visitors to the Orders website authenticate using a social identity provider such as Windows Live ID or Google. Windows Azure Access Control Service

(ACS) manages this process, and returns a token containing a unique user ID to the Orders application. It uses this to look up the customer details in the Customers and Products database running in a local SQL Azure datacenter. New customers can register with the Orders application if they have not already done so.

This guide does not cover ACS in detail. ACS is discussed in more depth in "Claims Based Identity & Access Control Guide" (see http://claimsid.codeplex.), which is part of this series of guides on Windows Azure.

- The Orders application displays a list of products stored in the Customers and Products database. The Products data is kept up to date by synchronizing it from the master database located in the head office datacenter.

- When a visitor places an order, the Orders application:
  ◦ Stores the order details in the Orders database in the local SQL Azure datacenter.
  ◦ Sends an advice message to the appropriate transport partner. The transport company chosen depends on the type of product and delivery location.
  ◦ Sends any required audit information, such as orders over a specific total value, to the on-premises management and monitoring application, which will store this information in the Audit Log database located in the head office datacenter.

- When transport partners deliver an order to the customer they send a message to the Orders application (running in the datacenter that originally sent the order advice message) so that it can update the Orders database.

- All orders are synchronized across all Windows Azure datacenters so that the order status information is available to visitors irrespective of the datacenter to which they are routed by Traffic Manager.

- Customer data is synchronized between all datacenters and the on-premises copy of the data so that customers can access the application in any of the global datacenters Trey Research uses.

- The third-party compliance application running in a Virtual Machine Role in the cloud continually validates the orders in the Orders database for conformance with legal restrictions and sets a flag in the database on those that require attention by managers.

- To obtain management information, the on-premises Reporting application uses the Business Intelligence features of the SQL Azure Reporting service running in the cloud to generate reports from the Orders database. It also generates daily reports that it stores in a SQL Server database located in the head office datacenter. These reports can be combined with data obtained from the Data Market section of Windows Azure Marketplace to compare the results with global or local trends. The reports are accessible by specific external users, such as remote partners and employees.

How Trey Research Tackled the Integration Challenges

The remainder of this guide shows in more detail how the designers and developers at Trey Research evolved the Orders application from an entirely on-premises architecture to a hybrid cloud-hosted architecture. Each of the subsequent chapters of this guide examines the challenges that Trey Research faced in detail,

describes the integration technologies available to support the architectural changes, explores the advantages and considerations of each one, and shows how Trey Research implemented the technologies in its Orders application. The services and technologies you will see used throughout the remaining chapters of this guide include the following:

- Azure Connect
- Azure Virtual Machine Role
- Access Control Service
- Traffic Manager
- Azure Cache
- Azure Data Sync
- Windows Azure Marketplace (Data Market)
- Service Bus Relay, Queues, and Topics
- The Windows Azure Management REST API
- Enterprise Library Integration Pack for Windows Azure

Some of the features and services listed here (such as VM Role, Queues, and Traffic Manager) were still pre-release or beta versions at the time of writing. For up to date information, see the Microsoft Windows Azure home page at http://www.microsoft.

To help you understand how Trey Research uses these technologies, Figure 3 shows them overlaid onto the architectural diagram you saw earlier in this chapter.

Figure 3
Technology map of the Trey Research Orders application running in the cloud

To help you navigate through the chapters and find the information relevant to you, the following table lists the challenges, the chapter in which each is discussed, and the technologies used.

Challenge: Deploying functionality and data to the cloud
Chapter: Chapter 4, "Deploying Functionality and Data to the Cloud"
Technologies: Azure Web Role; Azure Worker Role; Azure VM Role; SQL Azure Reporting Service; Windows Azure Marketplace (Data Market); Enterprise Library Transient Fault Handling Application Block

Challenge: Authenticating users and authorizing requests in the cloud
Chapter: Chapter 5, "Authenticating Users and Authorizing Requests"
Technologies: Azure Access Control Service; Windows Identity Framework

Challenge: Cross-boundary communication and service access
Chapter: Chapter 6, "Implementing Cross-boundary Communication"
Technologies: Azure Connect; Service Bus Queues; Service Bus Relay

Challenge: Business logic and message routing
Chapter: Chapter 7, "Implementing Business Logic and Message Routing"
Technologies: Service Bus Topics and Rules

Challenge: Replicating, distributing, and synchronizing data
Chapter: Chapter 8, "Replicating, Distributing, and Synchronizing Data"
Technologies: SQL Azure Data Sync; Content Delivery Network

Challenge: Scalability, performance, and availability
Chapter: Chapter 9, "Maximizing Scalability, Availability, and Performance"
Technologies: Azure Cache; Azure Traffic Manager; Enterprise Library Auto-scaling Application Block

Challenge: Monitoring and management
Chapter: Chapter 10, "Monitoring and Managing Hybrid Applications"
Technologies: Management REST API; Windows Azure Diagnostics; System Center Operations Manager

The sample Trey Research application that you can download for this guide implements many of the technologies and techniques described in this guide. However, to simplify installation and setup, and to reduce the prerequisites and the requirements for users to establish extensive Azure accounts, the feature set and some of the implementation details differ from the text of this guide.

Summary

This chapter introduced you to Trey Research's online Orders application, and provided an overview of how Trey Research evolved it from an entirely on-premises application into a hybrid application where some parts run in the cloud, while maintaining some parts in their on-premises datacenter. This typical scenario of a hybrid cloud application presents several challenges around cloud and on-premises integration, which can be divided into the following separate areas of concern:

- Deploying functionality to the cloud
- Authentication
- Cross-boundary communication and service access
- Business logic and message routing
- Data synchronization and reporting
- Scalability, performance, and availability
- Monitoring and management

This chapter also explored the final architecture of the Orders application, so that you are familiar with the result. The subsequent chapters of this guide drill down into the challenges in more detail, and provide a great deal more information about choosing the appropriate technology, how Trey Research implemented the solutions, and how they could be extended or adapted to suit other situations. For links to more information about the technologies discussed in this chapter, see Chapter 1, "Introduction," of this guide.

4 Deploying Functionality and Data to the Cloud

In the previous chapters you have seen how a hybrid application architecture, where some parts are located on-premises and some parts are deployed to the cloud, can provide opportunities for reducing costs while maximizing availability. However, as you saw, there are challenges that you must overcome when deploying functionality and data to the cloud. If you are migrating an existing on-premises application it's likely that you will need to modify the code to some extent before it and the data it uses can be deployed to the cloud. At minimum you will need to modify the configuration, and you may also need to refactor your code into Windows Azure Web and Worker Roles. You must also consider how you will handle applications that, for a variety of reasons, may not be suitable for deploying to Windows Azure Web and Worker Roles, and how communication between parts of the application, including any interaction with external partners, is affected by your new architecture. Of course, if you are starting with a new application, you can design the architecture to meet the specific requirements of hybrid applications.

This chapter provides an overview of some typical solutions to these challenges. While this guide cannot cover every eventuality or individual scenario, it describes the following areas that have an impact on hybrid application design and implementation:

- Factoring functionality for the cloud
- Locating data services in the cloud
- Staged and one-step migration of application services
- Communicating with cloud-hosted services and partners
- Replacing functionality to take advantage of the cloud
- Applications and data services that cannot be easily refactored
- Deploying applications to multiple data centers

Factoring Functionality for the Cloud

Windows Azure supports three different types of roles into which you can deploy the functionality of your application:

- A Web Role can host web applications, web services, and other types of user interface or service that expose connectivity. Users typically connect to these applications and services over HTTP to the standard ports 80 (HTTP) and 443 (HTTPS), although you can open other ports as required.

- A Worker Role can host code and processes that run in the background. Code deployed to it runs continuously, and this role does not expose external connections.

- A Virtual Machine (VM) Role can host applications or code not suitable for running in a Web or Worker Role. This role allows you to upload a virtual machine image and run it hosted in the cloud. See the section "Applications and Data Services that cannot be Easily Refactored" later in this chapter for more information about VM Roles.

Deploying Functionality to a Web Role

Existing or new websites and web applications written using ASP.NET, MVC, Razor, or other .NET Framework technologies can usually be deployed to the cloud with little additional effort. Likewise, existing or new web services written using Windows Communication Foundation (WCF) or ASP.NET Web Services (ASMX) technologies are easy to adapt and deploy to the cloud. Writing new web applications or adapting existing web applications for deployment to Windows Azure is very similar to the way you would do this for local deployment in your own datacenter. However, there are some factors that differ, and that require you to adapt existing code or design in a way that varies from the usual deployment to a local on-premises datacenter, such as session state management, data storage, and configuration. These factors include the following:

- Distributed session state and output caching must be handled differently in an Azure Web Role from the typical process on-premises. You cannot use the ASP.NET State Server or the SQL Session State Provider. Instead, you configure an instance of the Windows Azure Cache Service and configure your application to use this, use a third-party solution that saves data in Azure Storage instead, or use a suitable custom implementation. For more information about state management and caching, see Chapter 9, "Maximizing Scalability, Availability, and Performance."

- You must decide how you will store and access the data that the application uses. For simplicity, if an existing application uses a SQL database, you can upload the data into SQL Azure and access it in the same way as you would an on-premises SQL Server database. Alternatively, you may decide to use Azure Storage Tables or Blobs instead, which can prove less expensive but do not provide the same capabilities or performance as SQL Azure. See the section "Locating Data Services in the Cloud" later in this chapter for more details.

- You cannot access the underlying operating system in the same way as you can when running code on a physical machine or in a VM Role. This prevents you from using COM components, accessing some system logs, and interacting with some operating system services.

- If you depend on Windows authentication or use Active Directory for authorization (such as setting permissions on website folders and resources) you must rework the application to use ASP.NET Forms authentication and authorization, claims-based authentication, and single sign-on.

Windows Azure Access Control Service (ACS) and Windows Identity Foundation (WIF) make it easy to implement federated claims-based authentication and authorization. For more details see Chapter 5, "Authenticating Users and Authorizing Requests."

- Unlike a web application running locally in IIS, changes to the Web.config file for a web application are not automatically applied at runtime. Instead you can place the settings in the Azure service configuration file so that changes are applied when you upload a new version of this file.

- You can configure a virtual NTFS volume to use in a similar way to a hard disk drive.

- There is no local SMTP server for sending emails, though you can open port 25 for your Azure deployment and use an external email server.

- You can maximize the user experience and reduce load on your application by locating images, resources, and downloadable files on the Windows Azure Content Delivery Network (CDN), or by using other approaches such as Azure Blob storage for these items.

Deploying Functionality to a Worker Role

Windows Azure Worker Roles are designed to act as hosts for background processing tasks. Typically you use Worker Roles to run code in parallel with the applications deployed into Web Roles. Any long running tasks that can run asynchronously with the UI should be separated into Worker Roles to reduce the impact on the UI. The recommended design is to have the Web Role hand off processing that can be done in the background to a Worker Role, allowing the UI in the Web Role to continue handling user requests. This is the typical pattern for Azure applications; it provides a robust load-leveling mechanism that reduces demand on role instances and more quickly frees the web application to handle incoming requests.

This separation also has a major advantage in terms of cost and efficiency. Many common application scenarios place more load on the UI than the background tasks. For example, a shopping site UI is likely to encounter high traffic and load as users browse products, but less constant and heavy load from users placing orders. This means that you can run more instances of the Web Role hosting the UI than the Worker Role hosting the background order processing tasks. By minimizing the total number of Web and Worker Role instances running you can minimize cost. Consider using a mechanism that can automatically monitor load and adjust the number of deployed role instances in response to demand, as discussed in Chapter 9, "Maximizing Scalability, Availability, and Performance."

Also consider whether you should run multiple instances of tasks within the same Worker Role by spawning new threads. This is a useful way to maximize usage of the role, but be aware that a failure of one task will not automatically restart the Worker Role instance. Trey Research runs multiple tasks within a Worker Role, and implemented code to ensure that any tasks that fail are restarted automatically. This is implemented in the file WorkerRole.cs, located in the Orders.Workers project. To see how this is done, see the Run method in the WorkerRole class of the example code available for this guide.

Connectivity between Roles, Services, and Data Stores

To communicate between Web Roles and Worker Roles you can use Azure Storage Queues.

Queues can also be used to send information back to the Web Roles as required. Azure BLOBs. or Service Bus Queues.role instances and more quickly frees the web application to handle incoming requests. To maximize availability and performance of all hosted applications. you may decide to store the order data in your data store synchronously using code in the Web Role so that the order ID is immediately available to display to the customer. Figure 1 Using Web and Worker Roles with Azure Storage Queues Alternatively. This means that customers will also be able to see orders they have placed immediately. For example. While neither of these events is likely to occur regularly. Azure Cache instances. or program the task in the Worker Role so that it intermittently scans the data store for new orders to process. the Windows Azure application controller aggressively manages the use of runtime resources such as connections. Keep in mind that in some cases you may need to perform some processing within the Web Role rather than the Worker Role. and only when required. you may decide that your Web Role will temporarily store data for background tasks in Table or Blob storage so that Worker Role processes can access it in any order. both Web and Worker Roles may need to connect to other services or data stores such as Azure Tables. in a shopping website that allows customers to place orders. This means that connections to other services and data stores may occasionally time out and be forcibly disconnected. even if the order processing task is not yet completed. You can then pass the order to the Worker Role background task through a queue for further processing. Figure 1 shows this approach. The usual approach is to use a . It is also possible for a temporary loss of network connectivity to prevent immediate connection to a service or data source located outside the datacenter where the application is hosted. 
you should design your application so that it can gracefully handle disconnection or failure to connect on the first attempt. Implementing Reliable Connectivity In addition to connecting to Azure Queues. SQL Azure databases. with a variable number of Web Role instance posting data into one or more queues for processing by a variable number of Worker Role instances.
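The Web Role/Worker Role hand-off pattern described above can be sketched as follows. This is an illustrative, language-neutral sketch in Python, not Azure SDK code: an in-memory queue and dictionary stand in for an Azure Storage Queue and the order data store, and the function names are hypothetical.

```python
# Sketch of the Web Role / Worker Role hand-off via a queue.
# An in-memory deque stands in for an Azure Storage Queue, and a dict
# stands in for the order data store; names here are illustrative only.
from collections import deque

order_queue = deque()   # stand-in for the Azure Storage Queue
order_store = {}        # stand-in for the order data store

def place_order(order_id, items):
    """'Web Role': store the order synchronously, then enqueue it.

    The order ID is available immediately, even though the slow
    processing has not yet happened."""
    order_store[order_id] = {"items": items, "status": "Accepted"}
    order_queue.append(order_id)   # hand off to the background task
    return order_id

def process_next_order():
    """'Worker Role': take one message and complete the slow processing."""
    if not order_queue:
        return None
    order_id = order_queue.popleft()
    order_store[order_id]["status"] = "Processed"
    return order_id
```

Because the Web Role only enqueues a small message, it is freed quickly to handle other requests, while a variable number of Worker Role instances drain the queue at their own pace, which is the load-leveling benefit described above.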

The Enterprise Library Transient Fault Handling Application Block that Trey Research uses in its Orders application is an example of such a mechanism. The Transient Fault Handling Application Block makes it much easier to include code in your application that will retry a failed operation after a short delay, and continue to attempt the operation a certain number of times. The block is specifically designed for use with Windows Azure and SQL Azure. It can be used to manage connections to Windows Azure Storage, SQL Azure databases, Windows Azure Service Bus Queues and Topics, and Windows Azure Cache instances. You can also extend the block by adding error detection strategies for other services, or by adding custom retry strategies.

Typically you will use the block with one of the built-in retry strategies:

- Fixed interval strategy. Retry four times at one-second intervals.
- Incremental interval strategy. Retry four times, waiting one second before the first retry, then two seconds before the second retry, then three seconds before the third retry, and four seconds before the fourth retry.
- Exponential back-off strategy. Retry four times, waiting two seconds before the first retry, then four seconds before the second retry, then eight seconds before the third retry, and sixteen seconds before the fourth retry. This retry strategy also introduces a small amount of random variation into the intervals, which can be useful if the same operation is being called multiple times simultaneously by the client application.

The strategies you choose form part of a retry policy that is defined in the application configuration file. You can use the default settings, or specify your own settings to modify the strategies. If required, you can change the time between retries, the maximum number of retries, the back-off delay, and more, depending on the strategy type, or even specify the complete retry policy directly in code instead of loading it from the configuration file. You can define more than one retry policy and specify the one to be used at each connection point in your code.

You can use the Transient Fault Handling Application Block to execute other processes asynchronously by using the appropriate overload of the ExecuteAction method. However, you must be aware that starting other asynchronous tasks that are not child processes attached to the parent task may prevent the block from properly detecting success or failure of the main task.

The block is part of the Enterprise Library Integration Pack for Windows Azure. For more information, see "Microsoft Enterprise Library" at http://msdn. You will see how the Transient Fault Handling Application Block can be used in several chapters of this guide to manage connectivity to SQL Azure databases, Service Bus, and Azure Cache. For example, Chapter 5, "Authenticating Users and Authorizing Requests," describes in detail how Trey Research uses the block.
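The interval arithmetic behind the three built-in strategies can be sketched as follows. This is not the Enterprise Library API; it is a minimal illustration using the default interval values quoted above (the real exponential strategy also adds a small random variation, omitted here for clarity).

```python
# Sketch of the three retry strategies described above (not the actual
# Transient Fault Handling Application Block API). Returns the wait time,
# in seconds, before each retry attempt.
def retry_delays(strategy, retries=4):
    if strategy == "fixed":
        # Same one-second pause before every retry.
        return [1] * retries
    if strategy == "incremental":
        # Pause grows by one second per attempt: 1, 2, 3, 4.
        return list(range(1, retries + 1))
    if strategy == "exponential":
        # Pause doubles per attempt: 2, 4, 8, 16.
        # (The real block also adds random jitter, omitted here.)
        return [2 ** n for n in range(1, retries + 1)]
    raise ValueError("unknown strategy: " + strategy)
```

A retry wrapper would simply sleep for each delay in turn, attempting the operation once before each pause and raising the last error when the list is exhausted.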

Locating Data Services in the Cloud

Windows Azure provides several built-in features for storing information required by your cloud-based applications, such as BLOB storage, Table storage, and Queues. BLOB storage is ideal for storing unstructured information such as images, files, and other resources. Table storage is best suited to structured information. Azure Tables store data as collections of entities, which are similar to rows, but each has a primary key and a set of properties. These properties consist of a name and a series of typed-value pairs.

Table storage is a cost-effective way of storing data. It is not based on the typical database-style storage mechanism, and it does not support the familiar SQL-based techniques for reading and writing data. However, Table storage is very flexible, and can be very efficient if you design the table structure correctly to maximize query performance.

Table storage is cheaper than using SQL Azure, but you should take into account the time and effort required to convert an existing application from a SQL-based approach to use Table storage. If you are migrating an existing application that uses a SQL database to the cloud, and decide to use Azure Table storage, you will probably need to redesign your data model and rewrite some of the data access code. For more information about using Azure Table storage, see the section "Storing Business Expense Data in Windows Azure Table Storage" in Chapter 5 of the guide "Moving Applications to the Cloud" at http://msdn....aspx#sec6.

SQL Azure

If you need a high-performance data service that fully supports SQL-based operations, you can use SQL Azure (which is part of the overall Windows Azure application environment) to create databases in the cloud that offer almost all the functionality of a local SQL Server installation. SQL Azure resolves the conversion problem described above, and will provide better performance for complex and ad-hoc data querying.

SQL Azure is implemented by SQL Server instances installed in Microsoft datacenters, and it delivers data to the application using the familiar SQL Server Tabular Data Stream (TDS) protocol. This allows you to use the same .NET data providers (such as System.Data.SqlClient) to connect to the database, T-SQL to access and manipulate the data, and familiar SQL Server tools to administer the databases. SQL Azure is also fully compatible with existing connectivity APIs, such as Entity Framework (EF) and (to some extent) Open Database Connectivity (ODBC).

However, there are some limitations in the way you can use the databases; certain concepts (such as server-level controls or physical file management) do not apply in an auto-managed environment such as SQL Azure. Foreign keys and standard query operations such as joins between tables are supported within a database instance, but (like other database systems) they are not implemented across different databases. Transactions are supported at database level using the T-SQL commands, but, due to the physical separation of data across multiple database instances within the Microsoft datacenters, you cannot use distributed transaction mechanisms such as the Microsoft Distributed Transaction Coordinator (DTC) across different databases. You must implement foreign key constraints in your application code or queries when working with multiple databases, and write code in your application to perform client-side data integration for data extracted from multiple databases.
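The Table storage data model described earlier in this section (entities with a key plus a flexible bag of named, typed properties) can be illustrated with a minimal in-memory model. This is a sketch of the concept only, not the Azure Table service API, and the entity names are hypothetical.

```python
# Minimal in-memory model of the Table storage entity concept:
# each entity has a key plus named, typed property values, and
# entities in the same table need not share the same set of properties.
class Entity:
    def __init__(self, key, **properties):
        self.key = key
        self.properties = properties   # property name -> typed value

table = {}  # stand-in for an Azure Table, indexed by the entity key

def insert(entity):
    table[entity.key] = entity

# Two entities with different "shapes" can live in the same table,
# which is the schema flexibility the text describes.
insert(Entity("order-001", Total=19.99, Status="Accepted"))
insert(Entity("order-002", Total=5.00, Shipped=False))
```

This flexibility is also why migrating from a SQL-based design usually means rethinking the data model: there are no joins or foreign keys between such tables, so relationships must be handled in application code.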

In addition, you cannot control physical file allocation, and, at the time of writing, you cannot use managed code (CLR) stored procedures. You must implement compensation (such as the equivalent commit and rollback functionality) in your application code or queries when working with multiple databases.

You can manage SQL Azure databases through the Azure web portal, and by using familiar tools such as SQL Server Management Studio, the Visual Studio 2010 database tools, and a range of other tools for activities such as moving and migrating data, as well as command line tools for deployment and administration.

Alternative Approaches for Data Storage

SQL Azure automatically provides a robust and reliable mechanism by deploying the data in multiple locations and mirroring the content. If you need to use a different database mechanism or custom data store implementation, you may be able to run these in the cloud using the Windows Azure Virtual Machine (VM) Role. However, at the time of writing, the VM Role is stateless, and using a stateless VM Role makes it difficult to run any database or data store that requires the server to maintain state. It also means that you must manage the number of instances and data mirroring yourself. The section "Applications and Data Services that cannot be Easily Refactored" later in this chapter provides more information about VM Roles.

An alternative is to locate the data store on-premises or in another remote managed location. Accessing data held on-premises from a cloud-hosted application is not usually the best approach due to the inherent network latency and reliability of the Internet. If you decide to follow this approach, you can use Azure Connect to connect to a database running on-premises, but this requires the connection agent to run on the database server. This has several disadvantages that may reduce application efficiency and throughput: it introduces latency into the connection, and Azure Connect relies on network availability and other services such as DNS. You should consider implementing a robust caching mechanism, such as Azure Cache, in the application running in the cloud to maximize performance and reliability and to minimize the impact of network issues.

You must also consider how you will replicate and synchronize data that you maintain on-premises and in the cloud, how you will connect to the data store and communicate across the network, and how you will manage intermittent connection failures due to network issues and other factors when connecting to and using SQL Azure or other databases. The section "Implementing Reliable Connectivity" earlier in this chapter describes how you can use a ready-built solution such as the Enterprise Library Transient Fault Handling Application Block to simplify your code while optimizing connection reliability. Chapter 8, "Replicating, Distributing, and Synchronizing Data," discusses strategies and techniques for working with data in these scenarios and shows how Trey Research implemented a solution for the Orders application.
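The caching recommendation above can be sketched as a simple cache-aside wrapper: read from the authoritative store when it is reachable, and fall back to the last cached copy when the connection fails. This is an illustrative sketch only, not the Azure Cache API, and the class and parameter names are hypothetical.

```python
# Sketch of a cache-aside wrapper for a remote data store: serve stale
# cached data when the network call fails, rather than failing the request.
# Illustrative only; not the Azure Cache (or any real caching) API.
class CachedStore:
    def __init__(self, fetch_from_store):
        # fetch_from_store: callable(key) that may raise ConnectionError
        # when the remote store (e.g. an on-premises database) is unreachable.
        self._fetch = fetch_from_store
        self._cache = {}

    def get(self, key):
        try:
            value = self._fetch(key)      # prefer the authoritative copy
            self._cache[key] = value      # refresh the cache on success
            return value
        except ConnectionError:
            if key in self._cache:
                return self._cache[key]   # serve the cached copy instead
            raise                         # nothing cached: surface the failure
```

A production design would also add entry expiry and a limit on how stale served data may be; those policies depend on the application's tolerance for out-of-date data.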

To overcome these kinds of issues, you should consider designing the architecture for your final hybrid application implementation to use reliable communication techniques that can cope with latency and transient failures, and locate data and services as close as possible to the applications that use them. This means that you should consider pre-caching data in the application running in the cloud to provide optimum performance.

Another approach is to build a service that runs in the same local network as the data store, exposes the data, and accepts queries. You can use Service Bus to connect to this service from a cloud-hosted application using either the relayed (Service Bus Relay) or brokered (Service Bus Messaging) technology. Service Bus Relay is likely to be the easiest to implement but, as with Azure Connect, it relies on continuous connectivity. A brokered, message-based solution using Service Bus will provide better reliability and responsiveness, but introduces additional complexity and works best when the process executes asynchronously.

Staged and One-Step Migration of Application Services

If you are converting an existing application into a hybrid application that takes advantage of the cloud, you may be tempted to consider a staged approach where you move applications and services one at a time to the cloud. While this seems to be an attractive option that allows you to confirm operation at each of the intermediate stages, it may not be the best approach. Staged or partial migration of existing on-premises applications to Windows Azure hybrid applications is not straightforward, and can require considerable redesign to maintain security and performance when communication channels cross the Internet.

For example, you may consider moving the web applications into cloud-based Windows Azure Web Roles and using a connectivity solution such as Azure Connect to allow the applications to access on-premises database servers. However, Azure Connect is not an ideal solution for reliable connectivity in this scenario, as it depends on constant availability of applications at both ends of the connection and accurate DNS resolution. It also requires you to install the Azure Connect software on the on-premises servers, which may violate company security policies. This approach introduces latency that will have an impact on the web application responsiveness, and will require some kind of caching solution in the cloud to overcome this. It also leaves the application open to problems if connectivity should be disrupted.

Another typical design that you may consider applying when performing a staged migration is to use Service Bus Relay to allow cloud-based applications to access on-premises services that have not yet moved to the cloud. As with Azure Connect, Service Bus Relay relies on constant connectivity, and application performance will suffer from the increased latency and transient connection failures that are typical on the Internet. Azure Connect and Service Bus Relay are ideal solutions for scenarios such as monitoring, management, and noncore service access, where you do not need to rely on continuous connectivity and low latency.

You may also need to redesign the authentication and authorization features of parts of the application. For example, if you need to move from on-premises Active Directory authentication to Windows Azure Access Control Service authentication, you must update your cloud-based applications and services to accommodate this change. In many cases, you may need to redesign the connectivity and communication features of your application from the ground up to suit a hybrid environment.

Only if your existing application already follows Service-Oriented Architecture (SOA) principles and uses asynchronous message-based communication is a staged or partial migration likely to be advantageous and successful. Such applications exchange data through service calls and messaging, and while the SOA design makes it easier to deploy parts to other locations without requiring complete redesign of the architecture, you must still consider how data interchange will work when the network is likely to be less reliable and have less bandwidth. You may be able to replace your existing message connectivity mechanism with Service Bus brokered messaging, without needing to redesign the overall architecture, so that the cloud-based parts can maintain an appropriate level of connectivity. Alternatively, if you are already using a technology such as Microsoft BizTalk for interconnectivity, you may be able to use the cloud integration features it provides to extend connectivity to the cloud.

However, if your existing application is not SOA based, you should consider redesigning it, perhaps in its entirety, so that it is optimized for hybrid deployment. This allows you to review how best to segregate functionality into separate parts that run on-premises and in cloud-hosted roles. Alternatively, if the application is already suitably partitioned, the optimal solution is likely to be a redesign of just the connectivity features, based on services such as Service Bus brokered messaging or another messaging mechanism.

Communicating with Cloud-hosted Services and External Partners

Applications running entirely on-premises can take advantage of the reliable and fast network connections that are typically available within your own datacenter. It's easy to take for granted the availability of fast, reliable network connectivity when your applications are confined to your own datacenter; in this environment, you can assume that you have sufficient bandwidth available to handle the volume of requests. However, once you distribute the parts of your application across remote locations and link them together over the Internet, you can no longer make the same assumptions. When you move applications to the cloud you must realize that network throughput and latency will have a much greater effect on the performance of your applications, and that distributing the application exposes the communication channels outside your network security boundary.

Chapter 6, "Implementing Cross-Boundary Communication," describes the communication and connectivity challenges that arise due to remote deployment, and the technologies available in Windows Azure that can help you to implement a robust and secure design. The following sections in this chapter cover some of the more general topics concerned with communication and interaction between the remotely located parts of your application.

Message-based Communication Patterns and Techniques

SOA designs promote the use of messaging as a means for communicating between applications and services. The high-level approach is technology-agnostic and may be implemented by using a variety of communications mechanisms.

Messaging may be implemented using web services and other middleware components; examples for the Microsoft platform are Microsoft Message Queue (MSMQ) and Microsoft BizTalk. There are also toolkits and frameworks available that make implementing communication between components of an application easier, and many architects have documented their experiences through a selection of well-known patterns that you can adapt and apply to your own systems.

Relayed and Brokered Messaging

Windows Azure provides comprehensive messaging capabilities within the Service Bus framework. Service Bus supports relayed and brokered messaging, and can be used to implement well-known messaging patterns. Within the context of Service Bus, both service calls (relayed messaging) and sending messages over a channel (brokered messaging) are considered to be "messaging". In Windows Azure, these two approaches are referred to as Service Bus Relay and Service Bus Queues. The following describes the differences between the two messaging types in more detail:

- Relayed messaging is a framework for implementing service-oriented solutions. It is based on Windows Communication Foundation (WCF) and uses WCF relay bindings to define endpoints. It supports the common WS-* standards and the SOAP and REST protocols. The service-oriented style implemented by relayed messaging depends on an active connection between applications sending messages and the services receiving them; it requires both the sender and receiver to be available when the service call is made, although the call can be made asynchronously.

- Brokered messaging is a framework for posting messages through Service Bus Queues and Topics, and does not require an active connection between the sender posting a message and the receiver acting as the destination for the message. Messages are buffered pending delivery in the Service Bus, so the sender and receiver can run at different times. It is based on queuing technologies, and supports asynchronous operations, multiple senders and receivers (subscribers), and a range of operations for sending and receiving queued messages. Service Bus implements both basic end-to-end queues and a mechanism called Topics that supports simple message routing through rules; Service Bus Topics are special implementations of Service Bus Queues.

Simply put, you should use relayed messaging for implementing a solution based on connected services, and brokered messaging for posting messages from an application through a queue to a disconnected receiver.

Service Bus messaging technologies and techniques are described in detail in Chapter 6, "Implementing Cross-Boundary Communication," of this guide. For more information about the two types of Service Bus messaging, see "Relayed and Brokered Messaging" on MSDN at http://msdn....aspx. A comprehensive example that uses BizTalk Server and Service Bus in a Windows Azure application is available on the Windows Azure website; see "Hybrid Reference Implementation Using BizTalk Server, Windows Azure, Service Bus and SQL Azure" at http://msdn....aspx.

Messaging Endpoint Naming and Registry

If you implement Service Bus relayed messaging in your applications, you will typically store the details of the endpoints to which clients will connect in your application configuration files. However, you may want to expose endpoints in a way that makes them discoverable, so that other clients and applications can find these endpoints and obtain the metadata required to compose messages or service calls in the required format.

Windows Azure Service Bus provides a service registry that can be used to publish the endpoint references within a service namespace. Clients can navigate to the base address of the service namespace over HTTP using a web browser to obtain an Atom feed that contains a list of endpoints as a navigable tree. New endpoints that you register automatically become visible in this tree. To make an endpoint discoverable you simply associate the ServiceRegistrySettings behavior with the WCF endpoint and set its DiscoveryMode property to Public. For more information about the Windows Azure Service Bus naming and metadata registry, see "Naming and Registry" at http://msdn.microsoft.com/en-us/library/hh367517.aspx.

Common Messaging Patterns

Messaging patterns are designed to help you understand how you can implement solutions to common communication requirements. In general, messaging implementations fall into one of these three common types:

- One-way or Fire and Forget patterns. In these patterns, the sender sends a request to a single receiver and does not expect a response or reply. Typically the sender will execute asynchronously, as there is no need to wait for an acknowledgement or reply.

- Request-Reply or Request-Response patterns. In these patterns, the sender expects a reply from the receiver after sending the message. The reply may be simply an acknowledgement of receipt, or it may contain data requested by the sender. The code may execute synchronously in these patterns and wait for a reply; however, it is possible to provide a callback for the reply and execute the send operation asynchronously if required.

- Broadcast and Multiple Response patterns. In these patterns, there are multiple senders or receivers, or both. The implementation may comprise a single sender broadcasting a message to multiple receivers, which may or may not return responses, or multiple senders sending to a single receiver that aggregates data. The send operations may be executed asynchronously, or synchronously if the code must wait for a response.

Based on these three simple patterns, it is possible to create almost any message interchange structure. Where possible you should consider using asynchronous processing to minimize the effects of network latency, especially if there is no expectation of a response, unless the code cannot continue until a response is received.
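The Request-Reply pattern above can be sketched as follows. This is an illustrative Python sketch, not the Service Bus API: in-memory queues stand in for the request and reply queues, and the function names are hypothetical. The key idea is correlation, where the receiver copies the request's message ID into the reply so the sender can match the two.

```python
# Sketch of the Request-Reply pattern over a pair of queues, with a
# correlation ID linking each reply to its request. Illustrative only.
import itertools
from collections import deque

request_queue, reply_queue = deque(), deque()
_next_id = itertools.count(1)

def send_request(body):
    """Sender: post a request and return its ID for later correlation."""
    message_id = next(_next_id)
    request_queue.append({"id": message_id, "body": body})
    return message_id

def serve_one_request(handler):
    """Receiver: process one request and post a correlated reply."""
    message = request_queue.popleft()
    reply_queue.append({"correlation_id": message["id"],
                        "body": handler(message["body"])})

def receive_reply(expected_id):
    """Sender: fetch the reply whose correlation ID matches the request."""
    for reply in list(reply_queue):
        if reply["correlation_id"] == expected_id:
            reply_queue.remove(reply)
            return reply["body"]
    return None   # no matching reply yet; the sender can poll or wait
```

In Service Bus terms, the correlation field corresponds to copying the message ID into the CorrelationId property of the response, as described later in this chapter.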

As an example of the common interaction patterns for calling services or sending messages, Figure 2 shows some ways that you might implement communication with external partners in your application.

Figure 2
Common messaging patterns for distributed applications

For more information about common messaging patterns and the ways that you might consider tackling errors that can arise, see "Messaging Patterns in Service-Oriented Architecture" on MSDN at http://msdn....aspx, and "Chapter 18: Communication and Messaging" of the Microsoft Application Architecture Guide at http://msdn....aspx.

Implementing Reliable Messaging

While messaging solutions such as Service Bus are inherently scalable and reliable, there is no guarantee that an application that uses messaging will always work as expected. An application connects to Service Bus across the Internet, and this network is outside the scope of your management domain; when communicating over the Internet, you have little control over how messages are routed or the reliability of connections as messages traverse the network. Your messages may come under attack, be misrouted and appear in an unpredictable order, or just get lost. Even if the messages are sent and received, they may be invalid or have become corrupted (often called poison messages). Finally, there is always the possibility that the receiving application may fail and not return an expected response.

No matter how quickly application code executes, it cannot make up for delays caused by high-latency and limited-throughput networks; asynchronous messaging is extremely beneficial to application performance.

Your messaging implementation must therefore manage a variety of fault scenarios that it might encounter when posting a message to a queue or when receiving a message from a queue. In addition, messaging implementations must be able to gracefully handle duplicated messages and messages that arrive out of order. You must also consider how you will manage intermittent connection failures due to network issues and other factors when sending messages to queues or receiving messages. The section "Implementing Reliable Connectivity" earlier in this chapter describes how you can use a ready-built solution such as the Enterprise Library Transient Fault Handling Application Block to simplify your code while optimizing connection reliability.

Service Bus provides features that help to implement reliable messaging, but you must design and build your messaging solutions so that they take advantage of these features. Chapter 6, "Implementing Cross-Boundary Communication," and Chapter 7, "Implementing Business Logic and Message Routing across Boundaries," describe in more detail how you can use Service Bus Queues and Topics.

Reliable End-to-End Communication

In an architecture that depends on messaging to pass information between parts of the application, it is vital that the messaging infrastructure and the code that uses it provide a reliable end-to-end communication mechanism that surfaces errors if messages are lost. The solution must be able to detect if a message does not arrive or has not been processed by the recipient. To achieve this you can use a send/acknowledge protocol between the sender and receiver (similar to the way that WS-ReliableMessaging works). The sender posts a message to a Queue or a Topic and, if it does not receive an acknowledgment within a given time period, it tries again by reposting the same message. To maximize the chance of success, the sender may wait for an exponentially increasing interval between retries, and generate an error after a fixed number of attempts if no reply has been received.

Service Bus has built-in features that assist in implementing a send/acknowledge protocol. The sender can set the ReplyTo property of a message to indicate to the receiver the queue on which the sender is listening and expecting a reply. The receiver can copy the original message ID into the CorrelationId property of the response message, which allows the sender to match it to the original message.

Managing Repeated and Out of Order Messages

Messaging mechanisms based on queues will usually deliver each message only once, and deliver messages in the order they were posted to the queue. This is the case when using both Service Bus Topics and Service Bus Queues. However, you cannot always guarantee that this will happen. If you repost messages when the expected replies do not arrive, you may have duplicate messages in the queue, and variable network latency and other connectivity issues (especially if you have multiple senders and receivers) may cause messages to arrive in a different order from that expected.

Messaging solutions must be idempotent. This means that they can manage the case where a message may be received more than once, or ensure that processing the same message again will not change the state of the system. The core requirement is that the effect on the application or system is the same irrespective of how many times a message is received, or the order in which the messages arrive.

You can (and should) consider implementing a mechanism in receivers that prevents duplicated messages from being processed. Service Bus also enables you to take advantage of the properties of Queues and Topics to prevent duplicate messages from being delivered: you can set properties on Queues and Topics that instruct Service Bus to automatically remove any messages posted to the queue within a specified period that have the same message ID as a previous message.
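The receiver-side duplicate-suppression idea described above can be sketched as follows. This is an illustrative sketch, not the Service Bus duplicate-detection feature: the receiver simply remembers the message IDs it has already processed and ignores repeats, making the processing idempotent in the at-most-once sense.

```python
# Sketch of an idempotent receiver: remember processed message IDs and
# skip duplicate deliveries. Illustrative only; in a real system the set
# of seen IDs would be persisted and pruned after the dedup window.
processed_ids = set()
results = []   # stand-in for the effect of processing a message

def handle_message(message_id, body):
    """Process a message at most once, even if it is delivered again.

    Returns True if the message was processed, False if it was a
    duplicate and was ignored."""
    if message_id in processed_ids:
        return False                 # duplicate delivery: ignore it
    processed_ids.add(message_id)
    results.append(body)             # stand-in for the real work
    return True
```

With this in place, a sender that reposts a message because an acknowledgment was lost does not cause the work to be performed twice, which is the idempotency requirement stated above.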

If your messaging solution requires messages to be processed in a specific order, you must add this functionality yourself; it is not implemented directly by Service Bus. The simplest approach is to add a property to each message that contains an incrementing number, and write code in the receiver that examines this property, buffers messages that arrive out of sequence, and then processes them in the correct order.

Managing Invalid or Poison Messages

The default behavior for a Service Bus Queue client is to remove a message from a queue using the PeekLock method. This ensures that a message remains in the queue should the receiver fail. A receiver must explicitly call the Complete method to indicate that the message has been successfully retrieved and processed; if this method is not invoked after a configurable period of time, the receiver is assumed to have failed and the message reappears on the queue. However, this approach can cause an application to block if none of the receivers that subscribe to the queue can successfully process the message. This may be the case if the message is invalid or has been corrupted during transit.

One of the options for resolving this issue is to set an appropriate value for the DefaultMessageTimeToLive property of the queue. After this period, the message will expire and be removed from the queue. Alternatively, you can use the MaxDeliveryCount property to specify the maximum number of times a message can be delivered before it is automatically removed from the queue.

The alternative approach, using the ReceiveAndDelete method, means that if something goes wrong after the message has been removed from the queue, the message is lost. If you use the ReceiveAndDelete method, you must consider how you will manage messages should this occur.

Dead Letter Queues

If you need to monitor your solution for messages that were not delivered correctly, you can use a dead-letter queue. When the EnableDeadLetteringOnMessageExpiration property of a queue is true, messages that are removed from the queue when they expire are placed in a related sub-queue named $DeadLetterQueue. For example, if you have a queue named MyQueue, dead-lettered messages are moved into the queue MyQueue/$DeadLetterQueue. You can also force a message to be moved to the dead letter queue by calling the DeadLetter method in a receiver. This is a useful way to collect and discover invalid or poison messages that cannot be processed, and messages that expired before they could be delivered.

Retry Logic and Transactional Messaging

If a messaging error occurs, such as an expected reply not being received, a sender failing when posting a message to a queue, or a receiver generating an error when it retrieves a message from a queue, the typical solution is to retry the process a specified number of times, with either a fixed or an incremental delay between each retry, and then give up (and generate an error or dead-letter the message) if all attempts fail. For example, you may decide to retry a failed send operation after three seconds, again after fifteen seconds, again after one minute, and then give up. This approach is suitable for handling transient faults that may disappear after a short while.
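The resequencing approach described at the start of this section (an incrementing sequence number on each message, with the receiver buffering out-of-sequence arrivals) can be sketched as follows. This is illustrative only; Service Bus does not provide this behavior directly.

```python
# Sketch of a resequencer: each message carries an incrementing sequence
# number, and the receiver buffers out-of-order arrivals until the gap
# is filled, then releases messages in the correct order.
class Resequencer:
    def __init__(self):
        self._buffer = {}      # sequence number -> message body
        self._next = 1         # next sequence number to release
        self.delivered = []    # messages released in the correct order

    def receive(self, seq, body):
        self._buffer[seq] = body
        # Release as many consecutive messages as are now available.
        while self._next in self._buffer:
            self.delivered.append(self._buffer.pop(self._next))
            self._next += 1
```

A real implementation would also bound the buffer and time out on gaps that never fill (for example by dead-lettering or alerting), since a permanently lost message would otherwise stall delivery forever.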

It is vital to design your messaging code in such a way that it can cope with failures that may arise at any point in the entire messaging process, and not just when code fails to send or receive a message. If a send or receive operation is related to another process, such as adding or updating a row in a database, you should consider how you will handle all parts of the process if the send or receive operation should fail, and ensure that a corresponding recovery action or notification occurs. For example, when a customer places an order in the Trey Research Orders application, it must save the order to a database and send a message to a transport partner and an on-premises audit log database through a Service Bus Topic. It is vital that both of these operations succeed.

Unlike Microsoft Message Queuing (MSMQ), Service Bus does not support use of a distributed transaction coordinator (DTC). You cannot enroll a Service Bus send or receive operation in a DTC transaction; Service Bus messaging only provides transactional behavior within the Service Bus framework. At the time of writing this is the case, though future versions of Service Bus may include functionality that allows you to achieve transactional behavior and rollback when posting messages to a queue.

A possible solution is to use a custom implementation that keeps track of the successful operations and can retry any that fail. This is more complex, but may be more successful in countering transient faults and when using asynchronous messaging operations. It is likely to be most useful when there are many operations to perform, or where there is a complex inter-relationship between operations.

Frameworks, components, and documentation are available to help you implement retry logic for Windows Azure messaging, such as "Best Practices for Leveraging Windows Azure Service Bus Brokered Messaging API" at …/2011/09/best-practices-leveraging-windows-azure-service-bus-brokeredmessaging-api/ and the "Transient Fault Handling Framework" at http://code.msdn.

Figure 3 shows a high-level overview of a solution that Trey Research implemented in their Orders application to ensure that orders placed by customers are stored in the database and the delivery request is successfully sent to a transport partner and the audit log.
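Since the operations cannot share a DTC transaction, the bookkeeping has to live in the application. A minimal sketch of the idea follows (the step names and structures are hypothetical, not Trey Research's actual code):

```python
def run_with_tracking(steps, completed=None):
    """Run named steps, skipping any recorded as already completed.
    `steps` maps a step name to a callable that raises on failure.
    Returns (completed, failed) so a later pass retries only the failures."""
    completed = set(completed or ())
    failed = []
    for name, action in steps.items():
        if name in completed:
            continue  # succeeded on an earlier attempt; never repeat it
        try:
            action()
            completed.add(name)
        except Exception:
            failed.append(name)  # leave for the retry pass
    return completed, failed
```

In a real system the completed-step record would be held in durable storage, so that a retry pass can run even after the original process has failed.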

Figure 3. A custom transactional and retry implementation for the Trey Research Orders application

The implementation uses separate tables that store the details of the order and the current status of each order. The status detail rows contain a value that indicates the time at which the entire process should expire if it has not completed successfully. When a customer places an order, the Web Role populates these tables using a database transaction to ensure that they all succeed, and then the Web Role is released to handle other requests. This approach separates the task of saving the order from the tasks required to process the order, which are carried out by a Worker Role. It also means that the Web Role can display information such as the order number to the customer immediately, and the customer will be able to see the current status of all of their orders without needing to wait until the order processing tasks have completed.

A Worker Role then carries out the tasks required to complete the process of notifying the transport partner and sending any required information to the on-premises audit log. It sends each message to a Service Bus Topic, which passes it to the transport partner and audit log. The Worker Role also listens for a response from a transport partner that indicates the message was received. This response will contain information such as the delivery tracking number, which the Worker Role stores in the database. At each stage of the overall process, the Worker Role updates the status rows to keep track of progress. For example, the status details may indicate that an order has been accepted and sent to a transport partner, but that an acknowledgement has not yet been received.

At the same time, a process in a Worker Role supervises the complete set of tasks by looking in the status tables for any orders where all of the processing stages have not completed. If there is an error, it notifies the administrator.

If you need to perform complex processing on messages before posting them to a queue, or handle multiple messages and perhaps combine them into one message, you might consider doing this in the Web Role and then storing the resulting message in the database. This approach can improve performance and reduce the risk of failure for Worker Roles, and it distributes the order processing task across the Worker Roles and Web Roles; however, it has a corresponding negative impact on the Web Roles, with a resulting loss of separation of responsibility.
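The supervising step might be sketched like this (the row layout is hypothetical; the actual tables Trey Research uses are described in Chapter 7):

```python
import time

def find_stalled_orders(status_rows, now=None):
    """Return the ids of orders that are incomplete and past their expiry
    time, so a supervisor can restart them or notify an administrator.
    Each row is a dict: {'order_id', 'complete', 'expires_at'} where
    'expires_at' is a POSIX timestamp set when processing began."""
    now = time.time() if now is None else now
    return [row["order_id"]
            for row in status_rows
            if not row["complete"] and row["expires_at"] <= now]
```

The supervisor runs this query periodically; orders it returns are either restarted or, after too many attempts, escalated.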

For example, when a Worker Role retrieves orders for processing from the Orders database, it sets the expiry time in the status row for each one to five minutes from the current time. The supervising process can use this value to check if it is still safe to wait for the task to complete. If the timeout has been exceeded, which may occur, for example, if a reply is not received from a transport partner due to a transient network failure, a Worker Role will restart the process from the beginning. This may generate duplicate messages; however, the Service Bus Topic is configured to ignore duplicate items, and the receiver also contains code to prevent it processing the same order twice. The status rows also contain a count of the number of times the process has been attempted for each item. If a specified number of retries has been exceeded, the Worker Role can abandon the process and raise an exception, or place the message in a dead letter queue for human intervention.

A full description of how Trey Research implemented this reliable retry mechanism in the Orders application can be found in Chapter 7, "Implementing Business Logic and Message Routing across Boundaries."

The basic and core messaging patterns are simple and easy to understand, but they can be combined to implement any type of complex interaction required. However, consider whether a simpler approach that performs better and is easier to maintain will suffice. Complex message interchange can reduce the overall performance of applications, especially if it cannot be executed asynchronously or processes must wait for replies.

Communicating with External Partners

If your application relies on services exposed by partner organizations or other companies, you must consider how changing from an entirely on-premises architecture to a hybrid architecture will affect the communication between the application and these partners. For example, the Trey Research Orders application described in this guide communicates with transport partners that receive notification of orders placed, collect the goods from Trey Research's premises, and deliver them to the customer. Each transport partner must be able to receive messages from Trey Research, acknowledge the receipt of each message, and then—at a later time when delivery is completed—send a status update message to Trey Research.

The way that you communicate with partners depends on the type of interaction and information transfer required. It may be that your application can simply send information to a partner as a message in a standard format such as XML. In other cases your application may be able to generate and send email messages or SMS text messages, or perform some other partner-specific one-way "fire and forget" action. Alternatively, you may just need to call a web service exposed by the partner, either to pass information to the partner or to send back results.

Typically, partner interaction includes a reply element where the partner sends information back to your application. This may be an immediate response such as confirming receipt. Alternatively, it may be that the partner sends queries or responses to your application that are not immediate replies, such as weekly summaries of information or requests for additional data. In fact, the interaction with a partner may encompass any of the messaging or communication patterns described in the section "Message-based Communication Patterns and Techniques" earlier in this chapter.
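Duplicate suppression on the receiving side, as used to guard against restarted processes, can be sketched as follows (an illustrative in-memory version; a production receiver would record processed ids durably):

```python
class IdempotentReceiver:
    """Process each order message at most once, ignoring redeliveries."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # durable storage in a real implementation

    def receive(self, message):
        """Returns True if processed, False for a detected duplicate."""
        order_id = message["order_id"]
        if order_id in self._seen:
            return False
        self._handler(message)
        self._seen.add(order_id)  # record only after successful processing
        return True
```

Recording the id only after the handler succeeds means a crash mid-processing results in a retry rather than a lost message.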

Designing Partner Communication

It may be that very few changes are required to your existing implementation of partner interaction. If your application already communicates with external partners using web service calls or other messaging techniques, you may be able to use the same approach, and perhaps the same code, in a hybrid version of your application. However, don't assume this is the case; if the parts of the application that take part in the communication will move to the cloud, you are likely to need to review the operations and update them. In many cases this may require only changes to the configuration of endpoints and some minor updates to the code in your application.

Alternatively, if the parts of the application that take part in the communication will remain on-premises, you may discover that no changes are required. However, check that you do have an optimal and efficient solution, and consider whether Windows Azure Service Bus would provide additional benefits, such as removing the requirement to open inbound ports on your corporate network. Even if you already implement a mechanism such as Microsoft BizTalk or another messaging solution, or you communicate using web service calls or some other direct protocol, it is a good idea to review how you communicate with partners, because the capabilities of Windows Azure Service Bus (available after you subscribe to Windows Azure) are useful in all cases.

You can use Service Bus Queues and Topics to implement reliable message transfer between any on-premises, cloud-hosted, or partner components of your application, and use Service Bus Relay to allow cloud-hosted and partner components to make web service calls into your on-premises services. Service Bus provides a mechanism for exposing secure endpoints to which both your own application and partner applications can connect to send messages or make web service calls, without requiring you to open ports on your own network or configure network address translation. Service Bus also provides reliable load-leveling communication that can cope with transient network failures and the variable network capacity that may be available at partner locations.

For example, in Trey Research's fully on-premises application, communication with transport partners was through web service calls. Some of the calls originated from within Trey Research's network, but others required Trey Research to expose an on-premises web service that partners called to update the delivery status. After moving to the cloud, Trey Research decided to use Service Bus to send messages between the cloud-hosted application and transport partners, removing the need to open ports on their corporate network. Figure 4 shows how the Orders application sends a delivery request to a transport partner and receives a reply, and then, later, receives a message that indicates delivery was completed.

Figure 4. Communication between the Trey Research Orders application and transport partners

Interacting with Partner-specific Services

When you design the architecture for a hybrid application, the only limitations you have on communication between your own services and components are those imposed by the platforms and the technologies you choose. You can design appropriate communication interfaces and interact with them as required. However, the same is not true when you need to communicate with external partners and other organizations.

The Trey Research application embodies a typical scenario for communication with external partners. If Trey Research only needed to deal with small transport companies, it may be possible to convince them to install on their own systems the software required to access Trey Research's Service Bus queues to receive and send messages, or even provide them with a pre-configured computer to accomplish this. The developers at Trey Research could implement this software in its entirety so that, for example, it prints the delivery advice notes on a local printer, and workers at the transport partner click a button in the application when an order has been delivered.

However, it is unlikely that Trey Research could impose these kinds of conditions on a national or international delivery partner. These companies typically expose web services that Trey Research would call using data formats defined by the transport partner. These services allow companies such as Trey Research to send requests for collection, receive tracking numbers, and query for delivery status.

In these cases you may be able to find a common protocol and format that both parties can accept, and perform conversion to that standard format using available tools or frameworks. Globally agreed standards exist for many industries and types of information, such as EDIFACT, RosettaNet, cXML, PIDX, and BASDA. Organizations such as RosettaNet publish information and provide tools to help convert between formats, and adapters are available for Microsoft BizTalk for many of the well-known protocols and formats. See the end of this chapter for links to more information.

Phoning an international shipping company such as UPS or DHL and asking them to install your software on their servers is unlikely to help you achieve a solution. If you want to use their services, you must follow their standards.

Connectors and Adapters

The fundamental disconnect between different types of partner services offers Trey Research two choices over how they could implement integration with such partners. Trey Research could create a custom mechanism for each transport partner that converts the requirements of the Orders application into the relevant sequence of web service calls, but this would cause close coupling between the application and the partners, which makes it difficult to manage changes such as adding or removing partners. It's very likely that the partners you use will change over the lifetime of the application, and you should design your implementation so that this change can be accommodated as easily as possible without disrupting existing code and systems.

Instead, Trey Research decided that a better alternative was to define a standard approach for connecting to the Service Bus queues, and then use the code that implements this connectivity within partner-specific wrappers. This approach makes it easy to concentrate the logic required to access Service Bus queues in one place, which is especially important if the interaction sequence is complex or there are many queues to connect to (such as when the application is deployed in multiple datacenters). It uses configuration files that can be updated when the number or location of queue endpoints changes.

Where possible, transport partners will run the connector component on their own network and systems, and use it within an application that exposes the messages in the required form. For example, they may use a Windows Forms application that displays and prints the delivery requests, and has buttons to indicate delivery is complete. Trey Research could provide this application as well as the connector component, though it is more likely that the partner would use the connector component in their own implementation of an application directly focused on their specific delivery fulfillment process.

For partners with whom Trey Research must interact using that partner's own standards, the developers at Trey Research created adapters that translate the messages received from and sent to Service Bus queues into the required web service calls. In this case, Trey Research must run the connectivity component and the adapter within its own data center, and the web service calls to the partner go out over the head office network.

This provides the necessary level of decoupling between the Orders application and external partners. When Trey Research needs to change the partners it uses, the developers only have to modify or build a new adapter for the partner. When you must interact with external partners, consider how you can decouple this interaction. Figure 5 shows a schematic view of Trey Research's implementation.

Chapter 6, "Implementing Cross-Boundary Communication," and Chapter 7, "Implementing Business Logic and Message Routing across Boundaries," describe in more detail how Trey Research implemented communication with transport partners using Service Bus Topics and Queues.
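The shape of this connector-plus-adapter arrangement can be sketched as follows (the class and partner names are illustrative; Trey Research's actual implementation is described in Chapters 6 and 7):

```python
class PartnerConnector:
    """Single, uniform entry point the Orders application talks to; each
    partner-specific adapter translates the standard message as needed."""

    def __init__(self):
        self._adapters = {}

    def register(self, partner, adapter):
        # adding or replacing a partner means only registering a new adapter
        self._adapters[partner] = adapter

    def send_delivery_request(self, partner, order):
        return self._adapters[partner](order)

def fictitious_partner_adapter(order):
    # reshape the standard order message into this partner's payload format
    return {"ref": order["order_id"], "dest": order["address"]}
```

Because the application only ever calls the connector, swapping a partner touches a single adapter rather than the application code.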

Figure 5. A connector and adapter mechanism for connecting to external partners

The implementation of adapters for partners may not be trivial. For example, transport partners typically only allow your application to query their web services, and they do not provide event-driven status updates (although this is possible by using WS-Eventing protocols). Therefore the adapter may need to perform tasks such as saving state and polling partner services to obtain the required information, and translate this into messages sent back to the application. The important factor is that the Orders application sees a uniform mechanism that applies irrespective of the actual transport partner communication implementation.

Replacing Functionality to Take Advantage of the Cloud

If you are moving an existing application to the cloud, it is likely that there will be some functionality that cannot be implemented in the same way in Windows Azure and SQL Azure. Instead, you may be able to replace this functionality with services and capabilities that are part of Windows Azure. Applications designed from the outset to run in Azure may also require these services and functionality.
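A polling adapter of that kind might be sketched like this (the query function and message shape are hypothetical; a real adapter would persist the saved state between polling passes):

```python
def poll_for_updates(query_status, order_ids, last_known):
    """Poll a query-only partner service and emit a message only when an
    order's status has changed since the previous pass.
    Returns (updates, new_state); new_state is the saved state to use on
    the next polling pass."""
    new_state = dict(last_known)
    updates = []
    for order_id in order_ids:
        status = query_status(order_id)  # call into the partner's service
        if new_state.get(order_id) != status:
            new_state[order_id] = status
            updates.append({"order_id": order_id, "status": status})
    return updates, new_state
```

The emitted update messages are what the adapter sends back to the application, so the application sees event-style notifications even though the partner only supports queries.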

There are unlikely to be limitations on the code you use to create applications and services, because the runtime environment is the .NET Framework. However, if you need to move beyond using managed code, or have other specific requirements that require access to systems and services outside of a Web or Worker role, you will need to look for replacement services offered by Windows Azure.

A typical example occurs when using SQL Azure instead of an on-premises installation of SQL Server or other database system. Typically you will use a reporting solution such as SQL Server Reporting Services to extract management and business information from your on-premises SQL Server; SQL Azure Reporting extends these capabilities to data hosted in SQL Azure. If you locate your data in SQL Azure, you can use the Business Intelligence capabilities of SQL Azure Reporting. This service runs in the cloud and can generate a wide range of reports and charts for viewing in a web browser, in SharePoint, or as RSS feeds, or exported in a range of popular file formats. It includes the Business Intelligence Design Studio (BIDS) to make it easy to create reports. Running a reporting system on your own on-premises network and connecting to SQL Azure databases is still possible, but it is likely to consume large amounts of bandwidth and provide poor performance.

Wide-ranging and up-to-date information is vital in all companies for managing investment, planning resource usage, and monitoring business performance.

You can also take advantage of Azure services to add new capabilities to applications. For example, Windows Azure Marketplace exposes a series of data feeds that you can integrate with business reports to add depth to the information, such as comparison with competitors and market share. For more information see "Windows Azure Marketplace" at https://datamarket.

In addition, Windows Azure makes it easy to implement other features for your cloud-based applications. You can implement claims-based federated authentication and single sign-on through the Access Control Service (ACS); for more information, see Chapter 5, "Authenticating Users and Authorizing Requests." If you deploy your application to more than one datacenter, you may also find the Windows Azure Traffic Manager Service useful in maximizing availability and minimizing response times for users. See the section "Deploying Applications to Multiple Data Centers" later in this chapter for more information.

An Example of Using Windows Azure Services

When Trey Research moved the Orders application to the cloud, it chose to locate the data generated by the application in SQL Azure. Before the move, Trey Research used SQL Server Reporting Services to generate business information from the Orders database. After moving the data to SQL Azure, it made sense to adopt the Business Intelligence capabilities of the SQL Azure Reporting Service. An on-premises Reporting Service application uses SQL Azure Reporting to generate a range of customized business intelligence reports, in common web and Microsoft Office-ready formats, for a range of uses by management applications running inside Trey Research's network domain. For more information see "Business Intelligence" on the Windows Azure and SQL Azure website at http://www.

Trey Research extends the usefulness of the reports by combining the raw data available from SQL Azure Reporting with data streams exposed through Windows Azure Marketplace; see "One-Stop Shop for Premium Data and Applications" at http://datamarket. Trey Research also decided to expose the reports to specific users who access the on-premises Reporting Service over the Internet through Service Bus Relay. For information about using Service Bus Relay to expose information in this way, see Chapter 6, "Implementing Cross-boundary Communication." Figure 6 shows the architecture Trey Research implemented.

Figure 6. Creating business intelligence reports with SQL Azure Reporting Service

For simplicity, the example application for this guide does not include an implementation of SQL Azure Reporting Service. For more information about the SQL Azure Reporting Service, see "Business Analytics" at http://www.windowsazure.

Applications and Data Services that Cannot be Easily Refactored

Sometimes you need to run applications and services as part of your hybrid application, but you cannot refactor or replace these with either a Windows Azure service or a custom implementation. Typical examples of these types of applications and services are those that require a specific non-standard operating system configuration, or use components or other services that must be installed on the server. Additionally, you may have existing third-party or internally developed applications for which you do not have access to the source code, or that you cannot adapt and recompile for legal reasons.

In these cases, you can deploy them in the cloud within a Virtual Machine (VM) Role, as long as the applications can run on Windows Server 2008 R2. You can think about Azure VM Roles as an Infrastructure as a Service (IaaS) mechanism that allows you to run your own server on hosted infrastructure that provides power, connectivity, and datacenter management services. However, consider the time, effort, and bandwidth requirements for creating and deploying entire Windows Server VM images to the datacenter.

You can create a suitable VM Role on a local machine, install all of the applications and services you require, configure it as required, and then export it as a virtual machine image. You then upload this image using the Windows Azure web portal or the Windows Azure Management API. After deploying a VM Role, you can connect to it from an on-premises computer through Azure Connect, in the same way as code running in a Web Role.

There are some limitations for VM Role-hosted servers. At the time of writing, VM Roles are stateless, so they are not suitable for running applications such as SharePoint, SQL Server, and Terminal Server that require the server to maintain state after a failure. In addition, the VM Role does not allow the use of UDP protocols for communication, which prevents applications such as Microsoft Active Directory (on a Domain Controller) from being used. For more information, see "Windows Azure Virtual Machine Role" on the Windows Azure and SQL Azure website at http://www. and "Virtual Machine Role" on MSDN at http://msdn.

An Example of a VM Role Scenario

As a manufacturer of technical and electronic equipment, Trey Research uses a special third-party monitoring application that ensures compliance with government regulations and export restrictions on technical products; a government department specifies the processes it must follow and certifies the operation. The source code for this application is confidential and not available.

Trey Research decided to deploy this application to the cloud in a VM Role in the same datacenter as the Orders application, and to configure the connection string so that the application reads its input data from the Orders database located in SQL Azure. The compliance application communicates with the Orders database using a standard connection string, and executes queries to measure compliance on a pre-determined schedule. Locating the application in the same datacenter as the data minimizes on-premises network bandwidth requirements and avoids reliance on Azure Connect for data connectivity. The compliance application can make calls, through HTTP and HTTPS, to the remote service that provides the updated certification data it uses.

In addition, the application generates compliance reports on a regular basis, which it saves in a SQL Server database running on an on-premises server through an Azure Connect channel; Trey Research uses the same channel, together with Remote Desktop, to manage and monitor the server running in the VM Role. Figure 7 shows this scenario.

Figure 7. Using a VM Role to host a third-party application

Deploying Applications to Multiple Data Centers

Windows Azure provides a robust, reliable, and scalable infrastructure and execution environment. By managing the number of instances of each role in your applications, you can maximize availability and minimize response times. You can also choose which datacenter you deploy your application to, so that network latency and transient connection issues are minimized. However, unless all of your visitors and customers are located in a single geographical area, you may consider deploying your application to more than one datacenter so that there are instances located closer to more of your visitors.

Deploying applications and data services to multiple datacenters is an approach that is applicable to all types of cloud-based and on-premises applications, not just hybrid applications, and can provide better performance and resilience where requests are geographically widespread. It can minimize network latency and transient connection issues, make the application appear more responsive to users, and provide resiliency and backup if a major disaster should affect one location. However, it also raises the following issues that you must consider if you decide to deploy to multiple datacenters:

- Directing visitors to the appropriate datacenter. Each deployment in a different datacenter will have a different namespace name, and hence a different URL. You can use a DNS server to distribute requests to these deployments on a round-robin basis.

  You can use DNS records to perform this type of redirection, though the delay while a DNS change propagates may cause problems; alternatively, you can write custom code to distribute requests in a specific way. Another alternative is to use Windows Azure Traffic Manager, which can do this for you automatically, as well as monitoring for failed installations and taking response times into account. This second approach may provide better performance.

- Managing deployed instance counts. You may want to minimize runtime costs by adjusting the number of running instances in each datacenter based on traffic demand, perhaps based on some automated scheduler if your application is most active at specific times of day. You can do this using code that accesses the Windows Azure Management API, but a better solution is to take advantage of an automated demand-responsive solution such as the Enterprise Library Auto-Scaling Application Block. See Chapter 9, "Maximizing Scalability, Availability, and Performance," for more information.

- Providing backup or alternative versions of applications. You may decide to deploy to multiple datacenters, but have only some of these deployments running at any time, with the others available as backups, or as different versions of the application for use at specific times or when changes or failures in other systems make this necessary. See Chapter 10, "Monitoring and Managing Hybrid Applications," for more information.

- Using claims-based authentication. If you use ACS as the identity provider and STS for your applications, you must consider whether to configure it in each datacenter, or to use just a single instance in one datacenter. Be aware that, to protect user privacy, the unique user identifier you get back from ACS is dependent on the instance of ACS that performed the authentication: the user IDs generated by different instances of ACS are different. If you configure an ACS instance in each datacenter, your authentication code and user store must be able to relate more than one user ID to a user, so that authenticating through any of the ACS instances you configure provides the correct result. Ultimately, the response times depend on the performance of the identity providers (such as Windows Live ID or Google) that you configure.

- Managing Service Bus endpoints and subscriptions. Like Azure service namespaces, Service Bus namespaces are globally unique. This means that the deployments in different datacenters must use a different set of endpoint identifiers, and therefore each deployment will need to use different configurations for queue endpoints and subscriptions. Where multiple deployments connect to a single instance of an application or service (such as an on-premises service or a partner application), these applications and services must be configurable and able to connect to the endpoints for multiple datacenter deployments. You may instead decide to use only one set of endpoints configured in a single datacenter (or fewer sets of endpoints than there are datacenters), but this will affect the performance of applications running in datacenters that do not have local endpoints, and will also reduce their resilience to network failures.

- Locating data deployments. If you deploy an application to more than one datacenter, you must consider how this affects any data you deploy in SQL Azure. Typically you will deploy data to SQL Azure in the same datacenter as the application to minimize latency and connection issues. The first of these approaches provides resilience and may also offer better performance, but multiple deployments of SQL Azure will increase runtime costs. You may instead consider using one deployment, or a number less than the number of application deployments, with a mechanism such as Azure Cache to compensate for the increased network latency; however, even with caching in place this may reduce performance, as some data may still be accessed in real time across the Internet. See Chapter 9, "Maximizing Scalability, Availability, and Performance," for more information.

- Synchronizing data between deployments. In addition to deciding where to locate data, you must also decide how and when to replicate or synchronize it between multiple SQL Azure deployments in different datacenters, if you decide on that approach. You may need to replicate data from a master on-premises or cloud-hosted database out to all cloud-deployed instances, or just between cloud-deployed instances, or bi-directionally to multiple instances. You may also need to manage replication to backup copies, test copies, or standby copies of the data. It mainly depends on the nature of the data, how it's updated, and where. For more information, see Chapter 8, "Replicating, Distributing, and Synchronizing Data."

- Partitioning data created by the application. If you create or modify data within the application, and that data includes unique identifiers, create these in a way that ensures they are unique across all instances of the application in different datacenters. For example, if you insert rows into a local SQL Azure database or send messages to another service, ensure that you add a datacenter-specific ID or name, or use a GUID, so that the full IDs are guaranteed to be different in each datacenter.

- Application configuration per datacenter. The application deployment in each datacenter must have its own configuration settings, some of which are specific to that datacenter. This includes factors such as data store connection strings, Service Bus endpoints, Azure storage URIs, ACS authentication settings, data synchronization settings, SQL Azure Reporting configuration, and more. You must design the cloud-hosted and on-premises parts of your applications to be easily configurable, with a configuration mechanism that works correctly with a variable number of datacenter deployments, as the number may change over time.

An Example of a Multiple Datacenter Deployment

Trey Research chose to build its Orders application so that it can be deployed to multiple datacenters. This means that each part must be able to handle connections and interoperability with multiple deployments in different datacenters.

To maintain performance and availability, Trey Research decided to deploy the data used by the application to multiple SQL Azure installations, with one set in each datacenter where the application is deployed. Synchronizing the Orders data across all SQL Azure deployments means that customers will see the latest data (albeit after the replication interval) even if they are redirected to a different datacenter due to an infrastructure, application, or network failure. However, keep in mind that this kind of deployment can reduce the resilience of applications running in datacenters that do not have a local SQL Azure installation.

Trey Research also decided to use multiple sets of Service Bus endpoints, one in each datacenter where the application is deployed. However, this means that all of the on-premises and partner code must be configurable and able to connect to multiple endpoints. For example, the code that transport partners use to receive delivery instructions must listen on the queues from every datacenter, and send delivery confirmations back to the datacenter that originated the notification. This is achieved by passing the target Service Bus namespace and queue name with the delivery notification, so that the code can identify the datacenter to which it must send the delivery confirmation.

Figure 8 shows how Trey Research uses multiple sets of Service Bus Queues to communicate with transport partners, and how it configured SQL Azure Data Sync to replicate and synchronize the data across datacenters and the on-premises master copy of the customers and products data.
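The reply-address technique of carrying the originating namespace and queue inside the message can be sketched as follows (the field names and namespace values are illustrative, not Trey Research's actual identifiers):

```python
def make_delivery_notification(order_id, namespace, queue):
    """Tag each outgoing notification with the Service Bus namespace and
    queue of the datacenter that sent it."""
    return {"order_id": order_id,
            "reply_namespace": namespace,  # the sender's own namespace
            "reply_queue": queue}

def confirmation_target(notification):
    """Partner code reads the reply address from the message itself rather
    than assuming which datacenter originated the notification."""
    return notification["reply_namespace"], notification["reply_queue"]
```

Because the routing information travels with the message, adding or removing a datacenter requires no change to the partner-side code.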

Figure 8
Trey Research deployed the Orders application data to multiple datacenters, and configured SQL Azure Data Sync to replicate and synchronize the data across datacenters and with the on-premises master copy of the customers and products data.

Data synchronization ensures that the SQL Azure-deployed instances of the Orders database contain all of the orders, so that visitors can see all of their orders irrespective of the datacenter to which they are routed.

In contrast, to avoid complexity, Trey Research decided to use only a single ACS instance, located in the "US North" datacenter. The majority of visitors are likely to authenticate using Windows Live ID, Google, or Facebook accounts that are validated in US datacenters, and the efficiency of the authentication process will depend to a large extent on the response time of the authentication providers. The choice of a single ACS instance also simplifies the authentication configuration and the code required to map visitors to their stored account information.

Trey Research also decided that it needs only a single instance of the third-party compliance application running in the cloud. The application talks directly and asynchronously to the Orders database (it does not take part in any other process such as placing an order), and the process runs only occasionally and is not time-critical. This approach reduces cost and complexity, and avoids the need to manage multiple concurrent updates to the on-premises compliance report files.

Because the Orders database content is replicated across all instances, SQL Azure Reporting needs to connect to only one of the SQL Azure deployments to generate business intelligence reports. This also simplifies deployment and management, and reduces runtime costs.

Figure 9 shows the way that Trey Research deployed services to multiple datacenters.

Figure 9
How Trey Research deployed the Orders application to multiple datacenters

For more information about the way that Trey Research created and deployed the Orders application, see the following chapters of this guide:

- For information about using ACS for authenticating visitors, see Chapter 5, "Authenticating Users and Authorizing Requests."
- For information about using Service Bus and Azure Connect, see Chapter 6, "Implementing Cross-boundary Communication."
- For information about synchronizing data across multiple locations, see Chapter 8, "Replicating, Distributing, and Synchronizing Data."
- For information about using Traffic Manager, see Chapter 9, "Maximizing Scalability, Availability, and Performance."

Summary

This chapter covers many different topics related to both hybrid and non-hybrid applications. However, it concentrates on architectural, design, and deployment scenarios related to building applications where some parts run on-premises, some parts run in the cloud, and some parts are implemented by or for external partners. The aim of this chapter is to look deeper at the topics and challenges identified in previous chapters, and show in more detail how they can be addressed in different scenarios by referring to the fictitious Trey Research scenario.

Most of the topics in this chapter concern deployment challenges. This raises questions around many areas of the design and implementation, such as factoring the functionality for the cloud and implementing solutions for applications and services that cannot easily be moved to or located in the cloud. In particular, related to this are challenges centered on communication between parts of the application and external partners, including some concerned with authentication and configuration that may not be immediately obvious. You must find ways to make this work successfully when you do not have control of the interfaces, as is the case when communicating with the services exposed by large corporations. Finally, this chapter explores the significant issues and challenges related to deploying an application and its data to more than one datacenter.

In the next chapter, you begin a much deeper exploration of the individual Azure-related technology areas identified so far in this guide. In each chapter you will see guidance on choosing and using the appropriate technologies, and examples of how Trey Research uses these technologies in its Orders application.

The sample Trey Research application that you can download for this guide implements many of the technologies and techniques described in this guide. However, to simplify installation, and to reduce the prerequisites and the requirements for users to establish extensive Azure accounts, the feature set and some of the implementation details differ from the text of this guide.

More Information

- "Windows Azure Features" at http://www.
- "Windows Azure Design for Scale" at http://www.
- "Messaging Patterns in Service-Oriented Architecture" at http://msdn.
- "Adapter Message Exchange Patterns" at http://msdn.
- "Message Exchange Protocols for Web Services" at http://www.
- "BizTalk Message Exchange Adapters" at http://www.
- "Relayed and Brokered Messaging" at http://msdn.microsoft.com

- "Chapter 18: Communication and Messaging" of the Microsoft Application Architecture Guide
- "One-Stop Shop for Premium Data and Applications" at http://datamarket.azure.com
- "Business Analytics" (SQL Azure Reporting)
- "RosettaNet Global Supply Chain Standards" at http://www.

For information about using Service Bus Relay, see Chapter 6, "Implementing Cross-boundary Communication." For links and information about the technologies discussed in this chapter, see Chapter 1, "Introduction," of this guide.

5 Authenticating Users and Authorizing Requests

Most applications will need to authenticate and authorize visitors or partners at some stage of the process. Traditionally, authentication was carried out against a local application-specific store of user details, but increasingly users expect applications to allow them to use more universal credentials; for example, existing accounts with social network identity providers such as Windows Live ID, Google, Facebook, and OpenID. Alternatively, applications may need to authenticate users with accounts defined within the corporate domain, or support a combination of federated and corporate credentials and allow users to choose which they specify to sign in. All of these scenarios can be implemented using claims-based authentication, where a Security Token Service (STS) generates tokens that are stored in cookies in the user's web browser or presented by services when they make a request to a server.

This process, called federated authentication, also offers the opportunity for applications to support single sign-on (SSO). With SSO, users that sign in to one application by means of, for example, their Windows Live ID credentials are able to visit other sites that use Windows Live ID without being prompted to re-enter their credentials. Federated claims-based authentication and single sign-on are powerful techniques that can simplify development, as well as providing benefits to users by making it easier to access different applications without requiring them to re-enter their credentials every time.

Trey Research chose to implement claims-based authentication in its Orders application when it evolved to the cloud. Trey Research also took advantage of a service provided by Windows Azure that can be used to secure the Service Bus queues that pass messages between the cloud-based applications, on-premises applications, and partner applications.

In this chapter, you will learn more about the Windows technologies that are available to help you implement claims-based authentication, federated authentication, single sign-on, and security for Service Bus queues. You will also see how Trey Research implemented these features in the Orders application. This chapter describes authentication and authorization challenges, and shows how Trey Research tackled these challenges in its Orders application. It does not provide a full reference to claims-based authentication and authorization technologies and techniques.

For public applications, it is useful to allow users to sign in with credentials that they use for other websites and applications, so that they do not need to remember another user name and password. By using federated authentication, an application can delegate the responsibility for this task to an external identity provider that handles storing the credentials and checking them when the user signs in. This approach also removes the responsibility for storing sensitive credentials from your application, as this is now the responsibility of the identity provider. The section "Federated Authentication" later in this chapter provides more details of how this works.

A detailed exploration of claims-based authentication, authorization, and Windows Azure Access Control Service can be found in the patterns & practices "Claims-Based Identity and Access Control Guide."

Use Cases and Challenges

Most business applications require that users are authenticated and authorized to perform the operations provided by the application. Users expect authentication to be simple and quick; it should not get in the way of using the application. The solutions you implement must be reliable, responsive, available, and secure. They must be able to uniquely and accurately identify users, and provide information that the application can use to decide what actions that user can take. The solutions should also be as unobtrusive as possible, so that the task of signing in is simple and painless. Poorly designed or badly implemented authentication can be a performance bottleneck in applications. The following sections describe the most common generic use cases for authentication and authorization.

Authenticating Public Users

Publicly available applications, such as online shopping sites or forums, typically need to authenticate a large number of users, of whom the application has no prior knowledge. Users register by providing information that the site requires, such as a name and address, but the key factor is to be able to uniquely identify each user so that the correct information can be mapped to them as they sign on next time.

Authenticating Corporate Users and Users from Partner Organizations

Applications used by only a limited and known set of users generally have different requirements to public applications. Users are not usually expected to register before using their account; instead, they expect the organization to already have an account configured for them. In addition, the account details will often be more comprehensive than those held by a public application or social identity provider, and not editable by the user. For example, the account will typically define the membership of security groups and the users' corporate email addresses. This approach is typically used for "internal" or corporate accounts in most organizations, and is exposed to applications through the built-in operating system features and a directory service. In scenarios where known users from partner organizations must be able to authenticate, an intermediary service that is available outside of the organization is typically required to generate the claims-based identity tokens.

Authorizing User Actions

Identifying individual users is only part of the process. After authentication, the application must be able to control the actions users can take. With claims-based authentication, the token the user obtained from their chosen identity provider may contain claims that specify the role or permissions for the user. However, this is generally only the case when the user was authenticated by a trusted identity provider that is an enterprise directory service, and the token consequently contains role information for each user (such as "Manager" or "Accounting"). Social identity providers such as Windows Live ID and Google typically include only a claim that is the unique user identifier, so in this case you must implement a mechanism that matches the user identifier with data held by your application or located in a central user repository. This repository must contain the roles, permissions, or rights information for each user, which the application can use to authorize the users' actions. In the case of a public application such as a shopping website, the repository will typically hold the information provided by users when they register, such as address and payment details, allowing you to match users with their stored account details when they sign in. After the application receives the token, it can use the claims it contains to authorize access.

Claims-based and federated identity solutions allow you to clearly separate the tasks of authentication and authorization, so that changes to one of these do not impact the other. This decoupling makes it easier to maintain and update applications to meet new requirements.

Authorizing Service Access for Non-browser Clients

Claims-based identity techniques work well in web browsers because the automatic redirection capabilities of the browser make the process seamless and automatic from the user's point of view. For non-browser clients, such as other applications that make service requests to your application's web services, the client is responsible for obtaining the required token from a suitable identity provider and STS, and presenting this token with the request. To support this, you must publish information that allows the client to discover how to obtain the required token, the accepted formats, and the claims that the token must contain. If the token contains only a unique identifier, the application may need to map this to an existing account to discover the roles applicable to that client.

Authorizing Access to Service Bus Queues

Hybrid applications that use Windows Azure Service Bus Queues must be able to authenticate users, client applications, and partner applications that access the Queues. Clients accessing a Service Bus Queue can have one of three permission levels: Send, Listen, or Manage. It is vital for security reasons that clients connecting to a queue have the appropriate permissions, to prevent reading messages that should not be available to them, or sending messages that may not be valid. This can be achieved using the same federated authentication techniques as described later in this chapter.
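The Send, Listen, and Manage permission levels amount to a simple check of the claims in a validated token against the operation being attempted. The following Python sketch is illustrative only, not the Service Bus API: it assumes the token has already been validated and its claims extracted, and it uses the claim type name net.windows.servicebus.action, which is the action claim ACS issues for Service Bus (shown here as an assumption).

```python
# Permission level each (hypothetical) queue operation requires.
REQUIRED_PERMISSION = {
    "send_message": "Send",
    "receive_message": "Listen",
    "manage_queue": "Manage",
}

def is_allowed(operation, token_claims):
    """Allow the operation only when the validated token carries the
    matching Service Bus action claim."""
    actions = token_claims.get("net.windows.servicebus.action", [])
    return REQUIRED_PERMISSION[operation] in actions

# A sender's token carries only the Send claim, so it cannot listen.
sender_claims = {"net.windows.servicebus.action": ["Send"]}
print(is_allowed("send_message", sender_claims))     # True
print(is_allowed("receive_message", sender_claims))  # False
```

Issuing a sending client a token that contains only the Send claim is what prevents it from reading messages that should not be available to it.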

Authorizing Access to Service Bus Relay Endpoints

Hybrid applications that use Windows Azure Service Bus Relay must be able to authenticate users, client applications, and partner applications that access Service Bus Relay endpoints. The service is likely to be inside the corporate network or a protected boundary, so it is vital for security reasons that clients accessing a service through Service Bus Relay have the appropriate permissions, in order to prevent invalid access. Clients accessing a Service Bus Relay endpoint must present a suitable token, unless the endpoint is configured to allow anonymous (unauthenticated) access. Service Bus integrates with Windows Azure Access Control Service, which acts as the default identity provider for Service Bus Queues and Service Bus Relay endpoints; however, you can configure Service Bus to use other identity providers if you wish.

Service Bus Topics and Subscriptions behave in the same way as Queues from a security perspective, and you authenticate users and authorize access to operations in the same way.

Cross-Cutting Concerns

Authentication and authorization mechanisms must, by definition, be secure and robust to protect applications from invalid access and illegal operations. However, other requirements are also important when considering how to implement authentication and authorization in Windows Azure hybrid applications.

Security

The identity of a user or client must be established in a way that uniquely identifies the user with a sufficiently high level of confidence, and is not open to spoofing or other types of attack that may compromise the accuracy of the result. When using federated authentication and delegating responsibility for validating user identity, you must be able to trust the identity provider and any intermediate services with the appropriate level of confidence. Access to STSs and identity providers should take place over a secure connection protected by SSL or TLS, to counter man-in-the-middle attacks and prevent access to the credentials or authentication tokens while passing over the network. Many STSs can encrypt the authentication tokens they return, and this should be considered as an additional layer of protection, even when using secure connections.

If an in-house repository of user information is maintained, this must be protected from external penetration through unauthorized access over the network, and from internal attack achieved by means of physical access to the servers, by setting appropriate permissions on the repository tables and content. All sensitive information should be encrypted so that it is of no use if security is compromised. You must also be aware of legal and contractual obligations with regard to personally identifiable information (PII) about users that is held in the repository.

Responsiveness

Authentication mechanisms can be a bottleneck in applications that have many users signing in at the same time, such as at the start of a working day. The implementation you choose must be able to satisfy requests quickly during high-demand periods. Keep in mind that the process can involve several network transitions between identity providers and STSs, and any one of these may be a weak point in the chain of events. While it is easy to add more servers to an application to handle additional capacity, this approach is often more difficult with authentication mechanisms that must all access a single repository.

Reliability

Authentication mechanisms are often a single point of failure. If the authentication mechanism fails, users will not be able to access the application.

Interoperability

The protocols and mechanisms for claims-based authentication are interoperable by default. They use standards-based protocols and formats for communication and security tokens. However, some STSs and identity providers may only provide one type of token, such as a Simple Web Token (SWT), whereas others might accept or return other types such as SAML tokens. You must ensure that the external providers and services you choose are compatible with your requirements and with each other.

Claims-based Authentication and Authorization Technologies

This section provides a brief overview of the technologies for authentication and authorization that are typically used in Windows applications and applications hosted in Windows Azure. It focuses on claims-based authentication, which is likely to be the most appropriate approach for hybrid applications. It covers:

- Federated authentication, security token services, and identity providers
- Windows Identity Foundation (WIF)
- Windows Azure Access Control Service (ACS)

Federated Authentication

In traditional applications, authentication and authorization are typically operating system features that make use of enterprise directory services such as Microsoft Active Directory (AD), and resource permissions settings such as Access Control Lists (ACLs). More recently, frameworks and technologies such as Microsoft ASP.NET have provided built-in mechanisms that implement a similar approach for web-based applications. However, all of these approaches require the maintenance of a user directory that defines which users can access the application or service, and what these users can do within the application or service. Maintaining such lists is cumbersome, and can even be prone to errors or lack of security. For example, if you store the list of users in a partner organization that can access your application, you depend on that partner to tell you when a user leaves the company or when the permissions required by specific users change. Instead, you can choose to trust the partner and allow that partner to maintain the list of users and their roles. You still retain control of authorization, so you can manage what users can do within the application.

This approach to delegated authentication requires a standard way of querying user repositories and distributing tokens that contain claims. allowing this user to access other applications without needing to re-enter credentials. Facebook. In practice. but you can also define service identities in ACS that allow clients to authenticate without using an external IdP. for smart clients and service applications the token is delivered within the service response (depending on the protocol in use). you decide which STS your application will trust. An STS may also deliver a cookie to the web browser that indicates the user has been authenticated. When using a web browser. The STS then generates an application-specific token. An Overview of the Claims-Based Authentication Process An STS chooses an IdP that will authenticate the user based on the user's home realm or domain. ADFS is an IdP as well as an STS because it can use Active Directory to validate a user based on credentials the user provides. the IdP returns a token containing claims (information) about that user to the STS. and redirects the user to the application with . For users authenticating through a web browser. as shown in Figure 1. and OpenID. This provides a Single Sign-On (SSO) experience. Alternatively. Each STS uses Identity Providers (IdPs) that are responsible for authenticating users. which can contain these claims or some augmented version of the claims.but you are freed from the need to authenticate every user and allocate them to the appropriate authorization roles or groups. ACS uses external IdPs such as Windows Live ID. such as ADFS connected to your corporate Active Directory. and compatible STSs are available for other platforms and operating systems. the STS directs the user to the appropriate IdP. If the user can be successfully authenticated. For example. 
Security Token Services (STSs) such as Microsoft Active Directory Federation Service (ADFS) and Windows Azure Access Control Service provide this capability. Google. a user has a home realm of Google if this user has an identity that is managed and authenticated by Google. Figure 1 An authentication trust chain that can support federated identity and single sign-on. their home realm will be the Trey Research corporate domain. this token is delivered in a cookie. it may be an external STS such as ACS. You can also configure an STS to trust another STS. If a user's account is in the Trey Research Active Directory. This may be an STS that you own. so that users can be authenticated by any of the STSs in the trust chain. After the identity provider authenticates the user. the STS will create a token containing the claims for this user.
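The home-realm selection step described above can be sketched as a simple lookup. This Python sketch is illustrative only (the realm names and IdP labels are hypothetical); a real STS performs this discovery using configured issuer metadata rather than a dictionary:

```python
def choose_identity_provider(user_identifier, trusted_idps):
    """Pick the IdP responsible for the user's home realm, based on
    the domain part of the identifier; fall back to a default IdP."""
    domain = user_identifier.rsplit("@", 1)[-1].lower()
    return trusted_idps.get(domain, trusted_idps["default"])

# Hypothetical mapping of home realms to the IdPs the STS trusts.
trusted_idps = {
    "treyresearch.net": "ADFS (corporate Active Directory)",
    "gmail.com": "Google",
    "default": "Windows Live ID",
}
print(choose_identity_provider("jane@treyresearch.net", trusted_idps))
# -> ADFS (corporate Active Directory)
```

A corporate user is routed to ADFS, while any unrecognized realm falls back to the default social identity provider.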

The application can use the claims in this token to authorize user actions. To support single sign-on, the STS can store an additional cookie containing an STS-specific token in the browser, to indicate that the user was successfully authenticated. When the user accesses another application that trusts the same STS, this STS can generate a suitable token without requiring the user to reauthenticate with the original IdP. Figure 2 shows a simplified view of the process when using ACS as the token issuer.

Figure 2
A simplified view of the browser authentication process using ACS and social identity providers

Authorizing Web Service Requests

The automatic redirection process described above is only applicable when authenticating requests that come from a web browser. Requests to a web service may be generated by code running in another application, such as a smart client application or service application, outside of a web browser. In an environment such as this, request redirection and cookies cannot be used to direct the request along the chain of STSs and IdPs. Instead, when using claims authentication with a smart client application or a web service, the client must actively request the tokens it requires at each stage, and send these tokens along with the request to the next party in the authentication chain. Figure 3 shows a simplified view of the process for a smart client or service application when using ACS as the token issuer.

Figure 3
A simplified view of the smart client or service authentication process using ACS and social identity providers

Claims-based authentication for web browsers is known as passive authentication, because the browser automatically handles the redirection and presentation of tokens at the appropriate points in the authentication chain. Authentication for service calls, where this process must be specifically managed by the application code, is referred to as active authentication.

Windows Identity Foundation

Microsoft provides a framework that makes it easy to implement claims-based authentication and authorization in web applications and service applications. Windows Identity Foundation (WIF) is a core part of the Microsoft identity and access management framework based on Active Directory, Active Directory Federation Services, Windows Azure Access Control Service, and federated claims-based authentication.

WIF automatically checks requests for the presence of the required claims tokens, and performs redirection for web browsers to the specified STS where these can be obtained. It also exposes the claims in valid tokens to application code, so that the code can make authorization decisions based on these claims. The WIF components expose events as they handle requests, allowing developers to control the process as required. WIF also includes controls that can be embedded in the application UI to initiate sign-in and sign-out functions, and a wizard that developers can use in Visual Studio, or from the command line, to configure applications for claims-based identity.

For more information about Windows Identity Foundation, see the Identity Management home page at http://msdn.microsoft.com, and the patterns & practices "Claims Based Identity & Access Control Guide" at http://claimsid.codeplex.com.
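The active authentication pattern, where client code drives the token chain itself, can be sketched with stand-in objects. In this Python sketch the IdP, STS, and service classes and all token values are hypothetical stubs; the WRAP-style Authorization header mirrors the scheme ACS used for REST requests, but a real client would make HTTPS calls at each step:

```python
class StubIdentityProvider:
    """Stands in for an IdP such as ADFS; issues a token for credentials."""
    def issue_token(self, credentials):
        return f"idp-token:{credentials['user']}"

class StubSts:
    """Stands in for an STS such as ACS; transforms the IdP token."""
    def exchange(self, idp_token):
        return idp_token.replace("idp-token", "sts-token")

class StubService:
    """The relying party; accepts only requests carrying an STS token."""
    def call(self, headers):
        token = headers.get("Authorization", "")
        return "authorized" if "sts-token" in token else "denied"

def active_authentication(idp, sts, service, credentials):
    # 1. The client authenticates directly with the identity provider.
    idp_token = idp.issue_token(credentials)
    # 2. It presents that token to the STS and receives a service token.
    service_token = sts.exchange(idp_token)
    # 3. It attaches the token to the service request itself; no browser
    #    redirection or cookies are involved.
    headers = {"Authorization": f'WRAP access_token="{service_token}"'}
    return service.call(headers)

print(active_authentication(StubIdentityProvider(), StubSts(),
                            StubService(), {"user": "jane"}))  # authorized
```

The three explicit steps in active_authentication are exactly the steps the browser performs invisibly in the passive case.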

Windows Azure Access Control Service

Windows Azure Access Control Service (ACS) is a cloud-based service that makes it easy to authenticate and authorize website, application, and service users, and is compatible with popular programming and runtime environments. It allows authentication to take place against many popular web and enterprise identity providers. ACS accepts SAML 1.1, SAML 2.0, and Simple Web Token (SWT) formatted tokens, and can issue a SAML 1.1, SAML 2.0, or SWT token. It supports a range of protocols that includes OAuth, OpenID, WS-Federation, and WS-Trust. ACS integrates with Windows Identity Foundation (WIF) tools and environments, and with Microsoft Active Directory Federation Services (ADFS).

ACS also acts as a token issuer for Windows Azure Service Bus. Service Bus automatically integrates with ACS and uses it to validate access and operations on queues and endpoints, and you can configure roles and permissions for Service Bus Queues and Service Bus Relay endpoints within ACS.

When a user requests authentication from a web browser, ACS receives a request for authentication from the web application and presents a home realm discovery (HRD) page that lists the identity providers the web application trusts. The user selects an identity provider, and ACS redirects the user to that identity provider's login page. The user logs in, and is returned to ACS with a token containing the claims this user has agreed to share in that particular identity provider. ACS then applies the appropriate rules to transform the claims, and creates a new token containing the transformed claims. The rules configured within ACS can perform protocol transition and claims transformation as required by the web application. ACS then redirects the user back to the web application with the ACS token, and the web application can use the claims in this token to apply authorization rules appropriate for this user.

The process for authentication of requests from smart clients and other service applications is different, because there is no user interaction. Instead, the service must first obtain a suitable token from an identity provider, present this token to ACS for transformation, and then present the token that ACS issues to the relying party.

ACS is configured through the service interface using an OData-based management API, or through the web portal that provides a graphical and interactive administration experience.

ACS and Unique User IDs

One point to be aware of if you decide to use ACS is that the user ID it returns after authenticating a user is unique not only to the user, but also to the combination of ACS instance and user. Each configured instance of an ACS namespace generates a different user ID for the same user. If a user is authenticated through Windows Live ID in one ACS instance, the ID it returns will be different from the ID returned when the same user was authenticated by Windows Live ID through a different ACS instance. This is done to protect user privacy; it prevents different applications from identifying users through their ID when they are authenticated by different ACS instances. However, if you deploy multiple instances of ACS in different datacenters, or change the namespace of your ACS instance, you must implement a mechanism that matches the multiple ACS-delivered unique IDs to each user in your user data store.
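Because each ACS namespace returns a different identifier for the same user, an application that spans namespaces must key its account lookup on both values. The following minimal Python sketch (namespace names, IDs, and the account record are all hypothetical) shows one way to map the pair to a single stored account:

```python
def account_key(acs_namespace, name_identifier):
    """The same user gets a different identifier from each ACS
    namespace, so key stored accounts on the (namespace, ID) pair."""
    return (acs_namespace, name_identifier)

# Both entries below represent the same customer, authenticated
# through two different ACS namespaces.
accounts = {}
accounts[account_key("treyresearch-eu", "abc123")] = {"customer": 42}
accounts[account_key("treyresearch-us", "xyz789")] = {"customer": 42}

print(accounts[account_key("treyresearch-us", "xyz789")]["customer"])  # 42
```

The stored customer number is the stable identifier; the ACS-issued IDs are treated only as lookup keys, never as the canonical identity.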

For more information about Windows Azure Access Control Service, see "Access Control Service 2.0" at http://msdn.microsoft.com, and the patterns & practices "Claims Based Identity & Access Control Guide" at http://claimsid.codeplex.com.

Windows Azure Service Bus Authentication and Authorization

Windows Azure Service Bus is a technology that allows you to expose internal services through a corporate firewall or router, without having to open inbound ports or perform address mapping. It creates a virtual endpoint in the cloud that users connect to; Service Bus integrates with ACS for authenticating requests, and the requests and messages are passed to your service or message queue. Chapter 6, "Implementing Cross-Boundary Communication," covers Service Bus in more detail; in this section you will see how the integration with ACS is used to authenticate Service Bus requests.

For every Service Bus namespace created in the Windows Azure portal there is a matching ACS namespace, created automatically, which is used to authenticate Service Bus requests. This ACS namespace is the Service Bus namespace with the suffix "-sb". It has the Service Bus namespace as its realm, and generates Simple Web Tokens (SWT) with an expiration time of 1200 seconds. You cannot change these settings for the default internal identity provider, but you can add additional identity providers that have different settings.

To view and configure the settings and rules for an ACS namespace that authenticates Service Bus requests, navigate to the Service Bus namespace in the Azure portal and click the "Access Control" icon at the top of the window. You can also configure the ACS namespace programmatically using the Management API, and you can configure access rules and rule groups for your services within this ACS namespace.

Client Authentication

Figure 4 shows the overall flow of requests for authenticating a client and accessing an on-premises service through Service Bus. Clients accessing Service Bus Relay endpoints or Service Bus Queues must obtain a token from ACS that contains claims defining the client's roles or permissions (1 and 2), and present this token to Service Bus (3) along with the request for the service or access to a message queue. Service Bus validates the token and allows the request to pass to the required service (4).

You cannot configure unauthenticated access for Service Bus Queues. The only exception is if you specifically configure a Service Bus Relay endpoint to allow unauthenticated access.
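The SWT format mentioned above is simple enough to build and inspect by hand: a form-encoded set of claims, an ExpiresOn claim in seconds since the epoch, and an HMAC-SHA256 signature over the token body appended as the final pair. The Python sketch below makes the format concrete; the key and issuer values are fabricated, and real ACS tokens carry additional claims and are signed with the namespace's configured 256-bit key:

```python
import base64, hashlib, hmac, time, urllib.parse

def make_swt(claims, key, lifetime=1200):
    """Build a Simple Web Token: form-encoded claims plus an
    HMACSHA256 signature over the body."""
    claims = dict(claims, ExpiresOn=str(int(time.time()) + lifetime))
    body = urllib.parse.urlencode(claims)
    sig = base64.b64encode(
        hmac.new(key, body.encode(), hashlib.sha256).digest()).decode()
    return body + "&HMACSHA256=" + urllib.parse.quote(sig)

def validate_swt(token, key):
    """Return the claims if the signature matches and the token has
    not expired; otherwise return None."""
    body, _, sig = token.rpartition("&HMACSHA256=")
    expected = base64.b64encode(
        hmac.new(key, body.encode(), hashlib.sha256).digest()).decode()
    if not hmac.compare_digest(urllib.parse.unquote(sig), expected):
        return None  # tampered or signed with a different key
    claims = dict(urllib.parse.parse_qsl(body))
    if int(claims["ExpiresOn"]) < time.time():
        return None  # expired
    return claims

key = b"shared-secret-key"  # fabricated; not a real ACS key
token = make_swt(
    {"Issuer": "https://treyresearch-sb.accesscontrol.windows.net/"}, key)
print(validate_swt(token, key)["Issuer"])
```

Because the signature covers the whole body, changing any claim, or validating with the wrong key, causes validation to fail.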

Figure 4
Authenticating a Service Bus request with ACS

To allow unauthenticated access to a Service Bus Relay endpoint you set the RelayClientAuthenticationType property of the WCF relay binding security element to None.

The token issued by ACS can contain multiple claims. For example, when using Service Bus Relay the claims can allow a client to register an endpoint to listen for requests, send requests to a service, and manage the endpoint. When using Service Bus Queues the token may contain claims that allow the client to send messages to a queue, listen on a queue, and manage a queue.

The authentication process for accessing a Service Bus Relay endpoint is entirely separate from any other authentication and security concern related to calling the service itself. The ACS token simply allows access to the service through Service Bus; Service Bus Relay simply provides a routing mechanism that can expose internal services in a secure way. The techniques you use to protect and secure the service itself, and the resources it uses, are no different from when you expose it directly over a network, and other authentication and message security issues are the same as when calling a service directly. For example, it is possible that the service itself (in Figure 4, the internal on-premises service) will require additional credentials not related to Service Bus; you may need to provide separate credentials within the service call headers that allow the client to access specific resources within the server that is hosting the service. For more information, see "Securing Services" in the MSDN WCF documentation, and "Securing and Authenticating a Service Bus Connection" on MSDN.

ACS is an STS that can be configured to trust other identity providers, such as your own corporate domain. This is useful where you want to be able to authenticate clients in a specific domain.

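As an illustrative sketch of the unauthenticated-access configuration described above (the service and namespace names here are placeholders, not taken from the Trey Research example), a service host might disable relay client authentication on a WCF relay binding like this:

```csharp
// Sketch only: assumes a reference to the Microsoft.ServiceBus assembly
// and an existing service namespace. The listener itself still
// authenticates to Service Bus via a TransportClientEndpointBehavior.
var binding = new BasicHttpRelayBinding();

// Allow clients to call the relayed endpoint without presenting an ACS
// token. This applies only to Service Bus Relay; queues cannot be
// configured for unauthenticated access.
binding.Security.RelayClientAuthenticationType =
    RelayClientAuthenticationType.None;

var host = new ServiceHost(typeof(MyService));
host.AddServiceEndpoint(typeof(IMyService), binding,
    ServiceBusEnvironment.CreateServiceUri(
        "https", "mynamespace", "myservice"));
host.Open();
```

The binding shown (BasicHttpRelayBinding) is one assumption; the same RelayClientAuthenticationType setting exists on the security elements of the other relay bindings.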
For example, you may expose a service that external employees and partners use to submit expenses. Each valid user will have an existing account in your corporate directory. Figure 5 shows this approach. In this approach, ACS is configured to trust the ADFS instance as an identity provider. A client first obtains a token from the corporate ADFS and Active Directory service (1 and 2), then presents this token to ACS (3). ACS validates the token presented by the user with ADFS, and exchanges it for an ACS token containing the appropriate claims based on transformations performed by ACS rules (4). The client can then present the ACS token with the service request to Service Bus (5). Service Bus validates the token and allows the request to pass to the required service (6).

Figure 5
Authenticating a Service Bus request with ADFS and ACS

A similar approach can be taken with other identity providers, such as social identity providers or custom STSs, perhaps to implement a federated authentication mechanism if this is required. The ACS namespace used by Service Bus can also be configured with several identity providers, allowing clients to choose which one to use. This allows clients to authenticate with, for example, Windows Live ID or Google.

Service Bus Tokens and Token Providers

Unless the endpoint is configured for unauthenticated access, clients must provide a token to ACS when accessing Service Bus endpoints and queues. This token can be one of the following types:

 Shared secret. In this approach, the client presents a token containing the service identity and the associated key to ACS to obtain an ACS token to send with the request to Service Bus. The service identity is configured in the ACS Service Bus namespace, which can generate a suitable key that the client will use. This is the approach shown in Figure 4 earlier in this chapter.
 Simple Web Token (SWT). In this approach, the client obtains an SWT from an identity provider and presents this to ACS to obtain an ACS token to send with the request to Service Bus. Figure 5 earlier in this chapter shows the overall process for this approach.
 SAML Token. In this approach, the client obtains a SAML token from an identity provider and presents this to ACS to obtain an ACS token to send with the request to Service Bus. Figure 5 earlier in this chapter also shows the overall process for this approach.

To include the required token with a request, the client uses the Service Bus MessagingFactory, NamespaceManager, or TransportClientEndpointBehavior classes. All of these types contain methods that accept an instance of a concrete class that inherits the abstract TokenProvider base class, such as SharedSecretTokenProvider and SimpleWebTokenProvider. The concrete TokenProvider implementations contain methods that create the corresponding type of token from string values and byte arrays, or from existing tokens. You can create custom implementations of TokenProvider to perform other approaches to token creation. For more information, see "TokenProvider Class" on MSDN.

Service Bus Endpoints and Relying Parties

ACS automatically generates a default relying party in the corresponding ACS namespace when you configure a namespace in Service Bus. This relying party has a realm value that encompasses all endpoint addresses you define within the namespace; it is effectively the root of your Service Bus namespace. However, you can define additional relying parties within the ACS namespace that correspond to endpoint addresses you add to the Service Bus namespace. This allows you to set up granular control of permissions to each endpoint, or to groups of endpoints. For example, if your root namespace in Service Bus is http://treyresearch.servicebus.windows.net, you might define additional endpoints below this root.

You can create multiple relying party definitions in ACS that correspond to any subsection of the endpoints defined in Service Bus. ACS matches a request with the longest matching definition, and applies the permissions specified for that definition. For example, the definition for an endpoint higher in the namespace hierarchy would be used for requests to a more deeply nested address when there is no more specific match available.

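As a hedged sketch of the shared secret approach (the namespace, identity name, key, and queue name shown here are placeholders, not values from the example application), a client might create a TokenProvider and attach it to a MessagingFactory like this:

```csharp
// Sketch only: values would normally come from configuration.
var serviceBusUri = ServiceBusEnvironment.CreateServiceUri(
    "sb", "mynamespace", string.Empty);

// Create a token provider for a service identity and its key. The
// provider exchanges this shared secret with ACS for an ACS token.
var tokenProvider = TokenProvider.CreateSharedSecretTokenProvider(
    "owner", "Base64EncodedKeyFromTheAzurePortal=");

// The factory presents the provider's tokens to ACS, and the resulting
// ACS token accompanies each request to Service Bus.
var factory = MessagingFactory.Create(serviceBusUri, tokenProvider);
var queueClient = factory.CreateQueueClient("myqueue");
queueClient.Send(new BrokeredMessage("Hello"));
```

A NamespaceManager or TransportClientEndpointBehavior can be given the same TokenProvider instance when managing the namespace or opening a relay endpoint.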
Authorization Rules and Rule Groups

The ACS Service Bus namespace contains rules and rule groups that specify the roles and permissions for clients based on their authenticated identity. For each relying party definition you specify rules that transform the claims in the token received by ACS, and/or rules that add claims. When using the shared secret approach for authentication there is no existing token, and so there are no existing input claims. ACS automatically creates a default rule that provides Send, Listen, and Manage permissions to the service owner and the root relying party definition. The output claims are of the type net.windows.servicebus.action, and have the values Send, Listen, and Manage.

The default owner identity should be used only for administrative tasks. When using the shared secret approach for authentication, you should create additional service identities within ACS for clients, and assign the appropriate permissions (Send, Listen, and Manage) to each one to restrict access to the minimum required.

You can add more than one claim to any rule. To make configuration easier, you can create rule groups and apply these to multiple relying parties. For a step-by-step guide to configuring rules and rule groups, see "How To: Implement Token Transformation Logic Using Rules" on MSDN. For a comprehensive list of tutorials on using ACS, including configuring identities and identity providers, see "ACS How To's" on MSDN.

How Trey Research Authenticates Users and Authorizes Requests

The Trey Research example application that is available in conjunction with this guide demonstrates many of the scenarios and implementations described in this chapter. The following sections of this chapter describe the implementation Trey Research uses to authenticate and authorize visitors to the website, and to authenticate and authorize access to Service Bus queues and topics. You can download the example code by following the links at http://wag. Now would be a good time to download the example so that you can examine the code it contains as you read through this section of the guide.

Authenticating and Authorizing Website Visitors

Trey Research authenticates visitors to the Orders website through claims-based authentication, using Windows Azure Access Control Service (ACS) as the STS and three social identity providers: Windows Live ID, Yahoo!, and Google. Figure 6 shows a high-level view of the authentication and authorization process in the Orders application. You'll see details of the classes identified in the schematic in the following sections of this chapter.

Figure 6
Overview of visitor authentication and authorization in the Orders application

Access Control Service Configuration

Trey Research uses a setup program (in the TreyResearch.Setup project of the example) to configure ACS without needing to use the Windows Azure web portal. The following table shows how Trey Research configured ACS for authenticating visitors to the Orders application.

ACS artifact | Setting
Relying party | Orders website
Windows Live ID identity provider | Default rule group contains rules to pass the visitor's ID through as both the NameIdentifier and the UserName (Windows Live ID does not reveal visitors' email addresses).
Google identity provider | Default rule group contains a rule to pass the visitor's ID through as the NameIdentifier and a rule to pass the visitor's email address through as the UserName.
Yahoo! identity provider | Default rule group contains a rule to pass the visitor's ID through as the NameIdentifier and a rule to pass the visitor's email address through as the UserName.

The setup program uses the classes in the ACS.ServiceManagementWrapper project to access and configure ACS. For more information about the ACS.ServiceManagementWrapper project, see "Access Control Service Samples and Documentation".

Authentication with Windows Identity Foundation

Trey Research uses Windows Identity Foundation (WIF) to check for a valid token containing claims when each visitor accesses the website. The following extracts from the Web.config file in the Orders.Website project show the settings applied by the WIF STS wizard in Visual Studio.

XML
<configSections>
  ...
  <section name="microsoft.identityModel" type="..." />
</configSections>
...
<system.web>
  ...
  <authentication mode="None" />
  ...
  <httpModules>
    <add name="WSFederationAuthenticationModule" type="..." />
    <add name="SessionAuthenticationModule" type="..." />
  </httpModules>
</system.web>
<system.webServer>
  <validation validateIntegratedModeConfiguration="false" />
  <modules runAllManagedModulesForAllRequests="true">
    <add name="WSFederationAuthenticationModule" type="..."
         preCondition="managedHandler" />
    <add name="SessionAuthenticationModule" type="..."
         preCondition="managedHandler" />
  </modules>
</system.webServer>
...
<microsoft.identityModel>
  <service>
    <audienceUris>
      <add value="https://127.0.0.1/" />
    </audienceUris>
    <federatedAuthentication>
      <wsFederation passiveRedirectEnabled="true"
           issuer="https://treyresearch.accesscontrol.windows.net/..."
           realm="https://127.0.0.1/" requireHttps="true" />
      <cookieHandler requireSsl="true" />
    </federatedAuthentication>
    <applicationService>
      <claimTypeRequired>
        <claimType
          type="http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name"
          optional="true" />
      </claimTypeRequired>
    </applicationService>
    <issuerNameRegistry
        type="Microsoft.IdentityModel.Tokens.ConfigurationBasedIssuerNameRegistry, ...">
      <trustedIssuers>
        <add thumbprint="B59E49E9EF1...CC7E90AE802251A"
             name="https://treyresearch." />
      </trustedIssuers>
    </issuerNameRegistry>
    <certificateValidation certificateValidationMode="None" />
  </service>
</microsoft.identityModel>

These settings insert the two WIF modules, WSFederationAuthenticationModule and SessionAuthenticationModule, into the HTTP pipeline so that they are executed for each request. When the WIF modules detect a request from a visitor that must be authenticated, they first check for a valid token from the trusted issuer in the request. If one is not present, the modules redirect the request to ACS. If a valid token is present, the modules extract the claims and make them available to the application code so that it can use the claims to authorize user actions.

The settings in the microsoft.identityModel section specify that the modules will redirect requests that require authentication to https://treyresearch.accesscontrol.windows.net, which is the namespace Trey Research configured in ACS. The version of the Web.config file you see here is used when the application is running in the local Compute Emulator, so the audience URI and realm is the local computer. When using WIF to authenticate visitors you must edit the values in the Web.config file for the audience URI and relying party realm if you change the URL of your application, such as when deploying it to the cloud after testing in the Local Compute environment. The trustedIssuers section specifies that the application trusts ACS, and specifies the thumbprint of the certificate that the WIF modules can use to validate a token sent by the visitor. The applicationService section shows that the Orders website accepts a Name claim in the token presented to the WIF modules, though this is optional.

The class named IdentityExtensions in the Orders.Website.Helpers project contains two methods that Trey Research uses to extract the values of claims. The GetFederatedUsername method extracts the value of the IdentityProvider claim (the name of the original claim issuer for this visitor, such as Windows Live ID or Google) and the Name claim, and then concatenates them to create a federated user name. The GetOriginalIssuer method simply returns the name of the original claim issuer for this visitor.

C#

private const string ClaimType =
  "http://schemas.microsoft.com/accesscontrolservice"
  + "/2010/07/claims/identityprovider";

public static string GetFederatedUsername(
  this IClaimsIdentity identity)
{
  var originalIssuer = identity.Claims.Single(
    c => c.ClaimType == ClaimType).Value;
  var userName = identity.Claims.Single(
    c => c.ClaimType == ClaimTypes.Name).Value;
  return string.Format(CultureInfo.InvariantCulture,
    "{0}-{1}", userName, originalIssuer);
}

public static string GetOriginalIssuer(
  this IClaimsIdentity identity)
{
  return identity.Claims.Single(
    c => c.ClaimType == ClaimType).Value;
}

ASP.NET Request Validation

By default, ASP.NET checks for dangerous content in all values submitted with requests. This includes HTML and XML elements with opening and closing "<" and ">" delimiters. It is not a good idea to turn off the default ASP.NET request validation mechanism. Instead, Trey Research uses a custom request validator that allows requests containing security tokens in XML format (which hold the user claims) to be processed, but still protects the site from other dangerous input. This is a better alternative to turning off ASP.NET request validation.

The custom class is named WsFederationRequestValidator and is defined in the Orders.Website project. The code, shown below, checks if the request is a form post containing a SignInResponseMessage result from a WS-Federation request. If it is, it allows the request to be processed by returning true. If not, it allows the standard ASP.NET request validation handler to validate the request content.

C#
public class WsFederationRequestValidator : RequestValidator
{
  protected override bool IsValidRequestString(
    HttpContext context, string value,
    RequestValidationSource requestValidationSource,
    string collectionKey, out int validationFailureIndex)
  {
    validationFailureIndex = 0;

    if (requestValidationSource == RequestValidationSource.Form
        && collectionKey.Equals(
             WSFederationConstants.Parameters.Result,
             StringComparison.Ordinal))
    {
      if (WSFederationMessage.CreateFromFormPost(context.Request)
            as SignInResponseMessage != null)
      {
        return true;
      }
    }

    return base.IsValidRequestString(context, value,
      requestValidationSource, collectionKey,
      out validationFailureIndex);
  }
}

Trey Research inserts the custom request validator into the HTTP pipeline by adding it to the Web.config file for the Orders website.

XML
<system.web>
  <httpRuntime requestValidationType=
    "Orders.Website.WsFederationRequestValidator, Orders.Website" />
  ...
</system.web>

Visitor Authentication and Authorization

Some pages in the Orders application, such as the Home page and the list of products for sale, do not require visitors to be authenticated; visitors can browse these pages anonymously. However, the pages related to placing an order, managing a visitor's account, and viewing existing orders require the visitor to be authenticated. Trey Research allows visitors to log on and log off using links at the top of every page. These are defined in the layout page that acts as a master page for the entire site, named _Layout.cshtml and located in the Views/Shared folder of the Orders website. The layout page contains the following code and markup.

CSHTML
@if (User.Identity.IsAuthenticated)
{
  <li>@Html.ActionLink("My Orders", "Index", "MyOrders")</li>
  <li>@Html.ActionLink("Log Out", "LogOff", "Account")</li>
  if (User.IsInRole("Administrator"))
  {
    <li>@Html.ActionLink("Admin", "Index", "StoreManager")</li>
  }
}
else
{
  <li>@Html.ActionLink("Log On", "LogOn", "Account")</li>
}

The WIF modules automatically populate an instance of a class that implements the IClaimsIdentity interface with the claims in the token presented by a visitor, and assign this to the current context's User.Identity property. The code can check if a user is authenticated simply by reading the IsAuthenticated property, as shown in the code above. You can also use all of the other methods and properties of the User.Identity property, such as testing if the visitor is in a specific role.

The ActionLink that generates the "Log On" hyperlink loads the AccountController class (in the Orders.Website.Controllers folder of the website project). The LogOn method defined in the AccountController class redirects the user to the Home page of the site, as shown here.

C#
[HttpGet]
[AuthorizeAndRegisterUser]
public ActionResult LogOn()
{
  return RedirectToAction("Index", "Home");
}

Notice that the LogOn method carries the AuthorizeAndRegisterUser attribute. This is a custom attribute that Trey Research uses to indicate the pages that require users to be authenticated. It extends the standard AuthorizeAttribute class that is used to restrict access by callers to an action method. Using a custom authentication attribute is a good way to perform other tasks related to authentication, such as obtaining a user's details from a database, without needing to include this code in every class. It also makes it easy to specify which pages or resources require users to be authenticated.

You can also see this attribute used in the MyOrdersController and CheckoutController classes, where it is added to the entire class instead of a single method. The following code shows an outline of the MyOrdersController class.

C#
[AuthorizeAndRegisterUser]
public class MyOrdersController : Controller
{
  ...
}

The custom AuthorizeAndRegisterUserAttribute class, shown below, is defined in the Orders.Website.CustomAttributes folder of the website project.

C#
public class AuthorizeAndRegisterUserAttribute : AuthorizeAttribute
{
  private readonly ICustomerStore customerStore;

  public AuthorizeAndRegisterUserAttribute()
    : this(new CustomerStore())
  { }

  public AuthorizeAndRegisterUserAttribute(
    ICustomerStore customerStore)
  {
    this.customerStore = customerStore;
  }

  public override void OnAuthorization(
    System.Web.Mvc.AuthorizationContext filterContext)
  {
    if (!filterContext.HttpContext.User.Identity.IsAuthenticated)
    {
      base.OnAuthorization(filterContext);
      return;
    }

    var federatedUsername = ((IClaimsIdentity)filterContext
      .HttpContext.User.Identity).GetFederatedUsername();
    var customer = this.customerStore.FindOne(federatedUsername);

    if (customer == null)
    {
      // Redirect to registration page.
      var redirectInfo = new RouteValueDictionary();
      redirectInfo.Add("controller", "Account");
      redirectInfo.Add("action", "Register");
      redirectInfo.Add("returnUrl",
        filterContext.HttpContext.Request.Url.AbsolutePath);
      filterContext.Result = new RedirectToRouteResult(redirectInfo);
    }
  }
}

The AuthorizeAndRegisterUserAttribute attribute creates an instance of the CustomerStore repository class that is used to access the data for this customer in the SQL Azure Customers database (as shown in Figure 6 earlier in this chapter). If the visitor is not authenticated, the attribute passes the request to the base AuthorizeAttribute instance, which forces the user, through the WIF modules, to authenticate with ACS. If the visitor is authenticated, the attribute uses the GetFederatedUsername method you saw earlier to get the user name from the authentication token claims, and the FindOne method of the CustomerStore class to get this visitor's details from the Customers database. If this visitor has not yet registered, there will be no existing details in the Customers database, and so the code redirects the visitor to the Register page.

Customer Details Storage and Retrieval

Using claims-based identity with social identity providers usually means that your application must maintain information about each registered user of the application, because the token that authenticated users present to the application contains only a minimal set of claims (typically just a user identifier or name). Trey Research maintains a Customers database in SQL Azure that contains the details customers provide when they register, and additional information that is maintained for each customer by the head-office administration team (such as the credit limit).

To connect to the database, Trey Research implemented a Microsoft Entity Framework-based repository in the class named CustomerStore, located in the Orders.Website.DataStores folder of the website project. This class exposes several methods for storing and retrieving customer details, including the FindOne method used in the previous code extract. This is the code of the FindOne method.

C#
public Customer FindOne(string userName)
{
  using (var database = TreyResearchModelFactory.CreateContext())
  {
    var customer = this.sqlCommandRetryPolicy.ExecuteAction(
      () => database.Customers.SingleOrDefault(
              c => c.UserName == userName));
    if (customer == null)
    {
      return null;
    }

    return new Customer
    {
      CustomerId = customer.CustomerId,
      UserName = customer.UserName,
      FirstName = customer.FirstName,
      LastName = customer.LastName,
      Address = customer.Address,
      City = customer.City,
      State = customer.State,
      PostalCode = customer.PostalCode,
      Country = customer.Country,
      Email = customer.Email,
      Phone = customer.Phone
    };
  }
}

This method looks for an entity using the federated user name. If it finds a match, it creates a new Customer instance and populates it with the data from the database. If not, it returns null to indicate that this visitor is not registered.

Implementing Reliable Data Access with the Transient Fault Handling Application Block

You can see that the FindOne method shown in the previous code first obtains a reference to the database by calling the CreateContext method of the TreyResearchModelFactory class (located in the DataStores folder of the Orders.Website project). This method uses another class named CloudConfiguration to read the connection string for the database from the ServiceConfiguration.cscfg file, or return null if the setting is not found.

C#
public static TreyResearchDataModelContainer CreateContext()
{
  return new TreyResearchDataModelContainer(
    CloudConfiguration.GetConfigurationSetting(
      "TreyResearchDataModelContainer", null));
}

After obtaining a reference to the database, the code uses the Enterprise Library Transient Fault Handling Application Block to execute the method that connects to the data store and retrieves the details for the specified customer. The Transient Fault Handling Application Block provides a ready-built mechanism for attempting an action a specified number of times, with a specified delay between each attempt. It exposes events that you can use to monitor the execution and collect information about execution failures. The block is part of the Enterprise Library Integration Pack for Windows Azure. For more information, see "Microsoft Enterprise Library" on MSDN.

The section "Implementing Reliable Connectivity" in Chapter 4, "Deploying Functionality and Data to the Cloud," describes how the Transient Fault Handling Application Block can be used to manage connectivity to SQL Azure databases, Service Bus Queues, Service Bus Topics, and Azure Cache.

Although they could have used code to generate the policy dynamically at runtime, the developers at Trey Research chose to define the retry policy that the block will use in the application's Web.config file. The following extract from the file shows the definition of the policy.

XML
<RetryPolicyConfiguration
    defaultRetryStrategy="Fixed Interval Retry Strategy"
    defaultAzureStorageRetryStrategy="Fixed Interval Retry Strategy"
    defaultSqlCommandRetryStrategy="Backoff Retry Strategy" >
  <incremental name="Incremental Retry Strategy"
    retryIncrement="00:00:01" initialInterval="00:00:01"
    maxRetryCount="10" />
  <fixedInterval name="Fixed Interval Retry Strategy"
    retryInterval="00:00:05" maxRetryCount="6"
    firstFastRetry="true" />
  <exponentialBackoff name="Backoff Retry Strategy"
    minBackoff="00:00:05" maxBackoff="00:00:45"
    deltaBackoff="00:00:04" maxRetryCount="10" />
</RetryPolicyConfiguration>

You can see that it contains an incremental strategy that retries the action every second for a maximum of ten attempts; a fixed interval strategy, used as the default and for accessing Windows Azure Storage (Tables, Queues, and BLOBs), that retries the action every five seconds for a maximum of six attempts, with the first retry occurring immediately after a failure; and a back-off strategy for SQL database connections that retries the action after 5 seconds and then adds 4 seconds to the delay between each attempt, up to a maximum delay of 45 seconds and ten attempts.

The CustomerStore class that contains the FindOne method initializes the retry policy in its class constructor, using the static GetDefaultSqlCommandRetryPolicy method of the RetryPolicyFactory class.

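The text above notes that the retry policy could instead have been generated dynamically at runtime. As a hedged sketch of that alternative (this is not code from the Trey Research example), an equivalent of the "Backoff Retry Strategy" might be built with the block's API like this:

```csharp
// Sketch only: builds in code the same back-off strategy that the
// Web.config extract defines declaratively.
var backoffStrategy = new ExponentialBackoff(
    "Backoff Retry Strategy",
    10,                          // maximum number of attempts
    TimeSpan.FromSeconds(5),     // minimum back-off delay
    TimeSpan.FromSeconds(45),    // maximum back-off delay
    TimeSpan.FromSeconds(4));    // delta added between attempts

// Combine the strategy with a detection strategy that recognizes
// transient SQL Azure errors, producing a policy with an
// ExecuteAction method like the one used in FindOne.
var retryPolicy =
    new RetryPolicy<SqlAzureTransientErrorDetectionStrategy>(
        backoffStrategy);
```

Defining the policy in configuration, as Trey Research did, has the advantage that the retry behavior can be tuned after deployment without recompiling the application.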
The RetryPolicyFactory class is part of the Transient Fault Handling Application Block, and is exposed when you add the block to your application. The constructor also adds a handler for the Retrying event of the block. This handler writes information to Windows Azure diagnostics each time the block detects a connection failure.

C#
this.sqlCommandRetryPolicy
  = RetryPolicyFactory.GetDefaultSqlCommandRetryPolicy();
this.sqlCommandRetryPolicy.Retrying += (sender, args) =>
  TraceHelper.TraceInformation(
    "Retry - Count:{0}, Delay:{1}, Exception:{2}",
    args.CurrentRetryCount, args.Delay, args.LastException);

With the retry policy defined and everything initialized, the code simply needs to call the ExecuteAction method, specifying as a parameter the action to execute. In this case, the action to execute is the SingleOrDefault method on the Customers collection exposed by the TreyResearchDataModelContainer class.

C#
var customer = this.sqlCommandRetryPolicy.ExecuteAction(
  () => database.Customers.SingleOrDefault(
          c => c.UserName == userName));

This code uses the simplest of the ExecuteAction overloads available in the Transient Fault Handling Application Block. Other overloads allow you to return a result from the method, and perform the execution asynchronously. For more details of all the overloads, see "ExecuteAction Method" on MSDN. If the Transient Fault Handling Application Block cannot complete the specified action within the specified number of attempts, it will throw an exception that you can handle in your code.

Authenticating and Authorizing Service Bus Access

Trey Research uses Service Bus Topics and Queues to pass messages between the Azure-hosted Orders application, the transport partners, and the on-premises audit logging service. Figure 7 shows an overview of the communication between these.

Figure 7
Overview of Service Bus authentication in the Orders application

When a new order is placed, the Orders application sends the order summary to a Service Bus Topic named NewOrdersTopic. This topic currently has three subscribers: the transport partners that must be advised when an order is placed so that they can arrange delivery (in the schematic there are two transport partners), and the on-premises Audit Log service. A filter defined for the Service Bus Topic sends messages to the audit log subscriber only if the value is greater than $10,000.00.

Each transport partner sends an acknowledgement message when it receives a new order summary from the Orders application. Trey Research also defined a Service Bus Queue named OrderStatusUpdateQueue that transport partners use to send messages to the Orders application, which updates the status of the order in the Orders database. In addition, after delivering the order, transport partners send another message to the Orders application. Trey Research uses the same queue between the transport partners and the Orders application for the acknowledgements sent when an order advice is received, and for the update messages that indicate delivery has been made. It makes sense to reuse Service Bus Queues where possible to simplify your application configuration and minimize the costs you incur (you are charged for each queue you use).

All communication between the Orders application and the Service Bus Topics and Queues takes place in Worker Roles running in the Azure datacenter. All Worker Role instances within the same application use the same instances of these Topics and Queues.

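As an illustrative sketch of the audit-log filter described above (the namespace, key, subscription name, and the OrderValue property name are assumptions, not taken from the example code), a filtered subscription might be created like this:

```csharp
// Sketch only: create the audit log subscription with a SQL filter so
// that it receives only orders worth more than $10,000.
var namespaceManager = new NamespaceManager(
    ServiceBusEnvironment.CreateServiceUri(
        "sb", "mynamespace", string.Empty),
    TokenProvider.CreateSharedSecretTokenProvider(
        "owner", "Base64EncodedManagementKey="));

// The filter expression tests a property that the sender sets on each
// BrokeredMessage when the order summary is posted to the topic.
namespaceManager.CreateSubscription(
    "NewOrdersTopic", "AuditLogSubscription",
    new SqlFilter("OrderValue > 10000"));
```

Because the filter is evaluated by Service Bus itself, messages below the threshold are never delivered to the audit log subscriber and incur no processing in the on-premises service.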
When Trey Research deploys the application to multiple datacenters, the administrators implement additional Topics and Queues to connect the hosted instances in the different datacenters with the on-premises Audit Log service, and between each hosted instance and all of the transport partners. For clarity, the following sections of this chapter describe the case where the Orders application is deployed to only one datacenter.

Access Control Service Configuration

Trey Research uses a setup program (in the TreyResearch.Setup project of the example) to configure ACS without needing to use the Windows Azure web portal. The following table shows how Trey Research configured ACS for Service Bus authentication in the Orders application.

Service Bus artifact | Setting
Service identities | owner, NewOrderJob, StatusUpdateJob, ManInAVan, Worldwide, AuditLogListener, HeadOffice, ExternalDataAnalyzer. Each has a unique password that a listener or sender must provide to prove its identity when connecting to a Topic, a Topic subscription, or a Queue.
Default Service Bus relying party | Name: ServiceBus. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group containing: If name identifier="owner" emit "Manage", "Send", and "Listen" action claims.
Service Bus Topic (relying party) for sending new order details to transport partners and the on-premises audit log. Subscriptions: "Man in a Van" and "Worldwide" shipping partners; Audit log service. | Name: NewOrdersTopic. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus. Rule group containing: If name identifier="NewOrderJob" emit "Send" action claim. Rule group containing: If name identifier="ManInAVan" emit "Listen" action claim. Rule group containing: If name identifier="Worldwide" emit "Listen" action claim. Rule group containing: If name identifier="AuditLogListener" emit "Listen" action claim.
Service Bus Queue (relying party) that transport partners use to send messages to the Orders application that acknowledge receipt of new order details, or indicate that the order has been delivered. | Name: OrderStatusUpdateQueue. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus. Rule group containing: If name identifier="StatusUpdateJob" emit "Listen" action claim. Rule group containing: If name identifier="ManInAVan" emit "Send" action claim. Rule group containing: If name identifier="Worldwide" emit "Send" action claim.
Transport partner (relying party) for local deliveries. | Name: ManInAVan. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus.
Transport partner (relying party) for distance deliveries. | Name: Worldwide. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus.
Audit log service (relying party). Subscribes to the Topic to collect audit log messages. | Name: AuditLogListener. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus.
On-premises management, monitoring, and business reports relying party. Collects diagnostic information and sends data to the external data analyzer service. | Name: OrdersStatisticsService. Claim issuer: ACS. Token type: SWT. Rule groups: Default rule group for ServiceBus. Rule group containing: If name identifier="ExternalDataAnalyzer" emit "Send" action claim. Rule group containing: If name identifier="HeadOffice" emit "Listen" action claim.

The service identities are defined within ACS. The default owner identity is created, along with the default Service Bus rules, when you create the Service Bus namespace. These rules add Manage, Send, and Listen action claims to the token that ACS generates when authenticating the owner identity, so that the owner has full access to these Topics and Queues. The default owner identity should be used only for administrative tasks.

All of the relying parties include this default rule group, and the remainder of the table shows the rules that are specific to the other relying parties. These action permissions are not inherited by default; remember to include the default Service Bus rule group in the list of rule groups for Queues and Topics if you will need to listen, send, or manage the queue using the owner account.

These rules, along with the default Service Bus rules, allow the Worker Roles to send messages to the NewOrdersTopic using the NewOrderJob identity, while the subscribers "Man in a Van", "Worldwide", and the Audit Log service can listen on this Topic. The two transport partners, "Man in a Van" and "Worldwide", can send messages to the OrderStatusUpdateQueue, and the Worker Roles can listen on this queue using the StatusUpdateJob identity.

The ways that the queues, topics, and subscriptions are created are described in Chapter 6, "Implementing Cross-Boundary Communication," and Chapter 7, "Implementing Business Logic and Message Routing across Boundaries."

Authenticating When Creating and Connecting to a Queue or Topic

Trey Research uses Simple Web Tokens (SWTs) to authenticate connections to Queues and Topics. This means that the sender or receiver must provide a suitable token that contains the name and password of the identity to use when connecting to a Queue, a Topic subscription, or a Topic. Trey Research uses two custom helper classes that make it easier to create Service Bus Queues with the required configuration.

The ServiceBusQueueDescription class in the Orders.Shared.Communication folder defines the description of a queue, including the identity of the relying party (the Issuer) and the password (the DefaultKey).

C#
public class ServiceBusQueueDescription
{
  public string Namespace { get; set; }
  public string QueueName { get; set; }
  public string Issuer { get; set; }
  public string DefaultKey { get; set; }
}

The second custom class Trey Research uses is the ServiceBusQueue class, also defined in the Orders.Shared.Communication folder. This class accepts a populated ServiceBusQueueDescription instance and exposes methods to create queues. The constructor of the class uses the static CreateSharedSecretTokenProvider method of the Service Bus TokenProvider class to create a suitable Shared Secret token provider. The remaining code creates a queue, plus a sender and receiver for the queue.

C#
public ServiceBusQueue(ServiceBusQueueDescription description)
{
  Guard.CheckArgumentNull(description, "description");
  this.description = description;

  this.tokenProvider = TokenProvider.CreateSharedSecretTokenProvider(
    this.description.Issuer, this.description.DefaultKey);

  var runtimeUri = ServiceBusEnvironment.CreateServiceUri(
    "sb", this.description.Namespace.ToLowerInvariant(), string.Empty);

  var messagingFactory = MessagingFactory.Create(
    runtimeUri, this.tokenProvider);

  var sender = messagingFactory.CreateMessageSender(
    this.description.QueueName.ToLowerInvariant());
  this.senderAdapter = new MessageSenderAdapter(sender);

  var rec = messagingFactory.CreateMessageReceiver(
    this.description.QueueName.ToLowerInvariant(), ReceiveMode.PeekLock);
  this.receiver = rec;
  ...
}

Any code that needs to create a queue can then simply read the values for the queue description properties from the ServiceConfiguration.cscfg file and create a new instance of the ServiceBusQueue class. For example, the following code taken from the StatusUpdateJob class in the Jobs folder of the Orders.Workers project is used by the Worker Role when it creates the task that reads status update messages sent from transport partners. The WorkerRole identity named StatusUpdateJob has Listen permission for the OrderStatusUpdateQueue queue, as you can see by looking back at the table earlier in this section.

C#
var serviceBusQueueDescription = new ServiceBusQueueDescription
{
  Namespace = CloudConfiguration.GetConfigurationSetting(
    "serviceBusNamespace", string.Empty),
  QueueName = CloudConfiguration.GetConfigurationSetting(
    "orderStatusUpdateQueue", string.Empty),
  Issuer = CloudConfiguration.GetConfigurationSetting(
    "issuer", string.Empty),
  DefaultKey = CloudConfiguration.GetConfigurationSetting(
    "defaultKey", string.Empty)
};

var orderStatusUpdatesQueue = new ServiceBusQueue(serviceBusQueueDescription);

Creating a Token When Sending Messages to a Queue

When sending a message to a Service Bus Queue or Topic, the sender must provide a suitable token to Service Bus that establishes the identity of the sender as defined in ACS (or another identity provider, if ACS is not used). For example, transport partners must provide an SWT when connecting to the OrderStatusUpdateQueue and sending messages to the Worker Role in the Orders application. Trey Research uses SWT tokens to authenticate relying parties (senders and receivers) for the queues used in the Orders application; the table earlier in this section of the chapter summarizes the identities and permissions for all of the relying parties.

The use of SWT tokens meets two goals. Firstly, it helps to secure the communication channel by providing tokens with the corresponding claims, such as Listen or Send. This is handled automatically by the Service Bus objects when you configure the ACS namespace associated with Service Bus. Secondly, SWT tokens help to secure the brokered message itself by adding a token to the header of the message that establishes the identity of the sending party. In this case no claims are used. This is required because all transport partners will have permission to send messages, but Trey Research must ensure that a message does come from a specific transport partner (and not a different partner using a different partner's identity). Adding SWT tokens to the header and validating them cannot be achieved by the Service Bus objects and ACS configuration. Instead, Trey Research uses the following code in the connector and adapter they use to connect to transport partners.

C#
private string GetToken(ServiceBusQueueDescription queueDescription)
{
  var token = this.GetTokenFromAcs(
    acsNamespace: string.Format(
      "https://{0}.accesscontrol.windows.net/", this.acsNamespace),
    serviceIdentity: this.TransportPartnerDisplayName,
    password: this.acsPassword,
    relyingPartyRealm: string.Format(
      "https://127.0.0.1/{0}/", queueDescription.QueueName));

  return token;
}

private string GetTokenFromAcs(string acsNamespace, string serviceIdentity,
  string password, string relyingPartyRealm)
{
  // request a token from ACS
  var client = new WebClient();
  client.BaseAddress = acsNamespace;

  var values = new NameValueCollection();
  values.Add("wrap_name", serviceIdentity);
  values.Add("wrap_password", password);
  values.Add("wrap_scope", relyingPartyRealm);

  byte[] responseBytes = client.UploadValues("WRAPv0.9/", "POST", values);

  string response = Encoding.UTF8.GetString(responseBytes);

  return HttpUtility.UrlDecode(
    response
      .Split('&')
      .Single(value => value.StartsWith(
        "wrap_access_token=", StringComparison.OrdinalIgnoreCase))
      .Split('=')[1]);
}

This code sends a request for a token to ACS. It uses values extracted from the application's configuration file for the service identity, password, and realm to create the appropriate token for this sender's identity. After obtaining a suitable token, the transport partner can send the message to the queue.

For more information about sending and receiving messages in Service Bus Queues and Topics, see Chapter 6, "Implementing Cross-Boundary Communication," and Chapter 7, "Implementing Business Logic and Message Routing across Boundaries." See Chapter 4, "Deploying Functionality and Data to the Cloud," for information about the way that Trey Research uses connectors and adapters to communicate with transport partners.
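On the receiving side, a token issued this way can be checked by recomputing its signature. The sketch below shows only the core SWT signature check in a simplified, self-contained form; it is not the Trey Research TokenValidator class. It assumes the standard SWT layout, in which the final HMACSHA256 parameter is a Base64, URL-encoded HMAC computed over everything that precedes it, using the key shared with ACS.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Simplified sketch of the core SWT check (not the Trey Research TokenValidator):
// the last parameter of an SWT is an HMACSHA256 signature computed over the rest
// of the token with the signing key shared between ACS and the receiver.
static bool IsSignatureValid(string swt, byte[] signingKey)
{
    const string marker = "&HMACSHA256=";
    int index = swt.LastIndexOf(marker, StringComparison.Ordinal);
    if (index < 0)
    {
        return false; // not a signed token
    }

    // Everything before the HMACSHA256 parameter is the signed content.
    string signedContent = swt.Substring(0, index);
    string claimedSignature = Uri.UnescapeDataString(swt.Substring(index + marker.Length));

    using (var hmac = new HMACSHA256(signingKey))
    {
        byte[] computed = hmac.ComputeHash(Encoding.UTF8.GetBytes(signedContent));
        return Convert.ToBase64String(computed) == claimedSignature;
    }
}
```

A full validator would also check the issuer, audience, and expiry claims carried in the token, as the TokenValidator class described later does.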

Validating Tokens in Received Messages

The token that a transport partner creates and inserts into the message headers when sending a message can be used to verify the identity of the partner that sent it. For example, the StatusUpdateJob class in the Jobs folder of the Orders.Workers project receives messages that contain SWTs from the transport partners. It checks that each message is valid using the following code.

C#
...
if (!string.IsNullOrWhiteSpace(token)
    && !this.IsValidToken(message.OrderId, (string)token))
{
  // Throw exception to be caught by handler.
  // This will send the message to the Dead Letter queue.
  throw new InvalidTokenException();
}
...

By throwing an exception, the code ensures that invalid messages will be moved to the dead letter queue related to this queue.

The IsValidToken method used in the code above is defined in the StatusUpdateJob class, and it uses a separate class named TokenValidator in the Orders.Workers folder. The TokenValidator requires several additional pieces of information, including the ACS hostname, the service namespace, the audience value, and the signing key. The IsValidToken method calculates or retrieves all of these values before attempting to validate the token. You can examine the code in the StatusUpdateJob and TokenValidator classes to see how it achieves this.

Summary

This chapter describes the challenges you typically face when building hybrid applications that must authenticate visitors. The chapter focuses on the use of claims-based identity and authentication. This is one of the most common and fastest-growing techniques for Azure-hosted applications because they can take advantage of the frameworks and services specifically designed to make this approach easier. Claims-based authentication also offers advantages in that it makes it simpler for administrators to manage lists of users and permissions, and it supports single sign-on and a choice of identity providers to make authentication easier for users. It is also a vital technique for using services such as Windows Azure Service Bus Topics and Queues, or when you use Service Bus to communicate between cloud-hosted and on-premises applications and services.

Amongst the frameworks and services available for implementing claims-based authentication and authorization are Windows Identity Foundation (WIF), Active Directory Federation Services (ADFS), and Windows Azure Access Control Service (ACS). This chapter provides an overview of these technologies, and describes how they integrate to provide an end-to-end authentication and authorization solution. Finally, the chapter describes how the fictional organization named Trey Research implemented authentication and authorization in their hybrid Orders application, using the example code that is available for this guide.

More Information

For more information about Service Bus and ACS, see "Service Bus Authentication and Authorization with the Access Control Service" and "Securing and Authenticating a Service Bus Connection," both on MSDN. For more links and general information about the technologies discussed in this chapter, see Chapter 1, "Introduction," of this guide, and the patterns & practices "Claims Based Identity & Access Control Guide."

6 Implementing Cross-Boundary Communication

A key aspect of any solution that spans the on-premises infrastructure of an organization and the cloud concerns the way in which the elements that comprise the solution connect and communicate. A typical distributed application contains many parts running in a variety of locations, which must be able to interact in a safe and reliable manner. Making the most appropriate choice for the way in which components communicate with each other is crucial, and can have a significant bearing on the design of the entire system.

Although the individual components of a distributed solution typically run in a controlled environment, carefully managed and protected by the organizations responsible for hosting them, the network that joins these elements together commonly utilizes infrastructure, such as the Internet, that is outside of these organizations' realms of responsibility. Consequently the network is the weak link: connectivity between components is not guaranteed, performance is variable, and all communications must be carefully protected. Any distributed solution must be able to handle intermittent and unreliable communications while ensuring that all transmissions are subject to an appropriate level of security. Windows Azure provides technologies that address these concerns and help you to build reliable and safe solutions. This chapter describes these technologies, and explains how Trey Research applied them to its solution.

Use Cases and Challenges

In a hybrid cloud-based solution, the various applications and services will be running on-premises or in the cloud and interacting across a network. A solution based on this architecture typically involves implementing one or more of the following generic use cases. Each of these use cases has its own series of challenges that you need to consider.

Accessing On-Premises Resources From Outside the Organization

Description: Resources located on-premises are required by components running elsewhere, either in the cloud or at partner organizations.

The primary challenge associated with this use case concerns finding and connecting to the resources that the applications and services running outside of the organization utilize. When running on-premises, the code for these items frequently has direct and controlled access to these resources by virtue of running in the same network segment. However, when this same code runs in the cloud it is operating in a different network space, and must be able to connect back to the on-premises servers in a safe and secure manner to read or modify the on-premises resources.

Accessing On-Premises Services From Outside the Organization

Description: Services running on-premises are accessed by applications running elsewhere, either in the cloud or at partner organizations.

In a typical service-based architecture running over the Internet, applications running on-premises within an organization access services through a public-facing network. The environment hosting the service makes access available through one or more well-defined ports and by using common protocols; in the case of most Web-based services this will be port 80 over HTTP, or port 443 over HTTPS. When your application running on-premises connects to the service, it makes an outbound call through your organization's firewall. The local port selected for the outbound call from your on-premises application depends on the negotiation performed by the HTTP protocol (it will probably be some high-numbered port not currently in use), and any responses from the service return on the same channel through the same port. The important point is that to send requests from your application to the service, you do not have to open any additional inbound ports in your organization's firewall.

When you run a service on-premises, you are effectively turning this communication through 180 degrees; applications running in the cloud and partner organizations need to make an inbound call through your organization's firewall and, possibly, one or more Network Address Translation (NAT) routers to connect to your services. If the service is hosted behind a firewall, you must open the appropriate port(s) to allow inbound requests. However, for security reasons, most organizations implement a policy that restricts inbound traffic to their on-premises business servers, blocking access to your services. Remember that the purpose of this firewall is to help guard against unrestrained and potentially damaging access to the assets stored on-premises from an attacker located in the outside world. Even if you are allowed to open up various ports, you are then faced with the task of filtering the traffic to detect and deny access to malicious requests.

Figure 1
Firewall blocking access to on-premises services

The vital question concerned with this use case, therefore, is how do you enable access to services running on-premises without compromising the security of your organization? Injudiciously opening ports in your corporate firewall can render your systems liable to attack. Many hackers run automated port-scanning software to search for opportunities such as this. They then probe any services listening on open ports to determine whether they exhibit any common vulnerabilities that can be exploited to break into your corporate systems.

Implementing a Reliable Communications Channel across Boundaries

Description: Distributed components require a reliable communications mechanism that is resilient to network failure and enables the components to be responsive even if the network is slow.

When you depend on a public network such as the Internet for your communications, you are completely reliant on the various network technologies managed by third-party operators to transmit your data. A reliable communications channel does not lose messages, although it may choose to discard information in a controlled manner in well-defined circumstances. Utilizing reliable messaging to connect the elements of your system in this environment requires that you understand not only the logical messaging semantics of your application, but also how you can meet the physical networking and security challenges that these semantics imply.

Addressing this challenge requires you to consider the following issues:

 How is the communications channel established? Which component opens the channel: the sender, the receiver, or some third party?

 Do the sender and receiver have a functional dependency on each other and the messages that they send?
 Do the sender and receiver need to operate synchronously? If not, then how does the sender know that a message has been received?
 Is the communications channel duplex or simplex? If it is simplex, how can the receiver transmit a reply to the sender?
 Do messages have a lifetime? If a message has not been received within a specific period, should it be discarded? In this situation, should the sender be informed that the message has been discarded?
 Does a message have a specific single recipient, or can the same message be broadcast to multiple receivers?
 Is the order of messages important? Should they be received in exactly the same order in which they were sent? Is it possible to prioritize urgent messages within the communications channel?
 Is there a dependency between related messages? If one message is received but a dependent message is not, what happens?

Cross-Cutting Concerns

Allied to the functional aspects of connecting components to services and data, you also need to consider the common non-functional challenges that any communications mechanism must address.

Security

The first and most important of these challenges is security. Requests may arrive from services and organizations running in a different security domain from your organization. You should treat the network as a hostile environment and be suspicious of all incoming traffic. You should be prepared to authenticate all incoming requests, and authorize them according to your organization's data access policy to guard your organization's resources from unauthorized access. You must also take steps to protect all outgoing traffic, as the data that you are transmitting will be vulnerable as soon as it leaves the environs of your organization. Specifically, you must ensure that the communications channel used for connecting to a service is well-protected. The questions that you must consider when implementing a safe communications channel include:

 How do you establish a communications channel that traverses the corporate firewall securely?
 How do you authenticate and authorize a sender to enable it to transmit messages over a communications channel? How do you authenticate and authorize a receiver?
 How do you prevent an unauthorized receiver from intercepting a message intended for another recipient?

 How do you protect the contents of a message to prevent an unauthorized receiver from examining or manipulating sensitive data?
 How do you protect the sender or receiver from attack across the network?

Robust security is a vital element of any application that is accessible across a network. If security is compromised, the results can be very costly and users will lose faith in your system.

Responsiveness

A well-designed solution ensures that the system remains responsive, even while messages are flowing across a slow, error-prone network between distant components. You must answer the following questions:

 How do you ensure that a sender and receiver can communicate reliably without blocking each other?
 How does the communications channel manage a sudden influx of messages?
 Is a sender dependent on a specific receiver, and vice versa?

Interoperability

Hybrid applications combine components built using different technologies. Ideally, the communications channel that you implement should be independent of these technologies. Senders and receivers will probably be running on different computers, hosted in different datacenters (either in the cloud, on-premises, or within a third-party partner organization), and located in different parts of the world. A common strategy to address this issue is to select a communications mechanism that layers neatly on top of a standard protocol, and then implement the appropriate libraries to format messages in a manner that components built using different technologies can easily parse and process. Maintaining messaging interoperability inevitably involves adopting a standards-based approach, utilizing commonly accepted networking protocols such as TCP and HTTP and message formats such as XML and SOAP. Following this strategy not only reduces dependencies on the way in which existing elements of your solution are implemented, but also helps to ensure that your system is more easily extensible in the future.

Azure Technologies for Implementing Cross-boundary Communication

If you are building solutions based on direct access to resources located on-premises, you can use Azure Connect to establish a safe, virtual network connection to your on-premises servers. Your code can utilize this connection to read and write the resources to which it has been granted access. You can also deploy data to the cloud and store it in Azure BLOB and Table storage; the Azure SDK provides APIs that enable you to access this data from applications running on-premises as well as other services running in the cloud or at partner organizations. These scenarios are described in detail in the Patterns and Practices guide "Developing Applications for the Cloud on the Microsoft Windows Azure Platform."

If you are following a Service-Oriented Architecture (SOA) approach, you can build services to implement more functionally-focused access to resources; rather than exposing the resources directly, you send messages to these services, and they access the resources in a controlled manner on your behalf. If these services must run in the cloud, you can host them as Windows Azure roles. This scenario is described in detail in the Patterns and Practices guide "Developing Applications for the Cloud on the Microsoft Windows Azure Platform" (available at http://wag.codeplex.com). If the services must run on-premises, you can provide safe access to them using the Service Bus Relay mechanism described later in this chapter.

Communication with services in this architecture frequently falls into one of two distinct styles:

 Remote procedure call (RPC) style. In this style, the message receiver is typically a service that exposes a set of operations a sender can invoke. These operations can take parameters and may return results, and in some simple cases can be thought of as an extension to a local method call, except that the method is now running remotely. The underlying communications mechanism is typically hidden by a proxy object in the sender application; this proxy takes care of connecting to the receiver, transmitting the message, waiting for the response, and then returning this response to the sender. Web services typically follow this pattern. This style of messaging lends itself most naturally to synchronous communications, which may impact responsiveness on the part of the sender; it has to wait while the message is received, processed, and a response sent back. Variations on this style support asynchronous messaging, where the sender provides a callback that handles the response from the receiver, and one-way messaging where no response is expected. You can build components that provide RPC-style messaging by implementing them as Windows Communication Foundation services.

Older applications and frameworks also support the notion of remote objects. In this style, a service host application exposes collections of objects rather than operations. A client application can create and use remote objects using the same semantics as local objects. Although this mechanism is very appealing from an object-oriented standpoint, it has the drawback of being potentially very inefficient in terms of network use (client applications send lots of small, chatty network requests), and performance can suffer as a result. This style of communications is not considered any further in this guide.

 Message-oriented communications. In this style, the message receiver simply expects to receive a packaged message rather than a request to perform an operation. The message provides the context and the data, and the receiver typically parses the message to determine how to process it and what tasks to perform. This style of messaging is typically based on queues, where the sender creates a message and posts it to the queue, and the receiver retrieves messages from the queue. It is also a naturally asynchronous method of communications, because the queue acts as a repository for the messages, which the receiver can select and process in its own time. If the sender expects a response, it can provide some form of correlation identifier for the original message, and then wait for a message with this same identifier to appear on the same or another queue, as agreed with the receiver. Azure provides an implementation of reliable message queuing through Service Bus Queues. These are covered later in this chapter.
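The correlation identifier technique used in message-oriented communications can be sketched as follows. This is purely illustrative: simple in-memory collections stand in for Service Bus Queues, and the (CorrelationId, Body) message shape is a hypothetical stand-in for a real brokered message.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: in-memory collections stand in for a request queue and a
// response queue. Each message is a (CorrelationId, Body) pair.
var requestQueue = new Queue<(string CorrelationId, string Body)>();
var responseQueue = new List<(string CorrelationId, string Body)>();

// Sender: tag the outgoing message with a new correlation identifier.
string correlationId = Guid.NewGuid().ToString();
requestQueue.Enqueue((correlationId, "New order 1234"));

// Receiver: process the request and post a reply that carries the same
// identifier, as agreed with the sender.
var request = requestQueue.Dequeue();
responseQueue.Add((request.CorrelationId, "Acknowledged: " + request.Body));

// Sender: select the response whose identifier matches the original message.
var reply = responseQueue.Single(m => m.CorrelationId == correlationId);
Console.WriteLine(reply.Body); // Acknowledged: New order 1234
```

With a real queuing service, the sender would poll or listen on the response queue rather than scan a list, but the pattern of matching identifiers is the same.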

The following sections provide more details on Azure Connect, Service Bus Relay, and Service Bus Queues, and describe when you should consider using each of them.

Accessing On-Premises Resources from Outside the Organization Using Azure Connect

Azure Connect enables you to integrate the virtual machines hosting your Azure roles with your on-premises servers by establishing a virtual network connection between the two environments. It implements a network-level connection, based on standard IP protocols, between your applications and services running in the cloud and your resources located on-premises, and vice versa.

Using Azure Connect, code hosted in the cloud can access on-premises network resources using the same procedures and methods as code executing on-premises. For example, a cloud-based service can use familiar syntax to connect to a file share or a device such as a printer located on-premises. If you are exposing resources from a VM role, you may need to configure the firewall of the virtual machine hosted in the VM role; for example, file sharing may be blocked by Windows Firewall.

Guidelines for Using Azure Connect

Using Azure Connect provides many benefits over common alternative approaches:

 Setup is straightforward. Your IT staff does not have to configure VPN devices or perform any complex network or DNS configuration, and is not required to modify the firewall configuration of any of your on-premises servers or change any router settings.
 You can selectively relocate applications and services to the cloud while protecting your existing investment in on-premises infrastructure. You can retain your key data services and resources on-premises if you do not wish to migrate them to the cloud, or if you are required to retain these items on-premises for legal or compliance reasons.

Azure Connect is suitable for the following scenarios:

 An application or service running in the cloud requires access to network resources located on on-premises servers.

The same security semantics available between on-premises applications and resources apply, enabling you to protect resources using access control lists (ACLs).

Figure 2
Connecting to on-premises resources from the cloud

 An application running in the cloud requires access to data sources managed by on-premises servers. Azure Connect enables a cloud-based service to establish a network connection to data sources running on-premises that enable IP connectivity; examples include SQL Server, a DCOM component, or a Microsoft Message Queue. Services in the cloud specify the connection details in exactly the same way as if they were running on-premises. This approach is especially suitable for third-party packaged applications that you wish to move to the cloud; the vendors of these applications do not typically supply the source code, so you cannot modify the way in which they work.

Figure 4
Connecting to on-premises data sources from the cloud

 An application running in the cloud needs to execute in the same Windows security domain as code running on-premises. Azure Connect enables you to join the roles running in the cloud to your organization's Windows domain. In this way you can configure corporate authentication and resource access to and from cloud-based virtual machines. This feature enables you to take advantage of integrated security when connecting to data sources hosted by your on-premises servers. Using Azure Connect to bring cloud services into your corporate Windows domain solves many problems that would otherwise require complex configuration, and that may inadvertently render your system liable to attack if you get this configuration wrong.
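As a concrete illustration of this point, the connection string below addresses an on-premises SQL Server by machine name, exactly as on-premises code would. This is a hypothetical sketch: the server and database names are placeholders, not part of the Trey Research sample, and it assumes the role has been joined to the virtual network (and, for integrated security, to the Windows domain).

```csharp
using System;

// Hypothetical sketch: with Azure Connect in place, cloud code can use an
// ordinary connection string that names the on-premises server directly.
string connectionString =
    "Data Source=ONPREMISES-SQL01;" +     // placeholder on-premises machine name
    "Initial Catalog=TreyResearchData;" + // placeholder database name
    "Integrated Security=True";           // Windows-integrated security via the domain join

Console.WriteLine(connectionString);
```

The Connect Agent resolves the machine name across the virtual network, so no public endpoint or additional inbound firewall rule is involved.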

Figure 5
Joining a role into a Windows domain

 An application requires access to resources located on-premises, but the code for the application is running at a different site or within a partner organization. You can use Azure Connect to implement a simplified VPN between remote sites and partner organizations. Azure Connect is also a compelling technology for solutions that incorporate roaming users who need to connect to on-premises resources, such as a remote desktop or file share, from their laptop computers.

Figure 6: Connecting to on-premises resources from a partner organization and roaming computers

For up-to-date information about best practices for implementing Azure Connect, visit the Windows Azure Connect Team Blog at http://blogs.msdn.microsoft.com.

Azure Connect Architecture and Security Model

Azure Connect is implemented as an IPv6 virtual network by a small piece of software known as the Azure Connect Agent, which runs on each server and role that participates in the virtual network. The Connect Agent transparently handles DNS resolution and manages the IP connections between your servers and roles. It is installed automatically on roles running in the cloud that are configured as connect-enabled. However, if you are using Azure Connect to connect from a VM role, you must install the Connect Agent in this role before you deploy it to the cloud. Similarly, for servers running on-premises, you download and install the Connect Agent software; no further manual intervention is required. On these servers, the Connect Agent executes in the background as a Windows service.

The Connect Agents establish communications with each other by using the Azure Connect Relay service, hosted and managed by Microsoft in their datacenters. Microsoft implements separate instances of the Azure Connect Relay service in each region; for best performance, choose the relay region closest to your organization when you configure Azure Connect.

The connections are protected end-to-end by using certificate-based IPsec over SSTP, and Azure Connect can configure the security for your virtual network. You use the Azure Portal to generate an activation token that you include as part of the configuration for each role and for each instance of the Connect Agent running on-premises. With this token, the Connect Agents generate and manage the necessary IPsec certificates.

The Connect Agents only use outbound HTTPS connections to communicate with the Azure Connect Relay service, so there is no need to open any additional incoming ports in your corporate firewall. However, when it is installed, the Connect Agent creates a firewall rule for ICMPv6 communications that allows Router Solicitation (Type 133) and Router Advertisement (Type 134) messages; these messages are required to establish and maintain an IPv6 link. Do not disable this rule.

Figure 7: The security architecture of Azure Connect

You manage the connection security policy that governs which servers and roles can communicate with each other by using the Windows Azure portal; you create one or more endpoint groups containing the various roles and host servers that comprise your solution, and place the roles and host servers that form part of the same virtual network into the same endpoint group. A server or role can belong to more than one endpoint group. For more information about configuring Azure Connect and creating endpoint groups, see "Connecting Local Computers to Windows Azure Roles" on MSDN.
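The activation-token step can be sketched in a role's service configuration file. This is a minimal, hypothetical fragment (the service name, role name, and token value are placeholders), assuming the standard setting name used by the Connect plug-in:

```xml
<ServiceConfiguration serviceName="TreyResearch"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="OrdersWorkerRole">
    <Instances count="2" />
    <ConfigurationSettings>
      <!-- Token generated in the Azure Portal; it associates this role's
           Connect Agent instances with your virtual network. -->
      <Setting name="Microsoft.WindowsAzure.Plugins.Connect.ActivationToken"
               value="00000000-0000-0000-0000-000000000000" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

The same activation token is supplied to the Connect Agent installer on each on-premises server that should join the virtual network.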

Limitations of Azure Connect

Azure Connect is intended for providing direct access to corporate resources, but it is subject to some constraints, as summarized by the following items:

• Azure Connect is an end-to-end solution; participating computers must be defined as part of the same virtual network, and users must be authenticated and authorized by using ACLs and Windows integrated security. Azure Connect is not suitable for sharing resources with public clients, such as customers communicating with your services across the Internet.

• Azure Connect is specifically intended to provide connectivity between roles running in the cloud and servers located on-premises; it is not suitable for connecting roles together. If you need to share resources between roles, a preferable solution is to use Azure storage, Azure queues, or SQL Azure.

• You can only use Azure Connect to establish network connectivity with servers running the Windows operating system; the Azure Connect Agent is not designed to operate with other operating systems. If you require cross-platform connectivity, you should consider using Azure Service Bus, which is a more appropriate technology in this case.

• Azure Connect implements an IPv6 network, although it does not require any IPv6 infrastructure on the underlying platform. However, any applications using Azure Connect to access resources must be IPv6 aware. Older legacy applications may have been constructed based on IPv4, and these applications will need to be updated or replaced to function correctly with Azure Connect.

• If you are using Azure Connect to join roles into your Windows domain, the current version of Azure Connect requires that you install the Connect Agent software on the domain controller; in this case, this same machine must also implement the Windows DNS service. These requirements may be subject to approval from your IT department, and your organization may implement policies that constrain the configuration of the domain controller. However, these requirements may change in future releases of Azure Connect.

Accessing On-Premises Services from Outside the Organization Using Service Bus Relay

Service Bus Relay provides the communication infrastructure that enables you to expose a service to the Internet from behind your firewall or NAT router. The Service Bus Relay service provides an easy-to-use mechanism for connecting applications and services running on either side of the corporate firewall, enabling them to communicate safely without requiring a complex security configuration or custom messaging infrastructure.

Guidelines for Using Service Bus Relay

Service Bus Relay is ideal for enabling secure communications with a service running on-premises. It provides a general-purpose solution, and using it brings a number of benefits:

• It is fully compatible with Windows Communication Foundation (WCF). You can leverage your existing WCF knowledge without needing to learn a new model for implementing services and client applications. Service Bus Relay provides WCF bindings that extend those commonly used by many existing services.

• Service Bus Relay is based on common standards, such as HTTPS and TCP/SSL, for sending and receiving messages securely.

• It is interoperable with non-Windows platforms and technologies, and it exposes an HTTP REST interface. Any technology that can consume and produce HTTP REST requests can connect to a service that uses Service Bus Relay. You can also build services using the HTTP REST interface.

• It supports many common messaging patterns, including two-way request/response processing, one-way messaging, service remoting, and multicast eventing.

• It supports naming and discovery. Service Bus Relay maintains a registry of active services within your organization's namespace. The registry is managed by Windows Azure and exploits the scalability and availability that Azure provides. Client applications that have the appropriate authorization can query this registry to discover these services, and download the metadata necessary to connect to these services and exercise the operations that they expose.

• It supports federated security to authenticate requests. The identities of users and applications accessing an on-premises service through Service Bus Relay do not have to be members of your organization's security domain.

• It supports load balancing. You can open up to 25 listeners on the same endpoint. When the Service Bus receives requests directed towards an endpoint, it load balances the requests across these listeners.

• It does not require you to open any inbound ports in your organization's firewall; Service Bus Relay only uses outbound connections.

You should consider using Service Bus Relay in the following scenarios:

• An application running in the cloud requires controlled access to your service hosted on-premises. Your service is built using WCF and the Windows Azure SDK. This is the primary scenario for using Service Bus Relay. The service uses the WCF bindings provided by the Microsoft.ServiceBus assembly to open an outbound communication channel through the firewall to the Service Bus Relay service, and waits for incoming requests. Client applications in the cloud also use the same Azure SDK and WCF bindings to connect to the service and send requests through the Service Bus Relay service. Responses from the on-premises service are routed back through the same communications channel, through the Service Bus Relay service, to the client application. On-premises services can be hosted in a custom application or by using Internet Information Services (IIS). When using IIS, an administrator can configure the on-premises service to start automatically so that it registers a connection with Service Bus Relay and is available to receive requests from client applications.
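To make the primary scenario concrete, the following is a minimal sketch (not the guide's own sample) of an on-premises WCF service that registers with Service Bus Relay. The namespace ("treyresearch"), service path, issuer name, and key are illustrative placeholders.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;
using Microsoft.ServiceBus;

[ServiceContract]
public interface IOrdersService
{
    [OperationContract]
    string GetOrderStatus(int orderId);
}

public class OrdersService : IOrdersService
{
    public string GetOrderStatus(int orderId) { return "Shipped"; }
}

class Program
{
    static void Main()
    {
        // Compose the sb:// address from the Service Bus namespace and a service path.
        Uri address = ServiceBusEnvironment.CreateServiceUri(
            "sb", "treyresearch", "ordersservice");

        var host = new ServiceHost(typeof(OrdersService));
        ServiceEndpoint endpoint = host.AddServiceEndpoint(
            typeof(IOrdersService), new NetTcpRelayBinding(), address);

        // Credentials that Service Bus Relay uses to authorize this listener.
        endpoint.Behaviors.Add(new TransportClientEndpointBehavior
        {
            TokenProvider = TokenProvider.CreateSharedSecretTokenProvider(
                "owner", "<issuer-key>")
        });

        host.Open();  // Registers with the relay using only outbound connections.
        Console.WriteLine("Listening on {0}", address);
        Console.ReadLine();
        host.Close();
    }
}
```

Opening the host establishes the outbound channel to the relay; no inbound firewall ports need to be opened for client requests to reach the service.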

Figure 8: Routing requests and responses through Service Bus Relay

This pattern is useful for providing remote access to existing on-premises services that were not originally accessible outside of your organization. You can build a façade around your services that publishes them in Service Bus Relay. Applications external to your organization that have the appropriate security rights can then connect to your services through Service Bus Relay.

Initially, all messages are routed through the Service Bus Relay service, but as an optimization mechanism, a service exposing a TCP endpoint can use a direct connection, bypassing the Service Bus Relay service once the service and client have both been authenticated. The coordination of this direct connection is governed by the Service Bus Relay service. The direct socket connection algorithm uses only outbound connections for firewall traversal and relies on a mutual port prediction algorithm for NAT traversal. The NAT traversal algorithm is dependent on a very narrowly timed coordination and a best-guess prediction about the expected NAT behavior; consequently, the algorithm tends to have a very high success rate for home and small business scenarios with a small number of clients, but degrades in its success rate with larger NATs. If a direct connection can be established, the relayed connection is automatically upgraded to the direct connection. If the direct connection cannot be established, the connection will revert back to passing messages through the Service Bus Relay service.

Figure 9: Establishing a direct connection over TCP/SSL

• An application running at a partner organization requires controlled access to your service hosted on-premises. The client application is built by using a technology other than WCF. In this scenario, a client application can use the HTTP REST interface exposed by the Service Bus Relay service to locate an on-premises service and send it requests.

Figure 10: Connecting to an on-premises service by using HTTP REST requests
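As a rough sketch of how a non-WCF client might approach this scenario, the client could first obtain a Simple Web Token from ACS over the WRAP protocol and then present it when calling the relayed endpoint. The namespace, issuer credentials, and service path below are hypothetical, and the exact token-service details may differ for your configuration.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;
using System.Web;

class RestClient
{
    static void Main()
    {
        string ns = "treyresearch";

        // 1. Request a token from the namespace's ACS endpoint (WRAPv0.9).
        var acs = new WebClient();
        var values = new NameValueCollection
        {
            { "wrap_name", "owner" },
            { "wrap_password", "<issuer-key>" },
            { "wrap_scope", "http://" + ns + ".servicebus.windows.net/" }
        };
        byte[] response = acs.UploadValues(
            "https://" + ns + "-sb.accesscontrol.windows.net/WRAPv0.9/", values);

        // 2. The response is form-encoded; extract and decode the token field.
        string body = Encoding.UTF8.GetString(response);
        string token = HttpUtility.UrlDecode(
            HttpUtility.ParseQueryString(body)["wrap_access_token"]);

        // 3. Call the relayed service, presenting the token for authorization.
        var client = new WebClient();
        client.Headers["Authorization"] = "WRAP access_token=\"" + token + "\"";
        string result = client.DownloadString(
            "https://" + ns + ".servicebus.windows.net/ordersservice/status");
        Console.WriteLine(result);
    }
}
```

A client written in any other technology (Java, Ruby, and so on) would follow the same two steps with its own HTTP stack.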

• An application running in the cloud or at a partner organization requires controlled access to your service hosted on-premises. Your service is built by using a technology other than WCF. You may have existing services that you have implemented using a technology such as Perl or Ruby. In this scenario, the service can use the HTTP REST interface to connect to the Service Bus Relay service and await requests.

Figure 11: Connecting an on-premises service built by using Ruby to the Service Bus Relay

• An application running in the cloud or at a partner organization submits requests that can be sent to multiple services hosted on-premises. In this scenario, a single request from a client application may be sent to, and processed by, more than one service running on-premises. Effectively, the message from the client application is multicast to all on-premises services registered at the same endpoint with the Service Bus Relay service. All messages sent by the client are one-way; the services do not send a response. This approach is useful for building event-based systems: each message sent by a client application constitutes an event, and services can transparently subscribe to this event by registering with the Service Bus Relay service.

Figure 12: Multicasting using the Service Bus Relay

You can also implement an eventing system by using Service Bus Topics and Subscriptions, which provide more flexibility. Service Bus Topics and Subscriptions are described in Chapter 7, "Implementing Business Logic and Message Routing."

• An application running on-premises or at a partner organization requires controlled access to your service hosted in the cloud. Your service is implemented as an Azure worker role and is built using WCF. This scenario is the converse of the situations described in the previous cases. In many situations, an application running on-premises or at a partner organization can access a WCF service implemented as a worker role directly, without the intervention of Service Bus Relay. However, this scenario is valuable if the WCF service was previously located on-premises, and code running elsewhere connected to it via Service Bus Relay as described in the preceding examples, but the service has now been migrated to Azure. Refactoring the service, relocating it to the cloud, and publishing it through Service Bus Relay enables you to retain the same endpoint details, so client applications do not have to be modified or reconfigured in any way; they just carry on running as before. Service Bus Relay is very lightweight and efficient, so publishing the relocated service in this way imposes little overhead.

Figure 13: Routing requests to a worker role through Service Bus Relay

Guidelines for Naming Services in Service Bus Relay

If you have a large number of services, you should adopt a standardized convention for naming the endpoints for these services. This will help you manage, protect, and monitor services and the client applications that connect to them. Many organizations commonly adopt a hierarchical approach. For example, if Trey Research had sites in Chicago and New York, each of which provided ordering and shipping services, an administrator might register URIs following the naming convention shown in this list:

• sb://treyresearch.servicebus.windows.net/chicago/ordersservice
• sb://treyresearch.servicebus.windows.net/chicago/shippingservice
• sb://treyresearch.servicebus.windows.net/newyork/ordersservice
• sb://treyresearch.servicebus.windows.net/newyork/shippingservice

However, when you register the URI for a service with Service Bus Relay, no other service can listen on any URI scoped at a lower level than your service. What this means is that if, in the future, Trey Research decided to implement an additional orders service for exclusive customers, they could not register it by using a URI such as sb://treyresearch.servicebus.windows.net/chicago/ordersservice/exclusive. To avoid problems such as this, you should ensure that the initial part of each URI is unique. You can generate a new GUID for each service, and prepend the city and service name elements of the URI with this GUID. In the Trey Research example, the URIs for the Chicago services, including the exclusive orders service, could be (where {GUID1}, {GUID2}, and {GUID3} are placeholders for distinct GUIDs):

• sb://treyresearch.servicebus.windows.net/{GUID1}/chicago/ordersservice
• sb://treyresearch.servicebus.windows.net/{GUID2}/chicago/shippingservice
• sb://treyresearch.servicebus.windows.net/{GUID3}/chicago/ordersservice/exclusive
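A small sketch of the GUID-prefixing convention described above, using the Azure SDK's helper for composing service URIs; the namespace and path elements are the illustrative Trey Research values:

```csharp
using System;
using Microsoft.ServiceBus;

class NamingExample
{
    static void Main()
    {
        // A fresh GUID guarantees that this URI prefix is unique, so no other
        // registration can conflict with a lower-scoped path.
        string prefix = Guid.NewGuid().ToString("N");

        Uri ordersUri = ServiceBusEnvironment.CreateServiceUri(
            "sb", "treyresearch", prefix + "/chicago/ordersservice");

        // Produces an address of the form
        // sb://treyresearch.servicebus.windows.net/<guid>/chicago/ordersservice
        Console.WriteLine(ordersUri);
    }
}
```

The generated prefix would be recorded alongside the service registration so that client applications can be given the full URI.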

For more information about naming guidelines for Service Bus Relay services, see "AppFabric Service Bus – Things You Should Know – Part 1 of 3 (Naming Your Endpoints)" at http://windowsazurecat.com.

Guidelines for Securing Service Bus Relay

Service Bus Relay endpoints are organized by using Service Bus namespaces. When you create a new service that communicates with client applications by using Service Bus Relay, you can generate a new service namespace, either by using the Azure Management Portal or programmatically. This namespace must be unique, and it determines the URI that your service exposes; client applications specify this URI to connect to your service through Service Bus Relay. For example, if you create a namespace with the value TreyResearch and you publish a service named OrdersService in this namespace, the full URI of the service is sb://treyresearch.servicebus.windows.net/ordersservice.

The services that you expose through Service Bus Relay can provide access to sensitive data and are themselves valuable assets; therefore you should protect these services. Remember that even though Service Bus is managed and maintained by one or more Microsoft datacenters, applications connect to Service Bus Relay across the Internet. Unauthorized applications that can connect to your Service Bus namespaces can implement common attacks, such as Denial of Service to disrupt your operations, or Man-in-the-Middle to steal data as it is passed to your services. Consequently, you should protect your Service Bus namespaces and the services that use them as carefully as you would defend your on-premises assets. There are several facets to this task:

• You should restrict access to your Service Bus namespace to authenticated services and client applications only. This requires that each service and client application runs with an associated identity that the Service Bus Relay service can verify. As described in the section "Windows Azure Service Bus Authentication and Authorization" in Chapter 5, "Authenticating Users and Authorizing Requests," Service Bus includes its own identity provider as part of the Azure Access Control Service (ACS), and you can define identities and keys for each service and for each user running a client application. You can also implement federated security through ACS to authenticate requests against a security service running on-premises or at a partner organization. When you configure access to a service exposed through Service Bus Relay, the realm of the relying party application with which you associate authenticated identities is the URL of the service endpoint on which the service accepts requests.

Service Bus can also use third-party identity providers, such as Windows Live ID, Google, and Yahoo!, but the default is to use the built-in identity provider included within ACS.

• You should limit the way in which clients can interact with the endpoints published through your Service Bus namespace. Service Bus defines the claim type net.windows.servicebus.action, which has the possible values Send, Listen, and Manage. Using ACS, you can implement a rule group for each URI defined in your Service Bus namespace that maps an authenticated identity to one or more of these claim values. These claims determine whether the service or client application has the appropriate rights to listen for, send, or manage messages. For example, most client applications should only be able to send messages to a service (and obtain a response), while services should only be able to listen for incoming requests.

A common approach used by many services is to define an endpoint behavior that references the transportClientEndpointBehavior element in the configuration file. This element has a clientCredentials element that enables a service to specify the name of an identity and the corresponding symmetric key to verify this identity. When a service starts running and attempts to advertise an endpoint, it provides these credentials to the Service Bus Relay service. ACS itself authenticates the name and key, and if they are valid it generates a Simple Web Token (SWT) containing the claims for this identity, as determined by the rules configured in ACS. These claims are used to determine whether the service should be allowed to create the endpoint. A client application can take a similar approach, except that it specifies the name and symmetric key for the identity running the application rather than that of the service.

Note that using the shared secret token provider is just one way of supplying the credentials for the service and the client application. Other authentication provider options are available, including using SAML tokens and Simple Web Tokens directly. You can also specify a Security Token Service (STS) other than that provided by ACS to authenticate credentials and issue claims. For more details, refer back to Chapter 5.

• When a client application has established a connection to a service through Service Bus Relay, you should carefully control the operations that the client application can invoke. The authorization process is governed by the way in which you implement the service, and it is outside the scope of ACS, although you can use ACS to generate the claimset for an authenticated client; these claims are passed to your service for authorization. If you are using WCF to implement your services, you should consider building a Windows Identity Foundation authorization provider to decouple the authorization rules from the business logic of your service.

For more information about protecting services through Service Bus Relay, see "Securing and Authenticating an AppFabric Service Bus Connection" on MSDN.
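The endpoint behavior described above might be declared in the service's configuration file along the following lines. This is a hedged sketch: the behavior name is arbitrary, and the issuer name and secret are placeholders for the identity defined in ACS.

```xml
<behaviors>
  <endpointBehaviors>
    <behavior name="serviceBusCredentials">
      <!-- Supplies the shared secret that ACS validates before the
           endpoint may be advertised (Listen) or called (Send). -->
      <transportClientEndpointBehavior credentialType="SharedSecret">
        <clientCredentials>
          <sharedSecret issuerName="owner"
                        issuerSecret="[base64-issuer-key]" />
        </clientCredentials>
      </transportClientEndpointBehavior>
    </behavior>
  </endpointBehaviors>
</behaviors>
```

A client application would reference an equivalent behavior from its own endpoint, substituting the name and key of the identity under which it runs.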

• All communications passing between your service and client applications are likely to pass across a public network or the Internet. You should protect these communications by using an appropriate level of data transfer security, such as SSL or HTTPS. If you are using WCF to implement your services, implementing transport security is really just a matter of selecting the most appropriate WCF binding and then setting the relevant properties to specify how to encrypt and protect data.

Figure 14 illustrates the core recommendations for protecting services exposed through Service Bus Relay.

Figure 14: Recommendations for protecting services exposed through Service Bus Relay

Many organizations implement outbound firewall rules that are based on IP address allow-listing. In this configuration, to provide access to Service Bus or ACS you must add the addresses of the corresponding Azure services to your firewall. These addresses vary according to the region hosting the services (at the time of writing, separate address ranges were published for Asia SouthEast, Asia East, Europe North, Europe West, US North/Central, and US South/Central), and they may also change over time, so always obtain the current list before configuring firewall rules. In any case, IP address allow-listing is not really a suitable security strategy for an organization when the target addresses identify a massively multi-tenant infrastructure such as Windows Azure (or any other public cloud platform, for that matter).

Selecting a Binding for a Service

The purpose of Service Bus Relay is to provide a safe and reliable connection to your services running on-premises for client applications executing on the other side of your corporate firewall. Once a service has registered with the Service Bus Relay service, you do not need to open any additional inbound ports in your corporate firewall. However, selecting an appropriate binding for a service that uses Service Bus Relay has an impact on the connectivity for client applications, and on the functionality and security that the transport provides.

If you are familiar with building services and client applications using WCF, you should find Service Bus Relay quite straightforward; much of the complexity associated with protecting the service and authenticating and authorizing requests can be handled transparently, outside the scope of the business logic of the service. As with a regular WCF service, you can use the same types and APIs that you are familiar with in the System.ServiceModel assembly. The Azure SDK includes transport bindings, behaviors, and other extensions in the Microsoft.ServiceBus assembly for integrating a WCF service with Service Bus Relay.

The Microsoft.ServiceBus assembly provides four sets of bindings:

• HTTP bindings: BasicHttpRelayBinding, WSHttpRelayBinding, WS2007HttpRelayBinding, and WebHttpRelayBinding. These bindings are very similar to their standard WCF equivalents (BasicHttpBinding, WSHttpBinding, WS2007HttpBinding, and WebHttpBinding), except that they are designed to extend the underlying WCF channel infrastructure and route messages through the Service Bus Relay service. These bindings open a channel to the Service Bus Relay service by using outbound connections only, and they can operate over HTTP and HTTPS. They offer the same connectivity and feature set as their WCF counterparts; for example, the WS2007HttpRelayBinding binding supports SOAP message-level security, reliable sessions, and transaction flow.

• TCP binding: NetTcpRelayBinding. This binding is functionally equivalent to the NetTcpBinding binding of WCF. It supports duplex callbacks and offers better performance than the HTTP bindings, although it is less portable; client applications connecting to a service using this binding may be required to send requests and receive responses using the TCP binary encoding, depending on how the binding is configured by the service. The network scheme used for addresses advertised through the NetTcpRelayBinding binding is sb rather than the net.tcp scheme used by the WCF NetTcpBinding binding. For example, the address of the Orders service implemented by Trey Research could be sb://treyresearch.servicebus.windows.net/ordersservice. Although this binding does not require you to open any additional inbound ports in your corporate firewall, it does necessitate that you open outbound TCP port 808, and port 828 if you are using SSL. This binding also supports the hybrid mode through the ConnectionMode property (the HTTP bindings do not support this type of connection). The default connection mode for this binding is Relayed, but you should consider setting it to Hybrid if you want to take advantage of the performance improvements that bypassing the Service Bus Relay service provides. However, the NAT prediction algorithm that establishes the direct connection between the service and client application requires that you also open outbound TCP ports 818 and 819 in your corporate firewall. Finally, note that the hybrid connection mode requires that the binding is configured to implement message-level security.

• One-way binding: NetOnewayRelayBinding. This binding implements one-way buffered messaging. A client application sends requests to a buffer managed by the Service Bus Relay service, which delivers the message to the service, ensuring an orderly throughput without swamping the service. This binding is suitable for implementing a service that provides asynchronous operations, as they can be queued and scheduled by the Service Bus Relay service. All communications are one-way, and message delivery and order are not guaranteed. This binding uses a TCP connection for the service, so it requires outbound ports 808 and 828 (for SSL) to be open in your firewall.

• Multicasting binding: NetEventRelayBinding. This binding is a specialized version of the NetOnewayRelayBinding binding that enables multiple services to register the same endpoint with the Service Bus Relay service; N client applications can connect to M services. Client applications can connect using either the NetEventRelayBinding binding or the NetOnewayRelayBinding binding. This binding is ideal for building an eventing system, with the Service Bus Relay service effectively acting as the event hub. However, message delivery is not guaranteed; if a service shuts down before the Service Bus Relay service has forwarded messages to it, then these messages will be lost. Similarly, the order in which messages submitted by a client application are passed to the service is not guaranteed either. As with the NetOnewayRelayBinding binding, this binding uses a TCP connection for the service, so it requires outbound ports 808 and 828 (for SSL) to be open.
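As an illustration of binding selection, the following hypothetical configuration fragment declares a NetTcpRelayBinding that uses the Hybrid connection mode. The binding name is arbitrary, and message-level security is enabled because the hybrid mode requires it, as noted above.

```xml
<system.serviceModel>
  <bindings>
    <netTcpRelayBinding>
      <!-- Hybrid mode starts relayed and upgrades to a direct socket
           connection when NAT traversal succeeds. -->
      <binding name="hybridTcp" connectionMode="Hybrid">
        <security mode="Message" />
      </binding>
    </netTcpRelayBinding>
  </bindings>
</system.serviceModel>
```

Remember that choosing this mode also requires outbound TCP ports 818 and 819 to be open for the NAT prediction algorithm, in addition to ports 808 and 828.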

Service Bus Relay and Azure Connect Compared

There is some overlap in the features provided by Service Bus Relay and Azure Connect. When deciding which of these technologies you should use, consider the following points:

• Service Bus Relay can provide access to services that wrap on-premises resources. These services can act as façades that implement highly controlled and selective access to the wrapped resources. Client applications making requests can be authenticated with a service by using ACS and federated security; they do not have to provide an identity that is defined within your organization's corporate Windows domain. Azure Connect, in contrast, is intended to provide direct access to resources that are not exposed publicly. You protect these resources by defining ACLs, but all client applications using these resources must be provisioned with an identity that is defined within your organization's corporate Windows domain.

• Service Bus Relay maintains a registry of publicly accessible services within an Azure namespace. A client application with the appropriate security rights can query this registry, obtain a list of services and their metadata (if published), and use this information to connect to a service and invoke its operations. This mechanism supports dynamic client applications that discover services at runtime. Azure Connect does not support enumeration of resources; a client application cannot easily discover resources at runtime.

• Client applications communicating with a service through Service Bus Relay can establish a direct connection, bypassing the Service Bus Relay service once an initial exchange has occurred. All Azure Connect requests pass through the Azure Connect Relay service; you cannot make direct connections to resources (although the way in which Azure Connect uses its relay service is transparent).

Implementing a Reliable Communications Channel across Boundaries Using Service Bus Queues

Service Bus Relay imposes a temporal dependency between the services running on-premises and the client applications that connect to them: a service must be running before a client application can connect to it, otherwise the client application will receive an EndpointNotFoundException exception. This limitation applies even with the NetOnewayRelayBinding and NetEventRelayBinding bindings. Additionally, a service may be running, but if a client application cannot reach it because of a network failure, the client will again receive an EndpointNotFoundException exception. Service Bus Relay is therefore heavily dependent on the reliability of the network.

Service Bus Queues are resilient to network failure; as long as a client application can post a message to a queue, it will be delivered when the service is next able to connect to the queue. Service Bus Queues implement reliable, transactional messaging with guaranteed delivery, so messages are never inadvertently lost. Additionally, Service Bus Queues enable you to decouple services from the client applications that use them, whether they are located on-premises or in the cloud, both in terms of functionality (a client application does not have to implement any specific interface or proxy to send messages to a service) and time (a service does not have to be running when a client application posts it a message).

Prior to the availability of Service Bus Queues, Azure provided Message Buffers. These are still available, but they are only included for backwards compatibility. If you are implementing a new system, you should use Service Bus Queues instead.

When you are dealing with message queues, keep in mind that client applications and services can both send and receive messages. The descriptions in this section therefore refer to "senders" and "receivers" rather than client applications and services.

Service Bus Messages

A Service Bus message is an instance of the BrokeredMessage class. It consists of two elements: the message body, which contains the information being sent, and a collection of message properties, which can be used to add metadata to the message.

The message body is opaque to the Service Bus Queue infrastructure and it can contain any application-defined information, as long as this data can be serialized. By default the BrokeredMessage class uses a DataContractSerializer object with a binary XmlDictionaryWriter to perform this task, although you can override this behavior and provide your own XmlObjectSerializer object if you need to customize the way in which the data is serialized. The body of a message can also be a stream. The message body may also be encrypted for additional security. The contents of the message are never visible outside of a sending or receiving application, not even in the Azure Management Portal.

In contrast to the message body, the Service Bus Queue infrastructure can examine the metadata of a message. Some of the metadata items define standard messaging properties which an application can set and that are used by the Service Bus Queues infrastructure for performing tasks such as uniquely identifying a message, specifying the session for a message, indicating the expiration time for a message if it is undelivered, and many other common operations. Additionally, an application can define custom properties and add them to the metadata. These items are typically used to pass additional information describing the contents of the message, and they can also be used by Service Bus to filter and route messages to message subscribers. Service Bus Subscriptions are described in more detail in Chapter 7, "Implementing Business Logic and Message Routing." Messages also expose a number of system-managed read-only properties, such as the size of the message and the number of times a receiver has retrieved the message in PeekLock mode but not completed the operation successfully.

You can build applications and services that utilize Service Bus Queues by using the Azure SDK. This SDK includes APIs for interacting directly with the Service Bus Queues object model, but it also provides bindings that enable WCF applications and services to connect to queues in a similar way to consuming MSMQ queues in an enterprise environment.

Guidelines for Using Service Bus Queues

Service Bus Queues are perfect for implementing a system based on asynchronous messaging.
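To make the message structure concrete, the following sketch shows a serializable body plus standard and custom properties. The OrderMessage type and the property names used in it are hypothetical illustrations, not part of the SDK.

```csharp
// Hypothetical data contract for the message body; any serializable type will do.
[DataContract]
public class OrderMessage
{
    [DataMember] public string ProductId { get; set; }
    [DataMember] public int Quantity { get; set; }
}

// The body is serialized with a DataContractSerializer by default.
var message = new BrokeredMessage(new OrderMessage { ProductId = "P123", Quantity = 2 });

// Standard messaging properties used by the Service Bus infrastructure.
message.MessageId = Guid.NewGuid().ToString();
message.TimeToLive = TimeSpan.FromHours(1);

// A custom property. Unlike the opaque message body, properties are visible
// to the infrastructure and can be used for filtering and routing.
message.Properties["Priority"] = "High";
```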

Service Bus Queues enable a variety of common patterns and can assist you in building highly elastic solutions, as described in the following scenarios:

 A sender needs to post one or more messages to a receiver. Messages should be delivered in order and message delivery must be guaranteed.

This is the most common scenario for using a Service Bus Queue and is the archetypal model for implementing asynchronous processing. A sender posts a message to a queue, and at some later point in time a receiver retrieves the message from the same queue. A Service Bus Queue is an inherently first-in-first-out (FIFO) structure, and by default messages will be received in the order in which they are sent.

The sender and receiver are independent; they may be executing remotely from each other and, unlike the situation when using Service Bus Relay, they do not have to be running concurrently. For example, the receiver may be temporarily offline for maintenance. The queue effectively acts as a buffer, reliably storing messages for subsequent processing even if the service is not running when the sender posts the message.

An important point to note is that although Service Bus Queues reside in the cloud, both the sender and the receiver can be located elsewhere. For example, a sender could be an on-premises application and a receiver could be a service running in a partner organization.

The Service Bus Queue APIs in the Azure SDK are actually wrappers around a series of HTTP REST interfaces. Applications and services built using the Azure SDK, and applications and services built using technologies not supported by the Azure SDK, can all interact with Service Bus Queues.

Figure 15 shows an example architecture where the sender is an on-premises application and the receiver is a worker role running in the cloud. In this example, the on-premises application is built using the Java programming language and is running on Linux, so it performs HTTP REST requests to post messages. The worker role is a WCF service built using the Azure SDK and the Service Bus Queue APIs.
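A minimal end-to-end exchange using the SDK types discussed in this chapter might look like the following sketch. The namespace name, issuer credentials, and queue name are placeholders you would replace with your own values.

```csharp
// Placeholder credentials and addresses - substitute your own namespace values.
var credentials = TokenProvider.CreateSharedSecretTokenProvider("owner", "<issuer-key>");
var address = ServiceBusEnvironment.CreateServiceUri("sb", "treyresearch", string.Empty);
var factory = MessagingFactory.Create(address, credentials);

// Sender: post a message to the "orders" queue.
QueueClient sender = factory.CreateQueueClient("orders");
sender.Send(new BrokeredMessage("New order received"));

// Receiver: retrieve the next message (waiting up to 30 seconds) and mark it
// as processed. The receiver does not have to be running when the sender
// posts the message.
QueueClient receiver = factory.CreateQueueClient("orders", ReceiveMode.PeekLock);
BrokeredMessage message = receiver.Receive(TimeSpan.FromSeconds(30));
if (message != null)
{
    string body = message.GetBody<string>();
    message.Complete();   // Remove the message from the queue.
}
```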

Figure 15 Sending and receiving messages using a Service Bus Queue

 Multiple senders running at partner organizations or in the cloud need to send messages to your system. These messages may require complex processing when they are received.

Service Bus Queues are highly suitable for batch-processing scenarios where the message-processing logic runs on-premises and may consume considerable resources. In this case, you may wish to perform message processing at off-peak hours so as to avoid a detrimental effect on the performance of your critical business components, and you need to optimize the time at which it executes so that you do not impact your core business operations.

To accomplish this style of processing, senders can post requests to a Service Bus Queue while the receiver is offline. The receiving application runs on-premises. At some scheduled time you can start the receiver application running to retrieve and handle each message in turn, in order. When the receiver has drained the queue it can shut down and release any resources it was using.

Figure 16 Implementing batch processing by using a Service Bus Queue

You can use a similar solution to address the "fan-in" problem, where an unexpectedly large number of client applications suddenly post a high volume of requests to a service running on-premises. If the service attempts to process these requests synchronously it could easily be swamped, leading to poor performance and failures caused by client applications being unable to connect. In this case, you could restructure the service to use a Service Bus Queue. Client applications post messages to this queue, and the service processes them in its own time. In this scenario, the Service Bus Queue acts as a load-leveling technology, smoothing the workload of the receiver while not blocking the senders.

 A sender posting request messages to a queue expects a response to these requests.

A message queue is an inherently asynchronous one-way communication mechanism. If a sender posting a request message expects a response, the receiver can post this response to a queue and the sender can receive the response from this queue. Although this is a straightforward mechanism in the simple case with a single sender posting a single request to a single receiver which replies with a single response, in a more realistic scenario there will be multiple senders, receivers, requests, and responses. In implementing a solution, you have to address two problems:

◦ How can you prevent a response message being received by the wrong sender?

◦ If a sender can post multiple request messages, how does the sender know which response message corresponds to which request?

The answer to the first question is to create an additional queue that is specific to each sender. All senders post messages to the receiver using the same queue, but listen for the response on their own specific queues. All Service Bus messages have a collection of properties that you use to include metadata; a sender can populate the ReplyTo metadata property of a request message with a value that indicates which queue the receiver should use to post the response.
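As a sketch of this approach, the sender tags each request with a unique ID and the name of its own response queue, and the receiver copies that ID into the CorrelationId of the reply. The queue names are illustrative, and factory is assumed to be a MessagingFactory created as elsewhere in this chapter.

```csharp
// Sender: post a request carrying a unique ID and the name of the
// sender-specific response queue (names here are illustrative).
var request = new BrokeredMessage("Request payload");
request.MessageId = Guid.NewGuid().ToString();
request.ReplyTo = "sender1-responses";
factory.CreateQueueClient("requests").Send(request);

// Receiver: process the request, then reply on the queue named in
// ReplyTo, correlating the response with the original request.
BrokeredMessage received = factory.CreateQueueClient("requests").Receive();
var response = new BrokeredMessage("Response payload");
response.CorrelationId = received.MessageId;
factory.CreateQueueClient(received.ReplyTo).Send(response);
received.Complete();

// Sender: match the response to the outstanding request.
BrokeredMessage reply = factory.CreateQueueClient("sender1-responses").Receive();
bool matches = (reply.CorrelationId == request.MessageId);
```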

The second problem can be handled by the receiver setting the CorrelationId property of the response message to the value held in the MessageId property of the original request. All messages should have a unique MessageId value, set by the sender. In this way, the sender can determine which response relates to which original request.

Figure 17 Implementing two-way messaging with response queues and message correlation

 You require a reliable communications channel between a sender and a receiver.

As discussed in Chapter 4, "Deploying Functionality and Data to the Cloud", although Service Bus Queues are themselves inherently reliable, the connection to them across the network might not be, and neither might the applications sending or receiving messages. It may therefore be necessary not only to implement retry logic to handle the transient network errors that might occur when posting or receiving a message, but also to incorporate a simple end-to-end protocol between the sender and receiver in the form of acknowledgement messages.

You can extend the message-correlation approach if you need to implement a reliable communications channel based on Service Bus Queues. As a simple example, when a sender posts a message to a queue, it can wait (using an asynchronous task or background thread) for the receiver to respond with an acknowledgement; the CorrelationId property of the acknowledgement message should match the MessageId property of the original request. If no correlating response appears after a specified time interval, the sender can repost the message and wait again. This process can repeat until either the sender receives an acknowledgement, or a specified number of iterations has occurred, in which case the sender gives up and handles the situation as a failure to send the message.

However, it is possible that the receiver has retrieved the message posted by the sender and has acknowledged it, but this acknowledgement has failed to reach the sender. In this case, the sender may post a duplicate message that the receiver has already processed.

To handle this situation, the receiver should maintain a list of message IDs for messages that it has handled. If it receives another message with the same ID, it should simply reply with an acknowledgement message but not attempt to repeat the processing associated with the message.

Important: Do not use the duplicate detection feature of Service Bus Queues to eliminate duplicate messages in this scenario. If you enable duplicate detection, repeated request or acknowledgement messages may be silently removed, causing the end-to-end protocol to fail. For example, if a receiver reposts an acknowledgement message, duplicate detection may cause this reposted message to be removed and the sender will eventually abort. In this case the sender assumes that the receiver has not received or processed the message, while the receiver is not aware that the sender has aborted, possibly causing the system to enter an inconsistent state.

 Your system experiences spikes in the number of messages posted by senders and needs to handle a highly variable volume of messages in a timely manner.

Service Bus Queues are a good solution for implementing load-leveling, preventing a receiver from being overwhelmed by a sudden influx of messages. However, this approach is only useful if the messages being sent are not time sensitive. In some situations, it may be important for a message to be processed within a short period of time. In this case, the solution is to add further receivers listening for messages on the same queue, as shown in Figure 18. This "fan-out" architecture naturally balances the load amongst the receivers as they compete for messages from the queue; the semantics of message queuing prevents the same message from being dequeued by two concurrent receivers. This solution even works for implementing two-way messaging. Senders do not have to be modified in any way, as they continue to post messages to the queue in the same manner as before. A monitor process can query the length of the queue, and dynamically start and stop receiver instances as the queue grows or drains.

Figure 18 Implementing load-balancing with multiple receivers

You can use the Enterprise Library Auto-Scale Application Block to monitor the length of a Service Bus Queue and start or stop worker roles acting as receivers, as necessary. For more information, see the section "Managing Elasticity in the Cloud by Using the Enterprise Library AutoScale Application Block" in Chapter 9, "Maximizing Scalability, Availability, and Performance."

You should bear in mind that a sudden influx of a large number of requests might be the result of a Denial of Service (DoS) attack. To help reduce the threat of such an attack, it is important to protect the Service Bus Queues that your application uses to prevent unauthorized senders from posting messages. For more information, see the section "Service Bus Queues Security Model" later in this chapter.

 A sender posts a series of messages to a queue. The messages have dependencies on each other and must be processed by the same receiver.

In this scenario, a message might convey state information that sets the context for a set of following messages; the same message receiver that handled the first message may be required to process the subsequent messages in the series. In a related case, a sender might need to send a message that is bigger than the maximum message size (currently 256KB). To address this problem, the sender can divide the data into multiple smaller messages. These messages will need to be retrieved by the same receiver and then reassembled into the original large message for processing. In both cases, the same receiver must handle all the messages in the series.

To address these requirements, Service Bus Queues can be configured to support sessions. A session consists of a set of messages that comprise a single conversation. All messages in a session must be handled by the same receiver. You indicate that a Service Bus Queue supports sessions by setting the RequiresSession property of the queue to true when it is created. All messages posted to this queue must have their SessionId property set to a string value. The value stored in this string identifies the session, and all messages with the same SessionId value are considered part of the same session.

A receiver willing to handle the messages in a session calls the AcceptMessageSession method of a QueueClient object. This method establishes a session object that the receiver can use to retrieve the messages for the session, in much the same way as retrieving the messages from an ordinary queue. If multiple receivers are listening to the queue, the AcceptMessageSession method effectively pins the messages in the queue that comprise the session to the receiver and hides them from all other receivers. Another receiver calling AcceptMessageSession will receive messages from the next available session.

Notice that it is possible for multiple senders to post messages with the same session id, in which case all of these messages are treated as belonging to the same session. Figure 19 shows two senders posting messages using distinct session ids. In this example each sender generates its own session, and the receivers for each session only handle the messages posted to that session.
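The session mechanism described above can be sketched as follows. The queue name, session id, and payloads are illustrative, and the factory, address, and credentials objects are assumed to have been created as shown earlier in this chapter.

```csharp
// Create a queue that requires sessions (illustrative names throughout).
var namespaceManager = new NamespaceManager(address, credentials);
namespaceManager.CreateQueue(new QueueDescription("orderfragments")
{
    RequiresSession = true
});

// Sender: every message in the conversation carries the same SessionId.
QueueClient sender = factory.CreateQueueClient("orderfragments");
for (int i = 0; i < 3; i++)
{
    sender.Send(new BrokeredMessage("Fragment " + i) { SessionId = "order-42" });
}

// Receiver: accept the next available session; its messages are pinned
// to this receiver and hidden from all other receivers.
QueueClient receiver = factory.CreateQueueClient("orderfragments", ReceiveMode.PeekLock);
MessageSession session = receiver.AcceptMessageSession();
BrokeredMessage message;
while ((message = session.Receive(TimeSpan.FromSeconds(5))) != null)
{
    // Reassemble or process the fragments in order, then complete each one.
    message.Complete();
}
```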

Figure 19 Using sessions to group messages

You can also establish duplex sessions if the receiver needs to send a set of messages as a reply. You achieve this by setting the ReplyToSessionId property of a response message with the value in the SessionId property of the received messages before replying to the sender. The sender can then establish its own session and use the session ID to correlate the messages in the response session with the original requests.

A message session can include session state information, stored with the messages in the queue. You can use this information to track the work performed during the message session and implement a simple finite state machine in the message receiver. When a receiver retrieves a message from a session, it can store information about the work performed while processing the message in the session state and write this state information back to the session in the queue. If the receiver should fail or terminate unexpectedly, another instance can connect to the session, retrieve the session state information, and continue the necessary processing where the previous failed receiver left off. Use the GetState and SetState methods of a MessageSession object to retrieve and update the state information for a message session. Figure 20 illustrates this scenario.
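One possible way to checkpoint progress with session state is sketched below. The progress format and the ProcessMessage method are hypothetical application details, and session is an assumed MessageSession obtained from AcceptMessageSession.

```csharp
// After processing a message, record progress in the session state so that
// another receiver can resume if this one fails. The "processed:" format
// is purely illustrative.
BrokeredMessage message = session.Receive();
ProcessMessage(message);   // Hypothetical application logic.
using (var state = new MemoryStream(Encoding.UTF8.GetBytes("processed:" + message.MessageId)))
{
    session.SetState(state);   // Write the state back to the session in the queue.
}
message.Complete();

// A replacement receiver can later reconnect to the session and inspect
// the saved state to decide where to pick up.
Stream saved = session.GetState();
```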

Figure 20 Retrieving and storing message session state information

It is possible for a session to be unbounded; there might be a continuous stream of messages posted to a session at varying and sometimes lengthy intervals. In this case, the message receiver should be prepared to hibernate itself if it is inactive for a predefined duration. When a new message appears for the session, another process monitoring the system can reawaken the hibernated receiver, which can then resume processing.

 A sender needs to post one or more messages to a queue as a singleton operation. If some part of the operation fails, then none of the messages should be sent.

The simplest way to implement a singleton operation that posts multiple messages is by using a local transaction. You initiate a local transaction by creating a TransactionScope object. In effect, this is a programmatic construct that defines the scope for a set of tasks that comprise a single transaction. To post a batch of messages as part of the same transaction, you should invoke the send operation for each message within the context of the same TransactionScope object. If you are sending messages asynchronously (the recommended approach), the messages are simply buffered and are not actually sent until the transaction completes. To ensure that all the messages are actually sent, you must complete the transaction successfully. If the transaction fails, none of the messages are sent; the buffered messages are instead discarded. For more information about the TransactionScope class, see the topic "TransactionScope Class" on MSDN at http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx.

However, the resource manager that wraps Service Bus Queues cannot share transactions with other resource managers. Therefore, if you are incorporating operations from other transactional sources, such as a SQL Server database, then these operations cannot be performed within the context of the same TransactionScope object.
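A sketch of a singleton send using a local transaction follows. The queue name and payloads are illustrative, and client is an assumed QueueClient created as shown earlier.

```csharp
// Post two messages as an atomic unit; if Complete is not called on the
// scope, neither message is sent. Requires a reference to System.Transactions.
using (var scope = new TransactionScope())
{
    client.Send(new BrokeredMessage("Order header"));
    client.Send(new BrokeredMessage("Order detail"));

    // Committing the scope releases the buffered messages to the queue.
    scope.Complete();
}
```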
Additionally, it may not be feasible to send messages within the context of a TransactionScope object.

In these scenarios, you can implement a custom pseudo-transactional mechanism based on manual failure detection, retry logic, and the duplicate message elimination ("de-dupe") feature of Service Bus Queues. Using this feature, a sender application can simply attempt to resend a message as part of its failure/retry processing in the event of failure in its business logic. If the message had been successfully posted previously, the duplicate will be eliminated. This approach guarantees that the message will always be sent at least once (assuming that the sender has a working connection to the queue), but it is not possible to easily withdraw the message if the failure processing in the business logic determines that the message should not be sent at all. For guidance on how to implement custom transactional messaging, see the section "Retry Logic and Transactional Messaging" in Chapter 4.

To use de-dupe, each message that a sender posts to a queue should have a unique message ID. If two messages are posted to the same queue with the same message ID, both messages are assumed to be identical and duplicate detection will remove the second message; the receiver will only see the first message. You enable duplicate detection by setting the RequiresDuplicateDetection property of the queue to true when you create it. It is not possible to change the value of this property on a queue that already exists. Additionally, you should set the DuplicateDetectionHistoryTimeWindow property to a TimeSpan value that indicates the period during which duplicate messages for a given message ID are discarded; if a new message with an identical message ID appears when this period has expired then it will be queued for delivery.

 A receiver retrieves one or more messages from a queue, again as part of a transactional operation. If the transaction fails, then all messages must be replaced into the queue to enable them to be read again.

A message receiver can retrieve messages from a Service Bus Queue by using one of two receive modes: ReceiveAndDelete and PeekLock. In ReceiveAndDelete mode, messages are removed from the queue as soon as they are read. The ReceiveAndDelete receive mode provides better performance than the PeekLock receive mode, but PeekLock provides a greater degree of safety; in ReceiveAndDelete mode, if the receiver fails after reading the message then the message will be lost.

In PeekLock mode, messages are not removed from the queue as they are read, but rather they are locked to prevent them from being retrieved by another concurrent receiver, which will retrieve the next available unlocked message instead. If the receiver successfully completes any processing associated with the message, it can call the Complete method of the message, which removes the message from the queue. If the receiver is unable to handle the message successfully, it can call the Abandon method, which releases the lock but leaves the message on the queue. This approach is suitable for performing asynchronous receive operations.
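The de-dupe configuration described in this section might be sketched as follows. The queue name, time window, and message ID are illustrative, and namespaceManager and client are assumed objects created as shown earlier.

```csharp
// Create a queue with de-dupe enabled; this setting cannot be changed
// after the queue has been created. Names and the window are illustrative.
namespaceManager.CreateQueue(new QueueDescription("payments")
{
    RequiresDuplicateDetection = true,
    DuplicateDetectionHistoryTimeWindow = TimeSpan.FromMinutes(10)
});

// A retrying sender reuses the same MessageId on each attempt; if an
// earlier attempt actually succeeded, the queue silently discards the resend.
var payment = new BrokeredMessage("Payment details");
payment.MessageId = "payment-0001";
client.Send(payment);
```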

If you attempt to use a TransactionScope object to perform local transactions that enlist a Service Bus Queue and other resource managers, your code will throw an exception. As with message send operations, a message receive operation performed using PeekLock mode can also be performed synchronously as part of a local transaction by defining a TransactionScope object, provided that the transaction does not attempt to enlist any additional resource managers. In PeekLock mode, if the receive operation is not successfully completed and the message processed, it is abandoned and returned to the queue, from where it can be read again. If the transaction does not complete successfully, all messages received and processed within the context of the TransactionScope object will be replaced on the queue. Only PeekLock mode respects local transactions; ReceiveAndDelete mode does not. The default receive mode for a Service Bus Queue is PeekLock.

 A receiver needs to examine the next message on the queue, but should only dequeue the message if it is intended for that receiver.

In this scenario, the receiver can retrieve the message by using the PeekLock receive mode, copy the message into a local buffer, and examine it. If the message is not intended for this receiver, it can quickly call the Abandon method of the message to make it available to another receiver.

If a message's contents are confidential and should only be read by a specific receiver, the sender can encrypt the message body with a key that only the correct receiver knows. The sender can also populate the To property of the message with an identifier that specifies the correct receiver. Message properties are not encrypted, so any receiver can retrieve the message, but if it does not recognize the address in the To property it will probably not have the appropriate key to decrypt the message contents, so it can abandon the message and leave it for the correct receiver.
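A sketch of this examine-and-abandon technique follows. The receiver identifier is illustrative, and receiver is an assumed QueueClient operating in PeekLock mode.

```csharp
// Peek-lock the next message; abandon it quickly if it is addressed to a
// different receiver. "receiver-A" is an illustrative identifier.
BrokeredMessage message = receiver.Receive(TimeSpan.FromSeconds(10));
if (message != null)
{
    if (message.To == "receiver-A")
    {
        message.Complete();   // This message was intended for us.
    }
    else
    {
        message.Abandon();    // Release the lock for the correct receiver.
    }
}
```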

Figure 21 Using PeekLock with encryption to examine messages without dequeueing

As an alternative approach to implementing this scenario, you could consider using a Service Bus Topic with a separate Service Bus Subscription for each receiver. However, using subscriptions can become unwieldy and difficult to manage if there are a large or variable number of receivers. For more information about creating Service Bus Topics and Subscriptions, see Chapter 7, "Implementing Business Logic and Message Routing."

Guidelines for Sending and Receiving Messages Using Service Bus Queues

You can implement the application logic that sends and receives messages using a variety of technologies:

 You can use the Service Bus Queue APIs in the Azure SDK.

 You can use the Service Bus Queue bindings to connect to a queue from WCF client applications and services. For more information, see "How to: Build an Application with WCF and Service Bus Queues" on MSDN.

 If you are building applications that connect to Service Bus Queues by using a technology that does not support the Azure SDK, you can use the HTTP REST interface exposed by Service Bus. For more information, see "Service Bus REST API Reference" on MSDN.

Sending and Receiving Messages Asynchronously

If you are using the Azure SDK, you can implement applications that send and receive messages by using the MessageSender and MessageReceiver classes in the Microsoft.ServiceBus.Messaging namespace. These types expose the messaging operations described earlier in this chapter and in Chapter 4. The basic functionality for sending and receiving messages is available through the Send and Receive methods of these types. However, these operations are synchronous. For example, the Send method of the MessageSender class waits for the send operation to complete before continuing, and similarly the Receive method of the MessageReceiver class either waits for a message to be available or until a specified timeout period has expired. Remember that these methods are really just façades in front of a series of HTTP REST requests, and that the Service Bus Queue is a remote service being accessed over the Internet. Therefore, your applications should assume that:

 Send and receive operations may take an arbitrarily long time to complete, and your application should not block waiting for these operations to finish.

The MessageSender class exposes an asynchronous version of the Send method, and the MessageReceiver class provides an asynchronous implementation of the Receive method, through the BeginSend/EndSend and BeginReceive/EndReceive method pairs. These methods follow the standard asynchronous pattern implemented throughout the .NET Framework. You should use these methods in preference to their synchronous counterparts. For more information and good practice for following this approach, see "Best Practices for Leveraging Windows Azure Service Bus Brokered Messaging API" on MSDN.

 A sender can post messages at any time, and a receiver may need to listen for messages on more than one queue.

There are many possible solutions to this problem, but the most common approaches involve using a thread or task to wait for messages on each queue, and triggering an event when a message is available. Application code that catches this event can then process the message. For example, you can define an async method that makes use of the await operator available with the Visual Studio Async extensions to create a series of tasks that wait for a message on each possible queue and raise an event. You can then use a framework such as the Microsoft Reactive Extensions to catch these events and process messages as they become available. For more information about following this approach, see the document "The Task-based Asynchronous Pattern" in the Microsoft Download Center (document id=19957).

You can also use the QueueClient class in the Microsoft.ServiceBus.Messaging namespace to connect to a queue and send and receive messages. The QueueClient type is an abstract class that implements a superset of the functionality available in the MessageSender and MessageReceiver classes. The same issues arise with other operations, such as determining whether a queue exists, creating and deleting a queue, connecting to a queue, and querying the length of a queue; you should perform these operations following the same robust, asynchronous approach.

Apart from Queues, Service Bus implements other related messaging mechanisms, Topics and Subscriptions, both described in Chapter 7. The Azure SDK provides additional types for sending messages to Topics (TopicClient) and receiving messages from Subscriptions (SubscriptionClient). The MessageSender and MessageReceiver classes abstract the differences between these types. For example, a MessageReceiver object enables you to retrieve messages from a Queue and a Subscription by using the same code. Similarly, if you use a MessageSender to send messages using Queues, you can switch to using Topics without requiring that you modify your code. However, before you modify all of your existing code to use MessageSender and MessageReceiver objects, be aware that not all of the functionality implemented by the QueueClient, TopicClient, and SubscriptionClient types is available in the MessageSender and MessageReceiver classes; for example, the MessageReceiver class does not support sessions.

If a receiver expects to handle multiple messages as part of a business operation, you can optimize the receiving process by using the prefetch functionality of the QueueClient class. By default, when a QueueClient object performs the Receive method, only the next available message is taken from the queue. You can, however, set the PrefetchCount property of the QueueClient object to a positive integer value, and the Receive method will actually pull the specified number of messages from the queue (if they are available). The messages are buffered locally with the receiving application and are no longer available to other receivers. The Receive method then returns the first message from this buffer. Subsequent calls to Receive retrieve the remaining messages from the buffer until it is empty, at which point the next Receive operation will fetch the next batch of messages from the queue and buffer them. This approach makes more efficient use of the network bandwidth at the cost of lengthening the time taken to retrieve the first message; subsequent messages are returned much more quickly.
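The prefetch optimization might be sketched as follows; the queue name and batch size are illustrative, and factory is an assumed MessagingFactory.

```csharp
// Fetch messages in batches of 20 to reduce the number of round-trips.
// Buffered messages remain subject to the queue's lock timeout.
QueueClient batchReceiver = factory.CreateQueueClient("orders", ReceiveMode.PeekLock);
batchReceiver.PrefetchCount = 20;

BrokeredMessage next;
while ((next = batchReceiver.Receive(TimeSpan.FromSeconds(5))) != null)
{
    // The first call pulls up to 20 messages from the queue; subsequent
    // calls are served from the local buffer until it is empty.
    next.Complete();
}
```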

Prefetched messages are subject to the same timeout semantics as unbuffered messages; if they are not processed within the timeout period starting from when they are fetched from the queue, then the receiver is assumed to have failed and the messages are returned back to the queue. Therefore, if you implement prefetching, you should only buffer sufficient messages that the receiver can retrieve and process within this timeout period.

When you retrieve a message from a queue, you can read it by using the GetBody method of the BrokeredMessage class. This method deserializes the message body. You can only deserialize this data once; if you attempt to call the GetBody method on the same message again (inside an exception handler, for example), it will fail. Therefore, if you anticipate requiring repeated access to the data in the message body, you should store this data in an appropriate object and use this object instead.

 Send and receive operations could fail for a variety of reasons, such as a failure in connectivity between your application and the Service Bus in the cloud; a security violation caused by a change in the security implemented by the Service Bus Queue (an administrator might decide to revoke or modify the rights of an identity for some reason); the queue being full (queues have a finite size); and so on. Some of these failures might be the result of transient errors, while others may be more permanent. This is an important factor to bear in mind when designing your fault-handling logic.

You should consider using the Transient Fault-Handling Application Block, described in Chapter 4, for detecting and retrying operations that can fail due to transient errors. The section "Sending Messages Asynchronously" later in this chapter includes an example showing how Trey Research implemented transient fault handling when sending messages asynchronously. Operations that perform send and receive operations asynchronously must also incorporate the appropriate cancellation handling to enable any background threads to be tidied up appropriately and messaging resources released.

Scheduling, Expiring, and Deferring Messages

When a sender posts a message to a queue, by default it is immediately available for a receiver to retrieve and process. However, you can arrange for a message to remain invisible when it is first sent and only appear on the queue at a later time. This technique is useful for scheduling messages that should only be processed after a particular point in time; for example, the data could be time sensitive and may not be released until after midnight. To specify the time when the message should appear on the queue and be available for processing, set the ScheduledEnqueueTimeUtc property of the BrokeredMessage object.

When a sender posts a message to a queue, that message might wait in the queue for some considerable time before a receiver picks it up. The message might have a lifetime after which it becomes stale and the information that it holds is no longer valid; in this case, if the message has not been received then it should be silently removed from the queue. You can achieve this by setting the TimeToLive property of the BrokeredMessage object when the sender posts the message.

In some situations, an application may not want to process the next available message but skip over it, retrieve subsequent messages, and only return to the skipped message later. You can achieve this by deferring the message, using the Defer method of the BrokeredMessage class. To implement this mechanism, an application must retrieve messages by using PeekLock mode. The Defer method leaves the message on the queue, but it is locked and unavailable to other receivers.
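A sketch combining scheduling, expiry, and deferral follows. The times, payloads, and the client and receiver objects are illustrative assumptions based on the properties and methods named above.

```csharp
// Schedule a message to appear after midnight UTC and expire six hours later.
var scheduled = new BrokeredMessage("Nightly price update");
scheduled.ScheduledEnqueueTimeUtc = DateTime.UtcNow.Date.AddDays(1);
scheduled.TimeToLive = TimeSpan.FromHours(6);   // Silently removed if unread.
client.Send(scheduled);

// Receiver: skip a message for now and come back to it later by its
// sequence number (PeekLock mode is required for Defer).
BrokeredMessage message = receiver.Receive();
long sequenceNumber = message.SequenceNumber;
message.Defer();

// ...process other messages, then retrieve the deferred one explicitly.
BrokeredMessage deferred = receiver.Receive(sequenceNumber);
deferred.Complete();
```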

At the appropriate juncture, the application can return to the deferred message, and then finish by calling the Complete or Abandon methods as described earlier in this chapter. Note that if the application fails, the lock eventually times out and the message becomes available in the queue. You can specify the lock duration by setting the LockDuration property of the queue when it is created.

Guidelines for Securing Service Bus Queues

Service Bus Queues provide a messaging infrastructure for business applications. They are created and managed by Windows Azure, in the cloud. Consequently they are reliable and durable; once a sender has posted a message to a queue it will remain on the queue until it has been retrieved by a receiver or it has expired. All communications with a Service Bus Queue occur over a TCP channel, encrypted by using SSL, and the namespace holding your queues should only be made available to authenticated senders and receivers.

A Service Bus Queue is held in a Service Bus namespace identified by a unique URI. You establish this URI when you create the namespace. The Service Bus namespace provides the protection context for a queue, and the URI structure is similar to that described in the section "Service Bus Relay Security Model" earlier in this chapter. You protect namespaces by using ACS, in a manner very similar to that described in the section "Guidelines for Securing Service Bus Relay" earlier in this chapter, except that the realm of the relying party application is the URI of the Service Bus namespace with the name of the queue, Topic, or Subscription appended to the end (such as http://treyresearch.servicebus...), rather than a WCF service. You can create an ACS rule group for this URI and assign the net.windows.servicebus.action claim type values Send, Listen, and Manage to authenticated identities.

You should note that the Send and Listen claims each confer a very minimal set of privileges, enabling an application to post messages to a queue or retrieve messages from a queue respectively, but very little else. If your application needs to perform tasks such as creating a new queue, querying the number of messages currently posted to a queue, or even simply determining whether a queue with a given name exists, the application must run with an identity that has been granted the rights associated with the Manage claim.

An application instantiates a MessagingFactory object using the namespace URI. The MessagingFactory object can then be used to create a MessageSender or MessageReceiver object that connects to the queue.

If you need to implement additional security at the message level, you should encrypt the contents of messages, and the receiver should be provided with the decryption key. In this way, if a message is somehow intercepted by a rogue receiver it will not be able to examine the contents of the message. If the valid receiver of a message is not able to decrypt that message, it should be treated as a poison message and moved to the dead letter queue, as described in the section "Dead Letter Queues" in Chapter 4. In the event that a message is no longer useful or valid at the time that it is processed, the application can optionally dead letter it.

How Trey Research Implements Cross-Boundary Communications

The Trey Research example application that is available in conjunction with this guide demonstrates many of the scenarios and implementations described in this chapter. If you have not already done so, now would be a good time to download the example so that you can examine the code it contains as you read through this section of the guide. You can download the example code by following the links at http://wag.
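The pattern described above — an authenticated identity instantiating a MessagingFactory for the namespace URI and then creating a sender — might look as follows. This is a sketch, not sample code from the guide; the namespace, identity name, queue name, and key are placeholders.

```csharp
// Placeholders: namespace, identity, key, and queue name are illustrative.
Uri serviceUri = ServiceBusEnvironment.CreateServiceUri(
    "sb", "treyresearchscenario", string.Empty);

// An identity granted only the Send claim can post messages,
// but cannot create queues or query their properties.
TokenProvider tokenProvider =
    TokenProvider.CreateSharedSecretTokenProvider(
        "sender-identity", "<key omitted>");

MessagingFactory factory = MessagingFactory.Create(serviceUri, tokenProvider);
MessageSender sender = factory.CreateMessageSender("ordersqueue");
sender.Send(new BrokeredMessage("New order"));
```

An identity that also needs to create the queue, or check whether it exists, would require the Manage claim and would use a NamespaceManager object instead.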

The Trey Research example application makes use of the technologies described in this chapter to implement solutions for the following scenarios:

 The Compliance application, which verifies that orders comply with government regulations and export restrictions, is hosted in a VM role. The management and monitoring applications running on-premises receive data from the Compliance application across Azure Connect.
 Trey Research maintains reporting and other statistical information about the orders placed and goods sold. The organization makes this data available to reporting applications hosted externally through a series of services exposed by using Service Bus Relay.
 Trey Research uses various transport partners to physically deliver goods to customers. Trey Research implements a reliable communications mechanism and protocol based on Service Bus Queues, Topics, and Subscriptions to transfer information about orders between the organization and the transport partners.

The following sections describe how Trey Research uses these technologies to communicate between components across the boundary between the cloud and the infrastructure running on-premises.

Connecting to the Compliance Application

This section is provided for information only, showing how a solution could be implemented. The Trey Research example application does not actually include the compliance application or the corresponding VM role.

The VM role that hosts the compliance application examines data held in the SQL Azure Orders database. The VM role is deployed to the US North Data Center, but the compliance application generates reports that are stored in a secure on-premises location within the Trey Research head office infrastructure. The compliance application also sends data to the monitoring application, which is also located on-premises; this application exposes a series of DCOM interfaces to which the compliance application connects for this purpose.

Trey Research implemented a separate small domain, with its own DNS service, in the on-premises infrastructure specifically for hosting the Azure Connect Agent. This domain has a trust relationship with the primary domain within Trey Research, and the management application running in the primary domain can periodically retrieve the reporting data and analyze the information logged by the monitoring application.

Reports are stored in a share protected by using ACLs. Access is granted to an account defined within the domain, and the compliance application provides these credentials when writing reports to this share. The same approach is used to protect the DCOM interfaces exposed by the monitoring application. Figure 22 shows the structure of the compliance system.

Figure 22
Structure of the compliance system

Publishing Reporting Data

Trey Research stores and manages the details of the orders placed by customers by using a SQL Azure database located at each datacenter. The system uses SQL Azure Data Sync to propagate data between datacenters and ensure that they are all synchronized. Additionally, SQL Azure Reporting is configured at the US North Data Center to retrieve, analyze, and format data in summary form for use by the Reporting Service application that runs on-premises. Trey Research decided to make some of this data available to selected external partners through a Web service. Figure 23 shows the conceptual view of these elements of the system.

Figure 23
Conceptual view of the reporting system exposed to external partners

The development team at Trey Research chose to implement a trial version of this functionality while the organization gathered more details on the type of information that external partners should be able to view without impinging on the commercial confidentiality of the business. Therefore, in this first version, Trey Research simply published the total value of goods sold, broken down by reporting quarter or by region. Subsequent versions of this service may provide a more detailed breakdown of the sales data.

The HeadOffice application, which runs on-premises within Trey Research, hosts a WCF service called OrderStatistics that implements the SalesByQuarter and SalesByRegion operations, as defined by the service contract available in the IOrdersStatistics.cs file in the Services folder of the HeadOffice project:

C#
[ServiceContract]
public interface IOrdersStatistics
{
    [OperationContract]
    double SalesByQuarter(int quarter);

    [OperationContract]
    double SalesByRegion(string region);
}

These operations simply retrieve the requested data from the underlying database and pass it back as the return value. The implementation of this service is available in the OrdersStatistics.cs file in the Services folder of the HeadOffice project, and the code that initiates the service is located in the OpenServiceHost method in the Global.asax.cs file, also in the HeadOffice project.

The technique used by the sample application to start the OrderStatistics service is not necessarily applicable to all web applications, and only works in the sample application because the user is expected to explicitly start the web application that hosts the service. In other situations, it may be preferable to utilize the Auto-Start feature of Windows Server AppFabric to initialize the service. For more information, see the topic "Auto-Start Feature" at http://msdn.microsoft.com.

The service is made available to external partners and users through Service Bus Relay, using the service path RelayedOrdersStatistics, and the application publishes the service through a TCP endpoint, using a netTcpRelayBinding binding. The details of the Service Bus namespace, the service path, the security credentials, and the service configuration are stored in the Web.config file of the project. The connection to the Service Bus is secured through ACS by using a secret key, as described in the section "Guidelines for Securing Service Bus Relay" earlier in this chapter. Notice that the web application connects to the Service Bus by using the identity named headoffice; this identity is granted the privileges associated with the Listen claim, enabling the application to accept incoming requests but preventing it from performing other operations such as sending requests:

XML
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  ...
  <appSettings>
    ...
    <!-- Microsoft.ServiceBus config -->
    <add key="serviceBusNamespace" value="treyresearchscenario" />
    <add key="UriScheme" value="sb" />
    <add key="RelayServicePath" value="RelayedOrdersStatistics" />
    ...
  </appSettings>
  ...
  <system.serviceModel>
    <services>
      <service name="HeadOffice.Services.OrdersStatistics">
        <endpoint behaviorConfiguration="SharedSecretBehavior"
                  binding="netTcpRelayBinding"
                  contract="HeadOffice.Services.IOrdersStatistics"
                  name="RelayEndpoint"/>
      </service>
    </services>
    <behaviors>
      <endpointBehaviors>
        <behavior name="SharedSecretBehavior">
          <transportClientEndpointBehavior credentialType="SharedSecret">
            <clientCredentials>
              <sharedSecret issuerName="headoffice"
                            issuerSecret="<data omitted>" />
            </clientCredentials>

          </transportClientEndpointBehavior>
          <serviceRegistrySettings discoveryMode="Public"
              displayName="RelayedOrdersStatistics_Service"/>
        </behavior>
      </endpointBehaviors>
    </behaviors>
    ...
  </system.serviceModel>
</configuration>

The development team also built a simple command-line application to test the connectivity to the OrderStatistics service through Service Bus Relay, and verify that the SalesByQuarter and SalesByRegion operations function as expected. This application is provided in the ExternalDataAnalyzer project. It is a WCF client that establishes a connection to the service by using the Service Bus APIs of the Azure SDK together with the Service Model APIs of WCF.

Like the web application, the ExternalDataAnalyzer project connects to the Service Bus by using a specific identity, externaldataanalyzer, which has been granted the privileges associated with the Send claim, enabling it to submit requests to the service but preventing it from performing other tasks such as listening for requests from other clients. The connection to the Service Bus is secured by providing the appropriate secret key. The endpoint definition for connecting to the service and the security credentials are all defined in the app.config file:

XML
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  ...
  <system.serviceModel>
    ...
    <behaviors>
      <endpointBehaviors>
        <behavior name="SharedSecretBehavior">
          <transportClientEndpointBehavior credentialType="SharedSecret">
            <clientCredentials>
              <sharedSecret issuerName="externaldataanalyzer"
                            issuerSecret="<data omitted>" />
            </clientCredentials>
          </transportClientEndpointBehavior>
        </behavior>
      </endpointBehaviors>
    </behaviors>
    ...
  </system.serviceModel>
</configuration>

Figure 24 summarizes the structure and implementation details of the OrderStatistics service and the ExternalDataAnalyzer client application.
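For illustration, a client such as ExternalDataAnalyzer could also build the same relayed connection in code rather than configuration. The following sketch is an assumption about how that might look with the Service Bus APIs of this SDK generation; the namespace and key are placeholders.

```csharp
// Sketch only: the namespace and secret key are placeholders.
Uri address = ServiceBusEnvironment.CreateServiceUri(
    "sb", "treyresearchscenario", "RelayedOrdersStatistics");

// Equivalent of the <transportClientEndpointBehavior> configuration:
// authenticate to ACS as the externaldataanalyzer identity (Send claim).
var credentials = new TransportClientEndpointBehavior
{
    CredentialType = TransportClientCredentialType.SharedSecret
};
credentials.Credentials.SharedSecret.IssuerName = "externaldataanalyzer";
credentials.Credentials.SharedSecret.IssuerSecret = "<key omitted>";

// Combine the WCF Service Model APIs with the relay binding.
var factory = new ChannelFactory<IOrdersStatistics>(
    new NetTcpRelayBinding(), new EndpointAddress(address));
factory.Endpoint.Behaviors.Add(credentials);

IOrdersStatistics proxy = factory.CreateChannel();
double sales = proxy.SalesByQuarter(1);
```

The configuration-based approach used by the sample keeps these details out of the compiled code, which is usually preferable for credentials.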

Figure 24
Structure of the OrderStatistics service and ExternalDataAnalyzer client application

Communicating with Transport Partners

Trey Research uses Service Bus Queues, Topics, and Subscriptions to pass messages between the Orders application running in the cloud and the transport partners. During the design phase, the development team at Trey Research insisted that all communications with the transport partners had to fulfill a number of criteria:

 The solution must be responsive. All messaging must operate in a timely manner that does not block the operations in the Orders application or adversely affect the experience of customers placing orders.
 The mechanism must be robust and reliable. The system must not lose orders as messages are passed from the Trey Research solution to the appropriate transport partner. Once a customer places an order, and that order is confirmed, it must be fulfilled.
 The solution must be scalable. It must be possible to easily add further transport partners to the solution without needing to rewrite the application logic. Additionally, it must be possible to host the Orders application at multiple sites in the cloud, again without requiring that the code is rewritten to handle multiple instances of this application.
 All data should be considered sensitive and must be protected appropriately. All reasonable security measures should be taken to prevent an unauthorized third party from intercepting the details of orders placed by a customer.

The following sections describe how the Trey Research solution achieves the first three of these aims. The techniques implemented by Trey Research for securing access to the system are described in the section "How Trey Research Authenticates Users and Authorizes Requests" in Chapter 5.

Implementing Responsive Messaging

The section "Guidelines for Sending and Receiving Messages Using Service Bus Queues" earlier in this chapter highlighted the importance of using asynchronous operations to send and receive messages. The Orders.Shared project contains a library of reusable classes that wrap the existing MessageSender, MessageReceiver, and BrokeredMessage classes (amongst others) found in the Microsoft.ServiceBus.Messaging namespace. The purpose of these classes is to abstract the send and receive functionality so that all send and receive operations are performed asynchronously. This library also encapsulates elements of the security model implemented by Trey Research; for more information, see the section "Authenticating When Creating or Connecting to a Queue or Topic" in Chapter 5.

Sending Messages Asynchronously

To send a message to a Service Bus Queue by using the Orders.Shared library, an application performs the following steps:

1. Create a ServiceBusQueueDescription object, and specify the Service Bus namespace, the Queue name, and a set of valid credentials in the form of an access key and the name of the associated identity. The ServiceBusQueueDescription class is a member of the Orders.Shared project.
2. Create a ServiceBusQueue object using the ServiceBusQueueDescription object. Creating an instance of the ServiceBusQueue type connects to the underlying Service Bus Queue in PeekLock mode.
3. Create a BrokeredMessage object and populate it with the required information. The BrokeredMessage class is the type provided by Microsoft in the Microsoft.ServiceBus.Messaging namespace.
4. Call the Send method of the ServiceBusQueue object. The parameter to the Send method must be a BrokeredMessageAdapter object that wraps the BrokeredMessage object created earlier.

Internally, the ServiceBusQueue class contains an instance of the MessageSenderAdapter class (defined in the Communication\Adapters folder in the Orders.Shared project), which encapsulates asynchronous functionality based on the MessageSender class. The Send method uses this MessageSenderAdapter object to send the message. The ServiceBusTopic and ServiceBusSubscription classes in the Orders.Shared project implement a similar approach to ServiceBusQueue, encapsulating asynchronous functionality based on the MessageSender and MessageReceiver classes respectively. The MessageSenderAdapter, MessageReceiverAdapter, and BrokeredMessageAdapter classes enable the unit tests (in the Orders.Shared.Tests project) to construct mock senders, receivers, and brokered messages. Topics and Subscriptions are described in more detail in Chapter 7.

For an example of using the ServiceBusQueue type to send messages, see the SendToUpdateStatusQueue method in the OrderProcessor class in the TransportPartner project. An abridged form of this code is shown in the next section.
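The four steps above can be sketched in code as follows. The property names on ServiceBusQueueDescription are assumptions based on the description of step 1, not the actual members of the sample library.

```csharp
// Hypothetical walk-through of the four steps; property names on
// ServiceBusQueueDescription are illustrative assumptions.
var description = new ServiceBusQueueDescription
{
    Namespace = "treyresearchscenario",   // Service Bus namespace
    QueueName = "ordersqueue",            // Queue name
    Issuer = "worker",                    // name of the associated identity
    DefaultKey = "<access key omitted>"   // access key
};

// Step 2: connects to the underlying queue in PeekLock mode.
var queue = new ServiceBusQueue(description);

// Step 3: create and populate the brokered message.
var message = new BrokeredMessage("New order for customer 42");

// Step 4: the Send method expects the message wrapped in an adapter.
queue.Send(new BrokeredMessageAdapter(message));
```

Wrapping the message in a BrokeredMessageAdapter is what allows the unit tests to substitute a mock message when exercising the same code path.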

The following code fragment shows the implementation of the Send method in the ServiceBusQueue class, together with the relevant members used by the Send method. The Guard method used by methods in the ServiceBusQueue class and elsewhere checks that the named parameter has been initialized; it should not be null or an empty string.

C#
public class ServiceBusQueue
{
    private readonly ServiceBusQueueDescription description;
    private readonly IMessageSenderAdapter senderAdapter;
    ...

    public ServiceBusQueue(ServiceBusQueueDescription description)
    {
        Guard.CheckArgumentNull(description, "description");

        this.description = description;
        ...
        var sender = messagingFactory.CreateMessageSender(
            this.description.QueueName.ToLowerInvariant());
        this.senderAdapter = new MessageSenderAdapter(sender);
        ...
    }

    public void Send(IBrokeredMessageAdapter message)
    {
        Guard.CheckArgumentNull(message, "message");

        this.Send(message, this.senderAdapter);
    }

    public void Send(IBrokeredMessageAdapter message,
        IMessageSenderAdapter sender)
    {
        Guard.CheckArgumentNull(message, "message");
        Guard.CheckArgumentNull(sender, "sender");

        Task.Factory
            .FromAsync(sender.BeginSend, sender.EndSend, message, null,
                TaskCreationOptions.AttachedToParent)
            .ContinueWith(
                taskResult =>
                {
                    try
                    {
                        if (taskResult.Exception != null)
                        {
                            ...
                            throw taskResult.Exception;
                        }
                    }
                    finally
                    {
                        message.Dispose();
                    }
                });
    }
    ...
}

Notice that the ServiceBusQueue class does not utilize the Transient Fault Handling Application Block. This is because using the Transient Fault Handling Application Block to start asynchronous processes does not provide the same flexibility as using a Task object. When considering using the Transient Fault Handling Application Block, you need to weigh up the advantages of the declarative neatness of the way in which critical code can be executed and retried against the fine control that you may require when running this code as a background task. Additionally, in the ServiceBusQueue class, the processing performed by the Send method requires attaching the processing as a child task by using the TaskCreationOptions.AttachedToParent option. In this way, a failure in the child task while sending the message can be detected and handled by the parent, enabling the parent to abandon the operation more easily.

Receiving and Processing Messages Asynchronously

The ServiceBusQueue class also creates and exposes a MessageReceiver object that you can use to receive messages, through the GetReceiver method. This is an ordinary message receiver object with no additional functionality, and calling the Receive method on this object performs a synchronous receive operation. In its simplest form, a receiver using this technique may spend a lengthy period of time being blocked while waiting for messages to appear. Additionally, when a message arrives, it may require significant effort to perform the processing that it signifies, during which time more messages may arrive. These messages will not be processed until the receiver finishes its current work and retrieves the next message. If a message is urgent, this response may not be acceptable.

To handle this situation, the ServiceBusReceiverHandler type, also in the Orders.Shared project, provides an implementation of a class that can receive and process messages asynchronously. The ServiceBusReceiverHandler class provides a method called ProcessMessages that an application can use to asynchronously wait for messages arriving on a Service Bus Queue and process them (the application specifies the queue to listen on as a parameter to the constructor of this class). The ProcessMessages method expects a delegate as its first parameter. This delegate should reference a method that will be run each time a message is received; the purpose of this delegated method is to perform whatever business logic the application requires on each message (for an example, see the section "Receiving an Order Message and Sending an Order Status Message in a Transport Partner" later in this chapter).

The ProcessMessages method stores this delegate locally and then calls the local ReceiveNextMessage method. The ReceiveNextMessage method implements a simple polling strategy: it sleeps for a configurable period of time before attempting to receive a message from the queue (the message queue is read in PeekLock mode). The receive operation is performed asynchronously, and if a message is available the method starts a new task to listen for any subsequent messages, and then calls the ProcessMessage method to process the newly received message.

The ProcessMessage method uses the delegated method provided by the receiving application to process the message. The polling interval acts as a regulator to help prevent parts of the system becoming overloaded. The ideal value for the polling interval depends on the computing resources available at the transport partner, the expected volume of orders, and the number of worker role instances. For example, specifying a small polling interval if the transport partners have limited computing resources is probably a poor choice, especially during periods of high demand when a large number of orders are generated. In this case a lengthier interval between messages allows the transport partners' systems to function more effectively, with the Topic effectively acting as a load-leveling buffer.

The ProcessMessage method implements a simple but effective policy for handling exceptions raised while receiving or processing messages. If the message is processed successfully, the ProcessMessage method calls the asynchronous version of the Complete method to remove the message from the queue. If a system exception occurs while receiving the message, a subsequent invocation of the ReceiveNextMessage method may be able to read the message successfully if the error was only transient; however, if the same message fails to be received three times, the message is posted to the dead letter queue. If processing fails as the result of an authentication failure (for example, if the SWT token received from the transport partner is invalid), the ProcessMessage method will attempt to abandon the message and release any locks.

Figure 25 illustrates the flow of control through a ServiceBusReceiverHandler object when a receiving application retrieves and processes a message.
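The exception-handling policy just described could be sketched as follows. This is a hypothetical outline, not the actual ServiceBusReceiverHandler code; the method name, the failure-count dictionary, and the exception types caught are illustrative assumptions.

```csharp
// Hypothetical sketch of the policy: complete on success, abandon on
// authentication failure, dead letter after three receive failures.
void HandleNextMessage(MessageReceiver receiver,
    Action<BrokeredMessage> processMessage,
    IDictionary<string, int> receiveFailures)
{
    const int MaxAttempts = 3;

    BrokeredMessage message = receiver.Receive();   // PeekLock mode
    if (message == null) { return; }                // nothing available yet

    try
    {
        processMessage(message);   // delegated business logic
        message.Complete();        // the real code uses the asynchronous version
    }
    catch (UnauthorizedAccessException)
    {
        // Authentication failure (for example, an invalid SWT token):
        // abandon the message and release any locks.
        message.Abandon();
    }
    catch (Exception)
    {
        int attempts;
        receiveFailures.TryGetValue(message.MessageId, out attempts);
        receiveFailures[message.MessageId] = ++attempts;

        if (attempts >= MaxAttempts)
        {
            message.DeadLetter();  // give up after three failures
        }
        else
        {
            message.Abandon();     // return it to the queue for a retry
        }
    }
}
```

Abandoning rather than completing on failure is what makes a retry possible: the PeekLock lock is released and the message becomes visible again.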

Figure 25
Flow of control when receiving messages through a ServiceBusReceiverHandler object

Implementing a Robust and Reliable Protocol for Handling Orders

All messaging between the Trey Research solution and the transport partners is handled through the worker role implemented in the Orders.Workers project, and the sample transport partners defined in the TransportPartner project. The Orders application implements a simple protocol involving order request and order status update messages passing between the worker role and the transport partners. Figure 26 summarizes the physical implementation of the message flow and the various files that contain the principal classes involved in messaging.

Figure 26
Implementation of message flow in the Trey Research solution

Sending an Order Message from the Worker Role

When a customer places an order, the details of the order are stored in the TreyResearch database and a status record is also created, indicating that a customer has placed a new order. As described in the section "Retry Logic and Transactional Messaging" in Chapter 4, "Deploying Functionality and Data to the Cloud", the solution tracks the status of orders in the TreyResearch database, and any order requests that fail to be sent can be retried.

To keep the system responsive, the worker role delegates the responsibility for detecting and posting new orders to a NewOrderJob object, which runs as a separate task on its own thread. Each instance of the worker role queries the database and retrieves all new orders, creates a message containing the delivery details for each order (class NewOrderMessage), and posts it to a Service Bus Topic where it is routed to the appropriate transport partner. Additionally, the NewOrderJob class includes the appropriate logic for detecting order requests that failed to be sent correctly and for retrying to send these messages.

The Run method in the WorkerRole class, in the WorkerRole.cs file in the Orders.Workers project, runs the code for the NewOrderJob object as a separate task. This approach enables the worker role to easily monitor the state of the NewOrderJob object and detect whether the task is still running or has failed, in which case it can log the cause of the exception reported by the task and restart a new instance of the NewOrderJob object. Details describing how the NewOrderJob class is implemented are provided in the section "How Trey Research Routes Messages Across Boundaries" in Chapter 7, "Implementing Business Logic and Message Routing". Service Bus Topics are described in more detail in Chapter 7, but for the purposes of this discussion you can consider a Topic to be the same as a Queue.

Receiving an Order Message and Sending an Order Status Message in a Transport Partner

The Trey Research solution includes two sample transport partners: one that handles deliveries to local customers that reside in the same state or neighboring states as Trey Research, and another that delivers goods to more distant customers. These transport partners are provided in the LocalTransportPartner and DistanceTransportPartner WinForms classes in the TransportPartner project. Both transport partners implement their own systems for tracking and delivering orders.

The local transport partner runs a connector on its own infrastructure that connects directly to the Azure Service Bus to retrieve and send messages. This functionality is implemented in the Connector class in the Connectivity folder. The distance transport partner exposes a service interface, and an adapter running as part of the Trey Research solution interacts with the Service Bus and reformats messages into service calls; responses from the service are reformatted as messages and posted back to the Service Bus. The adapter is implemented in the Adapter class, also located in the Connectivity folder. The Connector and Adapter classes are both descendants of the OrderProcessor class (in the Connectivity folder), and this class actually handles the connectivity between the transport partner and the Service Bus. For more information about the strategy of using connectors and adapters, see the section "Interacting with Partner-specific Services" in Chapter 4.

In the DistanceTransportPartner WinForms class, the flow of control is:

 The OnLoad method instantiates the Adapter object and invokes its Run method.
 The Run method of the Adapter class is inherited from the OrderProcessor class.
 The Run method in the OrderProcessor class creates a ServiceBusReceiverHandler object to connect to the Service Bus Topic on which it expects to receive orders, and calls the ProcessMessages method of this object.
 The first parameter to the ProcessMessages method in the ServiceBusReceiverHandler class is a delegated function (specified as a lambda expression in the sample code) that provides the business logic to be performed when an order is received from the Topic.

When the transport partner receives a request to deliver an order, the connector or adapter (depending on the transport partner) posts an acknowledgement message to a Service Bus Queue. This queue constitutes a well-known but secure endpoint, available to all transport partners.

The following example, taken from the OrderProcessor.cs file, shows how this code is structured:

C#
public void Run()
{
    ...
    var receiverHandler = new ServiceBusReceiverHandler<...>(...);

    receiverHandler.ProcessMessages(
        (message, replyTo, token) =>
        {
            // The message parameter contains the body of
            // the message received from the Topic.
            return Task.Factory.StartNew(
                // Message conversion logic goes here.
                () => this.ProcessMessage(message, ...),
                ...);
        },
        ...);
    ...
}

The ServiceBusReceiverHandler object invokes this function after it has received each message, as described in the section "Implementing Responsive Messaging" earlier in this chapter. This strategy enables you to decouple the mechanics for receiving a message from a Queue or Topic (as implemented by the ServiceBusReceiverHandler class) from the logic for converting this message into the format expected by the transport partner and sending this request to the internal system implemented by the partner.

In the OrderProcessor class, the lambda expression invokes the local ProcessMessage method (not to be confused with ServiceBusReceiverHandler.ProcessMessages) to transmit the message to the local partner's internal system and wait for a response. Because this processing is performed by a separate task, the ProcessMessage method can synchronously wait for the response without affecting the responsiveness of the application.

When the order has been processed by the transport partner, the ProcessMessage method invokes the local SendOrderReceived method of the OrderProcessor object, which in turn calls the SendToUpdateStatusQueue method. This method creates a new BrokeredMessage object wrapping an OrderStatusUpdateMessage. This message includes the status of the order, the correlationId property of the message copied from the orderId of the original request (so that the worker role receiving the message can correlate this response with the original request), and an appropriate security token to identify the sender. The SendToUpdateStatusQueue method then posts this message to the Service Bus Queue. Note that the functionality for sending the message is encapsulated in the ServiceBusQueue class, as described earlier in the section "Implementing Responsive Messaging."

C#
private void SendToUpdateStatusQueue(...)
{
    ...
    foreach (...)
    {
        var updateStatusMessage = new BrokeredMessage(
            new OrderStatusUpdateMessage

            {
                OrderId = orderId,
                Status = orderStatus,
                TrackingId = trackingId,
                TransportPartnerName = ...,
                ...
            });
        updateStatusMessage.CorrelationId = orderId.ToString();
        ...
        ServiceBusQueue replyQueue;
        replyQueue = ...;
        ...
        var brokeredMessageAdapter =
            new BrokeredMessageAdapter(updateStatusMessage);
        replyQueue.Send(brokeredMessageAdapter);
    }
}

Receiving an Order Status Message in the Worker Role

As well as starting a NewOrderJob object to post new orders to transport partners, each worker role in the Trey Research solution creates a StatusUpdateJob object to listen for status messages received from the transport partners. This class is located in the StatusUpdateJob.cs file in the Jobs folder in the Orders.Workers project. In common with the OrderProcessor class, the Run method in the StatusUpdateJob class employs a ServiceBusReceiverHandler object to actually connect to the queue and retrieve messages, and provides the business logic that is run for each status message, as it is received, as a lambda expression.

The lambda expression checks the status message, discards it if it is a duplicate message for an existing order (retry logic in the transport partner may cause it to send duplicate status messages if it detects a transient error), adds a tracking ID if necessary, updates the order status with the name of the transport partner actually shipping the goods to the customer, and then stores this status in the TreyResearch database (other parts of the application can query this status if the customer wishes to know whether an order has been shipped yet, for example). All interactions with the database in the lambda expression referenced by the Run method use the Transient Fault Handling Application Block.

Implementing a Scalable Messaging Solution

The Trey Research solution spans multiple datacenters. Each datacenter hosts its own set of web applications, worker roles, SQL Azure databases, Service Bus Queues, and so on. This design enables the solution to scale, simply by deploying further instances of the various components at additional datacenters. However, when a worker role posts a message containing an order to a queue, it is important for the transport partner to send any order status update messages back to a queue at the correct data center.

When the NewOrderJob processor running as part of the worker role sends an order message to a queue, it includes the details required by the transport partner receiving the message to send the update messages back to the appropriate queue. The Execute method (described in more detail in Chapter 7) populates the CorrelationId and ReplyTo properties of each order message with the OrderId of the new order and the name of the queue that the StatusUpdateJob processor is listening on:

. The Service Bus Relay service in the cloud provides the necessary protection to prevent unauthorized client applications from connecting to the service.UserName. ReplyTo = this. ShippingAddress = orderProcess.Order.Order.OrderId.ToString(). Azure Connect enables you to establish a virtual network connection between a role hosted in the cloud and your on-premises infrastructure.. You can share data across this network connection in a similar manner to accessing resources shared between computers running on-premises.. and this class saves the CorrelationId and ReplyTo properties as it receives and processes each message. var brokeredMessage = new BrokeredMessage(msg) { CorrelationId = msg..UserName }. } The connector and adapters implemented by the transport partners use the OrderProcessor class to receive messages. .. The transport partners invoke the SendToUpdateStatusQueue method of the OrderProcess class to send order status update messages. . . Amount = Convert....OrderId. OrderId = orderProcess.OrderDate.Address. and then acts as a conduit for transparently routing requests from authenticated clients to the services running on-premises.Order.Order.Order.ToString(). SessionId = orderProcess... // Create new order message var msg = new NewOrderMessage { MessageId = orderProcess.Total). . CustomerName = orderProcess. Summary This chapter has looked at the principal Azure technologies that you can use to implement cross-boundary communication in a hybrid application.OrderId. // Send the message. Service Bus Relay enables an organization to implement well-known endpoints that can be accessed from the outside world.C# private void Execute() { . Service Bus Relay provides a secure mechanism for making your on-premises services available to external users.replyQueueName }.ToDouble(orderProcess.Order. 
These technologies provide a foundation that you can use to construct elegant hybrid solutions comprising components that need to communicate across the cloud/onpremises divide... and this method uses the saved data to determine the queue to post the messages to. OrderDate = orderProcess.

Services published through Service Bus Relay are inherently oriented towards the RPC model of communications. In contrast, Service Bus Queues enable you to implement asynchronous messaging that helps to remove the temporal dependency between the client application posting a request and the service receiving it. Message-oriented applications are highly suitable for cloud environments, as they can more easily handle the variable volumes and peak loading typical of many commercial systems, and can easily be made robust enough to handle network and communications failures. Using Service Bus Queues, you can implement a number of common messaging patterns and adapt them as appropriate to the requirements of your system. Service Bus Queues are also the basis of Service Bus Topics and Subscriptions, which support a variety of advanced messaging scenarios; Topics and Subscriptions are described in detail in Chapter 7.

More Information

 "Overview of Windows Azure Connect" on MSDN
 "Windows Azure Service Bus & Windows Azure Connect: Compared & Contrasted" at http://windowsazurecat.com/2011/08/windows-azure-service-bus-connect-compared-and-contrasted/
 "Best Practices for Leveraging Windows Azure Service Bus Brokered Messaging API"
 "Windows Azure AppFabric: An Introduction to Service Bus Relay"
 "An Introduction to Service Bus Queues"
 "Achieving Transactional Behavior with Messaging" on the MSDN blogs
 "Building loosely-coupled apps with Windows Azure Service Bus Topics and Queues" on Channel 9

For links and information about the technologies discussed in this chapter see Chapter 1, "Introduction," of this guide.

7 Implementing Business Logic and Message Routing across Boundaries

A simple, reliable messaging strategy enables secure point-to-point communications between components participating in a distributed system. In many solutions, this mechanism may be built into the application logic, but the system still needs some means for physically directing messages to a destination. The implementation determines the degree of independence that message senders and receivers have from each other. For example, a sender using RPC-style communications to communicate with a WCF service might specify the address of the destination, or an application implementing message-oriented communications might direct messages to a queue that a specific receiver listens on. This opaque approach can make it difficult to change the way in which messages are routed if destinations change their location, or new destinations are added.

This chapter examines some of the common challenges associated with directing and controlling the flow of messages, and presents possible solutions and good practice for decoupling this data flow from the application logic when using Windows Azure.

Use Cases and Challenges

Many hybrid applications must process business rules or workflows that contain conditional tests, and which result in different actions based on the results of the rules. For example, an application may need to update a stock database, send the order to the appropriate transport and warehouse partner, perform auditing operations on the content of the order (such as checking the customer's credit limit), and store the order in another database for accounting purposes. These operations may involve services and resources located both in the cloud and on-premises.

Decoupling the data flow from the business logic of your applications brings many benefits. For example, you can monitor and audit messages without interrupting the natural progression from sender to receiver; by following this strategy you can transparently scale your solution to incorporate additional service instances to process messages during times of heavy load; or you can easily integrate additional services into your solution by extending the list of message destinations, all without reworking the underlying business logic.

Building an extensible solution based on these use cases typically requires that you address the specific challenges described in the following sections.

Separating the Business Logic from Message Routing

Description: As part of its business processing, an application needs to send and receive messages from other services, which may be located anywhere.

The business operations performed by a distributed application are primarily concerned with gathering, examining, and processing data. The gathering and processing aspects may involve sending requests and passing messages to other services, but the underlying business logic should not be tied to the location of these services; if a service migrates to a different address, you should not have to modify the way in which the application works if it is still performing the same business functions. Separating the business logic from the message routing helps to achieve this location independence.

The logic that controls the routing of messages must reside somewhere, but where should it be placed? If you are aiming to completely decouple this flow from the business logic of the application, it should not be managed by the code that sends messages, and equally it should not be incorporated into the elements that receive and process messages. This implies that the data flow logic must be contained within the middleware components that connect senders to receivers. The middleware elements effectively act as a message broker, intelligently routing messages to destinations, independent from components sending and receiving messages. The approach that you take to implement this broker should exhibit the following properties:

 The routing logic must be configurable. Decoupling the message routing from the application logic enables you to partition messages based on administrator-defined criteria; these criteria are independent of the application. For example, to reduce response times an administrator might decide to run multiple instances of a service that processes mortgage requests in the cloud. An application running on a mortgage advisor's desktop can submit mortgage requests, and these requests can be transparently routed to a specific instance of the mortgage processing service based on some attribute of the mortgage advisor, such as their assigned advisor number; requests from advisors 1 through 30 might be directed to one instance of the service, requests from advisors 31 to 60 might be directed to another instance, and so on. As the mortgage application is rolled out in new offices and more advisors come on stream, the administrator can monitor how the workload of the services is divided and, if necessary, run more instances of the service and reconfigure the system to distribute mortgage processing requests more evenly.

 The routing mechanism should be robust and compatible with whichever approach you have taken to implementing reliable messaging, and scalable as the traffic to your services increases.

Designing the data flow to be independent from the implementation of the application logic can help to ensure that your solution is adaptable and extensible if business requirements should quickly change.

Routing Messages to Multiple Destinations

Description: As part of its business processing, an application may need to send the same message to any number of services, and this list of services may vary over time.

It is frequently necessary to transmit the same message to different receivers, such as an order shipping service and a stock control service in an order processing system. This use case requires that the middleware components that route messages can transparently copy them and send them to the appropriate destinations. This in turn means that the middleware elements must be configurable to support this message duplication without introducing any dependencies on the sending or receiving logic.

Being able to transparently route messages to multiple destinations enables you to build extensible solutions. If your solution needs to incorporate new partners or services, such as an auditing service for example, you can easily include the additional destinations without modifying the applications that send messages to them.

Cross-Cutting Concerns

Message routing has a high dependency on the underlying messaging platform, and the cross-cutting concerns relating to message routing are essentially a superset of those that you must address when implementing cross-boundary communications. The following sections summarize these areas of concern.

Security

The messaging infrastructure must be robust and secure; it should prevent unauthorized applications and services from sending or receiving messages. The messages must be protected and made available only to their intended recipients, and the routing mechanism must reliably arrange for messages to be sent to the correct destination(s). If your solution copies and dispatches messages to multiple destinations, message duplication must be performed in a controlled and secure manner. The middleware implementing the router should be able to make decisions on where to send a message without needing to examine the data in the body of the message. If message confidentiality is paramount, then you should be able to encrypt the message body without losing the ability to route messages, again without the middleware requiring direct access to the data held in message bodies.

Reliability

The underlying messaging technology must not lose messages as they are transmitted. This restriction also applies to the routing technology you use.

Responsiveness and Availability

The messaging platform should not inhibit the flow of the business logic of your system. If the business logic sends messages that cannot be immediately delivered because of some transient problem in the infrastructure (such as a network failure between the messaging platform and the destination service), the business logic should be allowed to continue; the messages should be delivered transparently when the issue is resolved.

Interoperability

You may need to route messages between services built by using different technologies. The routing mechanism should be compatible with the technologies on which these services are based.

Azure Technologies for Routing Messages

The primary Azure technology that supports safe, reliable, responsive, and interoperable messaging between distributed services and applications is Service Bus Queues. A Service Bus Queue is notionally a simple first-in-first-out structure with additional features such as timeouts, transactional support, and dead-lettering, and it enables a variety of common messaging scenarios as described in Chapter 6, "Implementing Cross-Boundary Communication."

Azure extends Service Bus Queues with Service Bus Topics and Service Bus Subscriptions. These extensions enable you to incorporate a range of message routing options into your solutions. They provide the advantages exhibited by Service Bus Queues, facilitating decoupling a message sender from a receiver, but in addition they enable messages to be routed to one or more receivers by using information stored in the metadata of these messages. The following sections supply more detail on using Service Bus Topics and Subscriptions to implement message routing.

Separating the Business Logic from Message Routing Using Service Bus Topics and Subscriptions

Service Bus Topics and Subscriptions enable you to direct messages to different receivers based on application-defined criteria. While a Service Bus Topic represents the sending end of a queue, a Service Bus Subscription represents the receiving end; a sender application posts a message to a Service Bus Topic using much the same technique as when posting to a Service Bus Queue, and a receiver application waits for incoming messages by connecting to a Service Bus Subscription. A Service Bus Topic can be associated with multiple Service Bus Subscriptions.

A message is routed from a Service Bus Topic to a Service Bus Subscription by defining a filter that examines the metadata and custom properties attached to the message. The sender typically adds one or more custom properties to the metadata of the message, and this information is used to route the message to the most appropriate receiver. A filter is a predicate attached to a Service Bus Subscription, and all messages that match the predicate are directed towards that Service Bus Subscription. All subscriptions have an associated filter; if you don't specify how to filter data when you create a subscription, the default filter simply passes all messages from the topic through to the subscription. The filters that direct messages from a Topic to a Subscription are separate from the business logic of the sending and receiving applications; all a sender has to do is provide the metadata (in the form of message properties and values) that the filters can examine and use to make routing decisions. Filters enable you to define simple message routing rules that might otherwise require writing a substantial amount of code.

Guidelines for Using Service Bus Topics and Subscriptions to Route Messages

Service Bus Topics and Subscriptions are suitable for implementing simple, static routing of messages from a sender to a receiver, taking advantage of the security and reliability that Service Bus Queues provide.
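To make these mechanics concrete, the following sketch shows one way to create a filtered subscription, attach routing metadata in the sender, and receive matching messages, using the Microsoft.ServiceBus.Messaging API from the Windows Azure SDK for .NET. This is not code from the sample application; the topic path, subscription name, and connection string are illustrative placeholders.

```csharp
// Requires Microsoft.ServiceBus.dll
// (Microsoft.ServiceBus and Microsoft.ServiceBus.Messaging namespaces).
var connectionString = "<your Service Bus connection string>";

// Administrative step: create the topic and a filtered subscription.
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
if (!namespaceManager.TopicExists("ordertopic"))
{
  namespaceManager.CreateTopic("ordertopic");
}

// Only messages whose Weight property is under 100 reach this subscription.
namespaceManager.CreateSubscription("ordertopic", "lightparcels",
  new SqlFilter("Weight < 100"));

// Sending: the business logic only sets the metadata; the filter does the routing.
var topicClient = TopicClient.CreateFromConnectionString(connectionString, "ordertopic");
using (var message = new BrokeredMessage("parcel details"))
{
  message.Properties["Weight"] = 42;
  topicClient.Send(message);
}

// Receiving: connect to the subscription and process matching messages.
var subscriptionClient = SubscriptionClient.CreateFromConnectionString(
  connectionString, "ordertopic", "lightparcels");
var received = subscriptionClient.Receive(TimeSpan.FromSeconds(30));
if (received != null)
{
  // ... process the message, then remove it from the subscription ...
  received.Complete();
}
```

Note that the sender's code never mentions the subscription; adding a new subscription with a different predicate requires no change to the sending logic.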

Figure 1 depicts the message flow from a sender through a Topic and Subscriptions to the corresponding receivers. In this example, the data represents parcels being shipped, and messages are filtered by the Weight property added by the sender; Receiver A receives all messages where the value of the Weight property is less than 100, Receiver B receives messages for weights between 100 and 199, and Receiver C receives messages for all weights of 200 and over.

Figure 1
Routing messages to different receivers through a Service Bus Topic and Subscriptions

Service Bus Topics and Subscriptions expose a programmatic model through a series of APIs in the Windows Azure SDK for .NET. Like Service Bus Queues, these APIs are wrappers around a series of REST interfaces, so you can utilize Topics and Subscriptions from technologies and platforms not directly supported by the Azure SDK.

Service Bus Topics and Subscriptions enable you to address a number of scenarios with requirements beyond those that you can easily implement by using Service Bus Queues, as follows:

 Your system generates a number of messages, each of which must be handled by a specific receiver. New receivers may be added over time, but for ease of maintenance you don't want to have to modify the business logic of the sender application when this occurs. This is the primary scenario for using Service Bus Topics and Subscriptions rather than Queues.

As an example, consider an order processing system for a vendor organization, similar but not identical to that implemented by Trey Research (see Chapter 4, "Deploying Functionality and Data to the Cloud" for details of the approach that Trey Research adopted). In this scenario, customers place orders using a web application, and these orders must be fulfilled and shipped to customers. The vendor organization uses a number of transport partners to deliver goods, and the delivery partner selected depends on the location of the customer; delivery might be made by van for local customers, by train and van for more remote national customers, or by air for overseas customers. The orders web application is not actually concerned with which transport partner is used.

Instead, the web application simply posts the relevant details of each order to a Service Bus Topic, together with metadata that indicates the location of the customer. The location could actually be a code; for example, "Location A" might indicate that the customer lives within the same or a neighboring state as Trey Research, "Location B" might indicate that the customer resides elsewhere in the United States, and "Location C" could specify that the customer lives in Europe. You should factor out the logic that determines the location code of the customer from the main business logic of the application; in this way, if the algorithm that defines the various location codes changes, the business logic of the application does not have to be updated.

Each transport partner has its own Service Bus Subscription to this Topic, with a filter that examines the location metadata and directs the order message to the Subscription as appropriate. Occasionally new transport partners may be added or existing partners removed, but the business logic in the orders web application should not have to change when this happens.

Transport partners are typically third-party organizations that have their own systems for tracking and shipping orders, and these systems are unlikely to be based on Service Bus Subscriptions; many commercial transport providers provide their own custom web services for interacting with their systems. Therefore it may be necessary to construct a set of adapters that retrieve the messages for each transport partner from the appropriate Service Bus Subscription, translate the messages into the format expected by the transport partner, and then communicate with the transport partner using its own web service interface. This is the "push" model for messaging. These adapters are the responsibility of the vendor organization, and the Subscriptions used by these adapters do not have to be exposed to the world outside of the vendor organization, so the authentication and authorization requirements can be handled directly by using Service Bus security and ACS.

Alternatively, if a transport partner does not publish a programmatic interface for interacting with its systems but is willing to host the logic for connecting to the Service Bus Subscription on its own premises, it can connect directly to the Service Bus Subscription, retrieve and reformat messages into a style compatible with its own internal tracking and shipping processes, and then invoke these processes. This is the "pull" model for messaging. In this case, the vendor organization must make the Service Bus Subscription accessible to the transport partner, and this may necessitate implementing federated security across the Service Bus and the transport partner's own security domain.

Figure 2 shows the architecture of a possible solution. The transport partners for locations A and C expose functionality as a set of web services, so the vendor organization uses adapters to communicate with them. The logic for communicating with Service Bus can be factored out into a separate connector component to provide ease of maintenance, should the communication mechanism with the vendor organization change in the future. The transport partner for location B does not publish a public interface for its services, but instead implements its own connector logic for pulling messages from the Service Bus Subscription.

The sample application provided with this guide implements a slightly different approach to that described here, as the choice of transport partner is more limited. For more information, see the section "How Trey Research Routes Messages across Boundaries" later in this chapter.
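Under the assumptions of this scenario, the per-partner subscriptions might be created as shown below. The topic path, subscription names, and the Location property name are illustrative, not taken from the sample application; each filter examines the location metadata that the orders web application attaches to the message.

```csharp
// Requires Microsoft.ServiceBus.dll.
var connectionString = "<your Service Bus connection string>";
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

// One subscription per transport partner, keyed on the customer location code.
namespaceManager.CreateSubscription("orderstopic", "localpartner",
  new SqlFilter("Location = 'LocationA'"));
namespaceManager.CreateSubscription("orderstopic", "nationalpartner",
  new SqlFilter("Location = 'LocationB'"));
namespaceManager.CreateSubscription("orderstopic", "overseaspartner",
  new SqlFilter("Location = 'LocationC'"));

// The web application's only routing responsibility is to set the property.
var topicClient = TopicClient.CreateFromConnectionString(connectionString, "orderstopic");
using (var orderMessage = new BrokeredMessage("order details"))
{
  orderMessage.Properties["Location"] = "LocationB";
  topicClient.Send(orderMessage);
}
```

Adding or removing a transport partner then becomes an administrative operation on the subscriptions, with no change to the web application's business logic.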

Figure 2
Decoupling a sender application from the message routing logic using a Service Bus Topic and Subscriptions

 Your system generates a variety of messages. Some of these messages are high priority and must be processed as soon as possible, others are less urgent and can be handled at some later convenient time, while still further messages have a limited lifetime; if they are not handled within a specified timeframe they should simply be discarded.

Service Bus Topics and Subscriptions provide a mechanism to implement a "priority queue". When a sender posts a message, it can tag it with a Priority property, and you can define subscriptions that filter messages by this property. Urgent messages can be handled by a subscription with a pool of message receivers; you can monitor the queue length of this subscription, and if it exceeds some predefined length you can start new listeners. This technique is similar to the "fan-out" architecture for queues handling a sudden influx of messages described in the section "Guidelines for Using Service Bus Queues" in Chapter 6. Lower priority messages can be handled by a subscription with a fixed number of receivers, implementing a "load-leveling" scheme as described in Chapter 6. Messages with a limited lifetime can be handled by using a subscription with a short value for the DefaultMessageTimeToLive property; if a message expires before it is received it will automatically be discarded. Additionally, if no trace of the message is required after it has expired, you can set the EnableDeadLetteringOnMessageExpiration property of the subscription to false.

Figure 3 shows the structure of this system. In this example, messages marked as Critical are high priority and must be processed immediately, messages marked as Important must be processed soon (ideally within the next 10 minutes), while messages marked as Information are non-urgent; if they are not handled within the next 20 minutes the data that they contain will be redundant and they can be discarded.
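A priority-queue arrangement of this kind might be sketched as follows (the topic path, subscription names, and Priority property are illustrative assumptions). Note how a limited-lifetime subscription combines a short DefaultMessageTimeToLive with dead-lettering disabled, so that expired messages vanish without trace:

```csharp
// Requires Microsoft.ServiceBus.dll.
var connectionString = "<your Service Bus connection string>";
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

// High-priority and lower-priority messages are routed by the Priority property.
namespaceManager.CreateSubscription("prioritytopic", "critical",
  new SqlFilter("Priority = 'Critical'"));
namespaceManager.CreateSubscription("prioritytopic", "important",
  new SqlFilter("Priority = 'Important'"));

// Information messages are silently discarded, without dead-lettering,
// if they are not retrieved within 20 minutes.
var informationSubscription = new SubscriptionDescription("prioritytopic", "information")
{
  DefaultMessageTimeToLive = TimeSpan.FromMinutes(20),
  EnableDeadLetteringOnMessageExpiration = false
};
namespaceManager.CreateSubscription(informationSubscription,
  new SqlFilter("Priority = 'Information'"));
```

An administrator can then add receivers to the critical subscription as its backlog grows, without the senders being aware of the change.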

Figure 3
Prioritizing messages by using a Service Bus Topic and Subscriptions

 Senders posting messages expect a response to this message from the receiver. The section "Guidelines for Using Service Bus Queues" in Chapter 6 describes how a sender can post a message to a queue, receive a response on another queue, and correlate this response with the original message by using the CorrelationId property of the message; the sender specifies the queue on which to send the response in the ReplyTo property of the message, and the receiver populates the CorrelationId of the response message with a copy of the MessageId from the original request message. This approach is very straightforward and suitable for a reasonably small and static set of senders, but it does not scale well and can become unwieldy if the number of senders changes quickly. This is because each sender requires its own Service Bus Queue; these queues take time to construct, each queue has an associated monetary cost, and ideally if a queue is no longer required it should be removed. The number of senders can vary significantly over time.

Service Bus Subscriptions support the notion of Subscription Correlation, which provides a better variation on this approach in a dynamic environment. This mechanism enables an application to connect to a topic and create a subscription that filters messages based on the CorrelationId property; Service Bus Subscriptions provide the CorrelationFilter filter specifically for this purpose. To implement subscription correlation, you perform the following tasks:

◦ The sender creates a message and populates the MessageId property with a unique value.

◦ The sender connects to the Service Bus Subscription on which it expects to receive a response, and adds a CorrelationFilter to this subscription specifying the MessageId property of the original message.

◦ The sender posts the message to a Service Bus Topic on which one or more receivers have subscriptions.

◦ A receiver retrieves the message, based on any filtering applied to the subscription, and then processes the message.

◦ The receiver creates a response message and populates the CorrelationId property with a copy of the value in the MessageId property of the original message.

◦ The receiver posts the response message to the Service Bus Topic shared by all sender applications. All senders share this same Topic, but the filter ensures that each sender only receives the responses to the messages that it sent.

◦ When a message with a value in the CorrelationId property that matches the original MessageId appears in the Topic, the CorrelationFilter for the appropriate subscription ensures that it is passed to the correct sender.

The CorrelationFilter filter has been designed specifically for this scenario, and it provides an extremely efficient mechanism for filtering messages. In contrast, although a SqlFilter filter is more flexible, it has to undergo lexicographical analysis when it is created, and it has greater runtime costs.

A single subscription can have more than one associated filter. A message passes through to a subscriber as long as one filter expression matches the message properties; however, if more than one expression matches, then the same message will appear multiple times, once for each match.

Figure 4 shows the structure of this solution.
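The subscription-correlation steps above can be sketched in code as follows. The topic paths, subscription name, and client objects are illustrative assumptions, not taken from the guide's sample code:

```csharp
// Requires Microsoft.ServiceBus.dll.
var connectionString = "<your Service Bus connection string>";
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

// Sender side: create a subscription on the shared response topic that
// only matches responses correlated with this request.
var requestId = Guid.NewGuid().ToString();
namespaceManager.CreateSubscription("responsetopic", "sender1responses",
  new CorrelationFilter(requestId));

var requestClient = TopicClient.CreateFromConnectionString(connectionString, "requesttopic");
using (var request = new BrokeredMessage("request body"))
{
  request.MessageId = requestId;
  requestClient.Send(request);
}

// Receiver side: copy the MessageId into the response's CorrelationId so that
// the CorrelationFilter routes the response back to the correct sender.
var received = requestSubscriptionClient.Receive();
var responseClient = TopicClient.CreateFromConnectionString(connectionString, "responsetopic");
using (var response = new BrokeredMessage("response body"))
{
  response.CorrelationId = received.MessageId;
  responseClient.Send(response);
}
received.Complete();

// Sender side: wait for the correlated response on its own subscription.
var senderResponses = SubscriptionClient.CreateFromConnectionString(
  connectionString, "responsetopic", "sender1responses");
var reply = senderResponses.Receive(TimeSpan.FromSeconds(30));
```

In a real system the sender should also remove its subscription when it is no longer needed, since the number of senders, and therefore subscriptions, can vary over time.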

the receivers compete for . Service Bus Topics and Subscriptions provide a useful mechanism for partitioning the messageprocessing workload. but you have found that you need more control over which receivers process which messages. Additionally. multiple receivers can listen to the same queue. you need to direct messages to receivers running on specific servers. you previously decided to implement a scale-out mechanism by using Service Bus Queues. You can define a set of mutually exclusive filters that cover different ranges of the values in the message properties and direct each range to a different Subscription. messages marked as Warehouse B are handled by a receiver running on a server close to warehouse B. Service Bus Queues enable you to implement a simple load-balancing mechanism for processing messages. and so on. and the result is an approximate round-robin distribution of messages. As described in the section “Guidelines for Using Service Bus Queues” in Chapter 6. you may require more control over which receiver handles which messages. based on one or more properties of each message. your system may require that all messages with the Warehouse property set to A are processed by a receiver running on a server physically close to warehouse A. For example. or they can be implemented as worker roles in the cloud. large volume of messages. which the round-robin approach does not provide. To maintain performance. However. The various receivers listening to these Subscriptions can run on specific servers on-premises.Figure 4 Using Subscription Correlation to deliver response messages to a sender  Your system handles a continuous. In this case. each Subscription can have multiple receivers.

Warehouse B expects more traffic than warehouse A, so messages for warehouse B are handled by multiple receivers, all running on hardware located locally to warehouse B. Figure 5 shows an example structure for the warehouse system. The receivers are all built using the Azure SDK and connect directly to the various Service Bus Subscriptions.

Figure 5
Scaling out by using a Service Bus Topic and Subscriptions

If you originally built the sender application by using the Send method of a MessageSender object to post messages to a Service Bus Queue, you do not have to modify this part of the code, because you can use the same method to post messages to a Topic. All you need to do is create and populate the appropriate message properties required by the Subscription filter before sending them to the Topic. To receive messages from a Service Bus Subscription, use a SubscriptionClient object.

 Messages received from a Subscription may need to be forwarded to a final destination for additional processing, depending on system-defined criteria. This forwarding mechanism should be transparent to the sender, as the forwarding criteria and final destinations may change; the receiver applications should also be decoupled from any changes to these criteria and final destinations.

Consider the example of an order processing system where a web application sends orders to a receiver through a Service Bus Topic. All orders must be logged, but if a customer sends an order with a value greater than some threshold figure, the order must also be subjected to additional scrutiny by a separate process, implementing various rules specified by the organization's order-handling policy; for example, the auditor might choose to cancel the order if the customer does not meet certain requirements. The threshold value may change, but the business logic for the receiver needs to be insulated from this change.
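Returning to the warehouse example, the mutually exclusive filters and competing receivers described earlier might be sketched as follows (topic path, subscription names, and the Warehouse property are illustrative assumptions):

```csharp
// Requires Microsoft.ServiceBus.dll.
var connectionString = "<your Service Bus connection string>";
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

// Mutually exclusive filters: every message goes to exactly one subscription.
namespaceManager.CreateSubscription("warehousetopic", "warehouseA",
  new SqlFilter("Warehouse = 'A'"));
namespaceManager.CreateSubscription("warehousetopic", "warehouseB",
  new SqlFilter("Warehouse = 'B'"));

// Two receivers listening on the warehouse B subscription compete for its
// messages, echoing the round-robin load balancing used with a plain queue.
var receiver1 = SubscriptionClient.CreateFromConnectionString(
  connectionString, "warehousetopic", "warehouseB");
var receiver2 = SubscriptionClient.CreateFromConnectionString(
  connectionString, "warehousetopic", "warehouseB");
```

The sender is unaware of this partitioning; it simply sets the Warehouse property on each message before posting it to the topic.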

You can accomplish this decoupling by using a filter rule action. A filter rule action can change, add, or remove a message property when the message is retrieved from a Service Bus Subscription; this action is performed transparently as the message is retrieved by the receiver.

In the order processing example, the sender adds the total cost of the order as a property to the initial order message, together with other properties that are used to route the message to a receiver (labeled "Forwarding Receiver"). A filter is applied that segregates the messages according to the order cost and adds a property called PriceRange to each message. The value of the PriceRange property is set to Low, Medium, or High according to the cost; for example, a cost below 100 is Low, a cost between 100 and 500 is Medium, and a cost greater than 500 is High.

The receiver retrieves the message and performs whatever processing is required; the receiver can create a copy of the message to perform its own processing. At the same time, it posts a copy of the received message, which now has a PriceRange property appended, to the Service Bus Topic that the Auditing Receivers subscribe to; this topic routes the message on to the appropriate destination based on the updated property set. The Auditing Receivers' subscriptions filter the messages by the PriceRange property to route them to the receiver that performs the appropriate operations: if the PriceRange is Low the order is simply logged, if the PriceRange is Medium the order is logged and the details are printed for later scrutiny by an auditor, and if the PriceRange is High the order is logged and the details are emailed directly to an auditor for immediate examination. Figure 6 shows a possible structure for the order processing example.

Figure 6
Forward-routing messages by using a filter rule action

Limitations of Using Service Bus Topics and Subscriptions to Route Messages

Service Bus Topics and Subscriptions only implement a simple routing mechanism. Most commonly, you define filters by using the SqlFilter class. The conditions specified in these filters are limited to a subset of SQL92 syntax; you can perform direct comparisons of data and values by using common arithmetic and logical operators, but these filters do not support functions; for example, there is no Substring function. For security reasons, the filters that you define cannot access the body of a message, so they can only make decisions based on data exposed in message properties. If you require routing based on the data held in the body of a message, you must implement this logic in your own code by creating a receiver that examines the data of a message and then reposts it to another queue or topic as necessary.
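A filter rule action of the kind described above can be sketched with a RuleDescription that pairs a SqlFilter with a SqlRuleAction; the action appends the PriceRange property transparently as each matching message is delivered to the subscription. The topic path, subscription name, and property names here are illustrative assumptions:

```csharp
// Requires Microsoft.ServiceBus.dll.
var connectionString = "<your Service Bus connection string>";
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

// Messages with a total cost below 100 are tagged as PriceRange = 'Low'
// as they are routed into this subscription.
var lowValueRule = new RuleDescription
{
  Filter = new SqlFilter("TotalCost < 100"),
  Action = new SqlRuleAction("SET PriceRange = 'Low'")
};
namespaceManager.CreateSubscription("auditingtopic", "lowvalueorders", lowValueRule);

// An auditing receiver can then read the appended property.
var auditClient = SubscriptionClient.CreateFromConnectionString(
  connectionString, "auditingtopic", "lowvalueorders");
var message = auditClient.Receive(TimeSpan.FromSeconds(30));
if (message != null)
{
  var priceRange = (string)message.Properties["PriceRange"];
  // ... log, print, or email the order according to priceRange ...
  message.Complete();
}
```

Similar rules with "SET PriceRange = 'Medium'" and "SET PriceRange = 'High'" would cover the other cost bands, keeping the threshold values out of both the sender and the receiver code.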
so they can only make decisions based on data exposed in message properties. there is no Substring function. For optimization purposes. to the Service Bus Topic that the Auditing Receivers subscribe to. You can perform direct comparisons of data and values by using common arithmetic and logical operators. The receiver can create a copy of the message to perform its own processing. and repost the received message to another topic that routes the message on to the appropriate destination based on the updated property set. which now has a PriceRange property appended. for example. and cost greater than 500 is High. it posts a copy of the received message. Most commonly. At the same time.
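As an illustration of the kind of predicate such filters can evaluate, the sketch below pairs SQL-92-style filter expressions for the PriceRange example with an equivalent classification function a sender could use when stamping the property. The exact expression strings and the treatment of the boundary values (a cost of exactly 100 or 500) are assumptions; the example in the text does not specify them.

```csharp
using System;

public static class OrderFilters
{
    // Expressions of the kind you might pass to a SqlFilter constructor,
    // one per Auditing Receiver subscription. Illustrative strings only.
    public const string LowFilter = "PriceRange = 'Low'";
    public const string MediumFilter = "PriceRange = 'Medium'";
    public const string HighFilter = "PriceRange = 'High'";

    // The same banding expressed in code: below 100 is Low, 100-500 is
    // Medium, above 500 is High (boundary handling assumed).
    public static string MapPriceRange(decimal cost)
    {
        if (cost < 100m) return "Low";
        if (cost <= 500m) return "Medium";
        return "High";
    }
}
```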

For more information about the types of expressions supported by the SqlFilter class, see the topic "SqlFilter.SqlExpression Property" on MSDN at http://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.sqlfilter.sqlexpression.aspx.

Routing Messages to Multiple Destinations Using Service Bus Topics and Subscriptions

The previous section described using filters that partition messages into distinct, non-overlapping groups and direct each group to a Service Bus Subscription. Services can use subscriptions to retrieve their intended messages, and each message is sent exclusively to a single subscription. The converse situation is also true; if all subscriptions have filters that fail to match the properties for a message, it will remain queued on the Service Bus Topic until it expires.

However, it is also possible for different subscriptions to have filters with overlapping predicates. In this case, a copy of the same message is routed to each matching subscription. This mechanism provides a means for routing messages to multiple destinations.

The Windows Azure SDK provides the TrueFilter type specifically for this purpose. This filter matches all messages, and any subscription that utilizes this filter will automatically be fed with a replica of every message sent to the topic. The TrueFilter is the default filter for a subscription; if you don't specify a filter when you create a subscription, a TrueFilter is applied automatically. This is an example of the most general pattern for posting messages to multiple destinations. Figure 7 shows an example system that uses a TrueFilter to copy messages to an audit log for later examination.

Chapter 8, "Replicating, Distributing, and Synchronizing Data" includes some additional patterns for using Service Bus Topics and Subscriptions to query and update data in a system that uses replicated data sources.

Guidelines for Using Service Bus Topics and Subscriptions to Route Messages to Multiple Destinations

Filters with overlapping predicates enable a number of powerful scenarios. The following list describes some common cases:

 Your system enables sender applications to post requests to services, but all of these requests must be logged for auditing or diagnostic purposes. This logging must be transparent to the sender applications; all messages must additionally be posted to a logging service so that they can be recorded and stored.

Figure 7
Logging messages by using a TrueFilter

The Audit Log Receiver is simply an example of an application that may benefit from this approach. Any functionality that requires a copy of the messages that pass through your system can be implemented in a similar way. For example, you could implement an application that measures the number of messages flowing into your system over a given period of time and displays the results, giving an indication of the message throughput and performance of your solution.

Of course, you can also be more selective. Rather than using a TrueFilter, you can define an overlapping SqlFilter that captures messages based on property values, and these messages will be routed to the destination receivers expecting to process these messages as well as to the Audit Log Receiver application.

 Your system raises a number of business events, and your system must be able to add or remove processes that can handle these events without impacting the business logic of your system. Processes that trigger events do so to inform interested parties that something significant has happened; each event may be handled by zero or more processes. The event handlers may be running remotely from the processes that raise the events, and the processes that listen for events are inherently asynchronous and are decoupled from the processes that raise events. Using Service Bus Topics and Subscriptions provides an excellent basis for building such a system, especially given the requirement that the event handlers may be located anywhere and events should be delivered reliably.
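The fan-out behavior described above can be sketched without the Service Bus SDK. The tiny in-memory model below routes a copy of each message to every subscription whose predicate matches, with a match-all predicate standing in for the TrueFilter. It is purely illustrative; real filters are declared on the service and evaluated by Service Bus itself, not in client code.

```csharp
using System;
using System.Collections.Generic;

// A minimal in-memory stand-in for a Topic with overlapping subscription
// filters. Each subscription receives its own copy of every matching message.
public class MiniTopic
{
    private readonly List<(string Name,
        Func<IDictionary<string, object>, bool> Filter,
        List<IDictionary<string, object>> Inbox)> subscriptions = new();

    public List<IDictionary<string, object>> AddSubscription(
        string name, Func<IDictionary<string, object>, bool> filter)
    {
        var inbox = new List<IDictionary<string, object>>();
        this.subscriptions.Add((name, filter, inbox));
        return inbox;
    }

    public void Send(IDictionary<string, object> properties)
    {
        foreach (var sub in this.subscriptions)
        {
            if (sub.Filter(properties))
            {
                // Copy, so each subscription gets its own replica.
                sub.Inbox.Add(new Dictionary<string, object>(properties));
            }
        }
    }
}
```

An audit subscription created with the predicate `_ => true` (the TrueFilter analogue) receives every message, while a subscription with a property predicate receives only its own share; both can match the same message.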

In messaging terms, an application can notify interested parties of an event simply by posting a message that contains the event data to a Service Bus Topic. The topic on which a sender application posts event messages can have the DefaultMessageTimeToLive property set appropriately, so that if no applications subscribe to the event, it will be discarded when this period expires. Each application that is interested in an event can create its own subscription where the filter specifies the conditions for the messages that constitute the event. Do not attempt to share the same event subscription between two separate applications if they must both be notified of the event; they will compete for event messages routed to the subscription, and each message will only be passed to one of the applications.

Figure 8 shows an example from a basic system controlling the assembly line in a manufacturing plant. When production is due to start, the Controller application posts a "Start Machinery" message to a Service Bus Topic. Each machine involved in the assembly process is driven by software that listens for this message, and when it occurs the software driver starts the machine. Similarly, when production is halted, the Controller application posts a "Stop Machinery" message, and the software drivers for each machine shut it down in a controlled manner. The Controller application has no knowledge of the machinery involved in the production line, and hardware can be added or removed without requiring the Controller application to be modified.

Figure 8
Controlling a production line by using events based on Service Bus Subscriptions

For a detailed example and more information about using Service Bus to communicate between roles and applications, see "How to Simplify & Scale Inter-Role Communication Using Windows Azure Service Bus" at http://windowsazurecat.com.

Limitations of Using Service Bus Topics and Subscriptions to Route Messages to Multiple Destinations

It is important to understand that, while Service Bus Topics and Subscriptions can provide reliable delivery of messages to one or more destinations, this delivery is not instantaneous. As with Service Bus Queues, Topics and Subscriptions reside in the cloud, and there will inevitably be some delay caused by the network latency associated with the Internet, even if these components are running at the same site or on the same computer. Additionally, filters defined by using the SqlFilter type are subject to runtime evaluation, and the properties attached to each message must be examined and compared against every filter associated with each subscription. If a topic has a large number of subscriptions (a topic can have up to 2000 subscriptions in the current release of Service Bus), then this evaluation and examination may take some time to perform.

For these reasons, you should not depend on topics and subscriptions if you require real-time notification of events or messages passing between components. If you need more immediate delivery in these situations, you should use an alternative technology such as Microsoft Message Queuing (for on-premises applications) or Azure Queues (for web and worker roles).

Security Guidelines for Using Service Bus Topics and Subscriptions

Service Bus Topics and Subscriptions are subject to the same security mechanism as Service Bus Queues. When an application connects to a Service Bus namespace Topic or Subscription, it must authenticate with ACS and provide the identity used by the application. You configure security, create identities, and associate privileges with these identities by using ACS. You can grant the privileges associated with the Manage and Send claims to a Topic, and the privileges associated with the Manage and Listen claims to a Subscription. Additionally, all communications with Service Bus Topics and Subscriptions occur over a TCP channel and are automatically protected by using SSL.

See Chapter 5, "Authenticating Users and Authorizing Requests" for more information about using ACS. See Chapter 6 for further details about connecting to a Service Bus namespace and providing identity information.

How Trey Research Routes Messages across Boundaries

The Trey Research example application that is available in conjunction with this guide demonstrates many of the scenarios and implementations described in this chapter. You can download the example code by following the links at http://wag.codeplex.com. If you have not already done so, now would be a good time to download the example so that you can examine the code it contains as you read through this section of the guide.

This section of the chapter describes:

 How Trey Research uses a custom class that wraps the methods for sending messages to Service Bus Topics.
 How Trey Research implemented a custom retry mechanism to provide reliability when sending messages to a Service Bus Topic.
 How Trey Research implemented rudimentary business logic and message routing using Filters with a Service Bus Topic.
 How Trey Research uses a custom class to create receivers that subscribe to Service Bus Topics.

Sending Messages to a Service Bus Topic

The section "Implementing Responsive Messaging" in Chapter 6, "Implementing Cross-Boundary Communication," describes the fundamental techniques Trey Research uses to send messages to a Service Bus Queue. This involves the custom classes ServiceBusQueueDescription and ServiceBusQueue that the developers at Trey Research created to simplify development and promote code reuse.

A Topic is effectively a specialized type of queue, and has similar properties; the Service Bus methods used to send messages to a Topic are very similar to those used to send messages to a Queue. However, to manage the minor differences in these methods, the developers at Trey Research created two custom classes specially targeted at making it easier to use Service Bus Topics. These classes are ServiceBusTopicDescription and ServiceBusTopic, and are located in the Communication folder of the Orders.Shared project. The only difference between the ServiceBusQueueDescription class and the ServiceBusTopicDescription class is the name of one of the properties.

The ServiceBusQueue class and the ServiceBusTopic class differ in several ways:

 Unlike the ServiceBusQueue class, the ServiceBusTopic class does not instantiate a receiver. Clients will subscribe to a Topic when required by creating a suitable receiver. This removes coupling between the sender and receiver, and allows different clients to subscribe and receive messages without needing to reconfigure the Topic. It also separates the business logic from the message routing concerns.

 As described previously, the ServiceBusQueue class uses the Enterprise Library Transient Fault Handling Application Block to execute the task of sending messages. In contrast, the ServiceBusTopic class uses the Enterprise Library Transient Fault Handling Application Block to detect transient errors when posting a message to a topic and transparently retry the Send operation when appropriate, instead of using the static Task.Factory method directly to execute the task asynchronously.

 The Send method of the ServiceBusTopic class accepts two Action instances that it executes when the message has been sent, or when there is an error. The ServiceBusQueue class sends the message and only logs and raises an exception if sending the message fails. In contrast, the two Action methods (one to execute after a message is sent and one to execute if sending fails) are passed to the Send method of the ServiceBusTopic class as parameters, together with the message itself and the object state.

The following code shows the outline of the Send method in the ServiceBusTopic class.

C#
public void Send(BrokeredMessage message,
                 object objectState,
                 Action<object> afterSendComplete,
                 Action<Exception, object> processError)
{
  ...
}
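The shape of this callback-based Send pattern can be sketched in isolation. The class below is a simplified stand-in, with a plain delegate in place of the Service Bus send operation and no retry policy; the names mirror the text, but the implementation is illustrative only.

```csharp
using System;
using System.Threading.Tasks;

// A simplified sketch of a Send method that reports its outcome through
// two Action callbacks instead of return values or raised exceptions.
public class TopicSender<TMessage>
{
    private readonly Action<TMessage> transmit;

    public TopicSender(Action<TMessage> transmit) => this.transmit = transmit;

    public Task Send(TMessage message, object objectState,
        Action<object> afterSendComplete,
        Action<Exception, object> processError)
    {
        return Task.Run(() =>
        {
            try
            {
                this.transmit(message);       // would be the real send call
                afterSendComplete(objectState);
            }
            catch (Exception e)
            {
                processError(e, objectState); // caller decides how to react
            }
        });
    }
}
```

Passing the state object through to both callbacks is what lets the caller correlate the outcome with a particular order, as the Trey Research code does later in this chapter.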

The Send method uses the Transient Fault Handling Block. The constructor for the ServiceBusTopic class initializes the block, loads the default policy for Azure Service Bus connectivity, and sets up a handler for the Retrying event that writes information to Windows Azure diagnostics.

C#
this.serviceBusRetryPolicy = RetryPolicyFactory
    .GetDefaultAzureServiceBusRetryPolicy();
this.serviceBusRetryPolicy.Retrying += (sender, args) =>
    TraceHelper.TraceWarning(
        "Retry - Count:{0}, Delay:{1}, Exception:{2}",
        args.CurrentRetryCount,
        args.Delay,
        args.LastException);

The Send method then calls one of the asynchronous overloads of the ExecuteAction method of the Transient Fault Handling Block and passes in the required parameters. These parameters are the asynchronous start and end methods, an action to execute after the process completes successfully, and an action to execute if it fails after the specified number of retries.

C#
this.serviceBusRetryPolicy.ExecuteAction(
    ac => this.sender.BeginSend(message, ac, objectState),
    ar => this.sender.EndSend(ar),
    () =>
    {
      try
      {
        afterSendComplete(objectState);
      }
      finally
      {
        message.Dispose();
      }
    },
    e => processError(e, objectState));

The Retry Mechanism for Processing Orders

Chapter 4, "Deploying Functionality and Data to the Cloud," provides a high-level overview of the way that Trey Research implemented a retry mechanism to provide reliable processing for orders, and messaging between the Orders application, transport partners, and the on-premises audit log. Figure 9 is the overview of this mechanism you saw in Chapter 4, which was described in Chapters 4 and 5.
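The Transient Fault Handling Block does the heavy lifting here. Conceptually, its synchronous ExecuteAction boils down to a loop like the following sketch, a deliberate simplification that ignores the block's real policy objects, transient-error detection strategies, and growing backoff curves.

```csharp
using System;
using System.Threading;

public static class SimpleRetry
{
    // A conceptual stand-in for a retry policy's ExecuteAction: retry the
    // operation when 'isTransient' says the failure is worth retrying,
    // up to 'maxRetries' additional attempts, waiting 'delay' in between.
    public static T ExecuteAction<T>(Func<T> action,
        Func<Exception, bool> isTransient, int maxRetries, TimeSpan delay)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception e) when (attempt < maxRetries && isTransient(e))
            {
                Thread.Sleep(delay); // real policies use a growing backoff
            }
        }
    }
}
```

A non-transient failure, or a transient one that persists past the retry budget, propagates to the caller exactly as an unwrapped call would.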

Figure 9
A high-level overview of the retry mechanism

The implementation used here is reasonably generic, although it is tailored to the requirements of the Orders application. You should be able to easily adapt the approach described to work in most scenarios and meet your own specific requirements. To understand how this mechanism is implemented, it is necessary to examine the database tables and the code it uses in more detail.

The Retry Mechanism Database Tables

The custom message sending and retry mechanism uses three tables in the database that relate to storing and processing orders placed by customers. The first of these, the Order table, contains a summary of the customer's order without the individual order detail lines. The second table, OrderProcessStatus, contains related rows that the retry mechanism uses to control the overall process. It has the ConcurrencyMode property set to Fixed and the StoreGeneratedPatterns property set to Computed. The following table describes the columns in the OrderProcessStatus table.

OrderID - The foreign key that links the row to the related row in the Order table.
ProcessStatus - A value used internally by the Worker Role that indicates the status of the process. Possible values are "pending process", "processed", and "error".
LockedBy - The ID of the Worker Role that is processing the order (obtained from RoleEnvironment.CurrentRoleInstance.Id), or NULL if the order is not currently being processed.
LockedUntil - The UTC date and time when the current processing of the order will time out, or NULL if the order is not currently being processed.
Version - A value used to support optimistic locking by Entity Framework (EF) for rows in this table. It is automatically updated by EF when rows are inserted or updated.
RetryCount - The number of times that processing of the order has been attempted so far. It is incremented each time a Worker Role attempts to process the order. After a number of failed attempts, the Worker Role moves any existing messages into the dead-letter queue and raises an exception. Note that the sample application provided with the guide does not implement the RetryCount mechanism.
BatchId - A GUID that identifies this batch of orders being processed. Each time a Worker Role locks a batch of orders that it will start processing, it assigns the same BatchId to all of these so that it can extract them after the lock is applied in the database.

The third table, OrderStatus, can contain one or more related rows for each order. Each row indicates the most recent publicly visible status of the order as displayed in the website page when customers view the list of their current orders. To maintain history information, existing rows are not updated; new rows are added with a timestamp. The following table describes the columns in the OrderStatus table.

OrderID - The foreign key that links the row to the related row in the Order table.
Status - A value that indicates the most recent publicly visible status of the order. Possible values are "Order placed", "Order sent to transport partner", "Order received" (by transport partner), and "Order shipped" (when the goods have been delivered).
Timestamp - The UTC date and time when this change to the status of the order occurred.

The Retry Mechanism NewOrderJob Class

When it starts, a Worker Role creates an instance of the two classes it uses to interact with the Order database. The following method in the WorkerRole class (defined in the Orders.Workers project) is executed from the class constructor.

C#
private IEnumerable<IJob> CreateJobProcessors()
{
  return new IJob[]
  {
    new NewOrderJob(),
    new StatusUpdateJob(),
  };
}

The first of these job classes, NewOrderJob, implements the part of the custom retry mechanism (described later in this chapter) that sends messages to the Service Bus Topic and stores the results of the send process, using ancillary classes including the ServiceBusTopic class you saw in the previous section of this chapter. The StatusUpdateJob class that is started by the Worker Role forms another part of the custom retry mechanism; it receives acknowledgement messages from transport partners and updates the status in the database. The StatusUpdateJob class is described in the section "Receiving an Order Status Message in the Worker Role" in Chapter 6, "Implementing Cross-Boundary Communication."

Figure 10 shows a high-level view of the tasks that the NewOrderJob accomplishes and the ancillary classes it uses.

Figure 10
The processing stages in the NewOrderJob class

The NewOrderJob class first creates an instance of the ServiceBusTopicDescription class and populates it with the required values for the target Topic, then uses this to create an instance of the ServiceBusTopic class named newOrderMessageSender.

C#
var serviceBusTopicDescription = new ServiceBusTopicDescription
{
  Namespace = this.serviceBusNamespace,
  TopicName = topicName,
  Issuer = issuer,
  DefaultKey = defaultKey
};

this.newOrderMessageSender
    = new ServiceBusTopic(serviceBusTopicDescription);

The NewOrderJob class can now use the ServiceBusTopic instance to send the message. Before then, however, it must accomplish several other tasks connected with the retry mechanism. The following sections of this chapter describe the tasks shown earlier in Figure 10.

The Worker Role monitors for tasks that have failed, and automatically restarts them. For details of how this is achieved, see the Run method of the WorkerRole class.

Locking Orders for Processing

After obtaining a reference to a ServiceBusTopic instance, the NewOrderJob class locks a batch of rows in the Order table that it will process, and then gets this batch of rows as a collection of OrderProcessStatus objects by calling methods in the ProcessStatusStore class (located in the DataStores folder of the Orders.Workers project).

C#
var batchId = this.processStatusStore.LockOrders(
    RoleEnvironment.CurrentRoleInstance.Id);

var ordersToProcess = this.processStatusStore.GetLockedOrders(
    RoleEnvironment.CurrentRoleInstance.Id, batchId);

The retry mechanism must "lock" orders that will be processed by this Worker Role so that other Worker Role instances cannot attempt to process the same orders. However, the code does not use database table locks; it prevents more than one Worker Role from processing the same orders concurrently by setting and then reading values in the OrderProcessStatus table.

The LockOrders method in the ProcessStatusStore class locks the orders by executing a SQL statement that sets the values in the OrderProcessStatus table rows. It assigns the current worker role instance number and the batch ID to each one that has not already been processed and is not locked by another Worker Role instance.

C#
public Guid LockOrders(string roleInstanceId)
{
  using (var database = TreyResearchModelFactory.CreateContext())
  {
    var batchId = Guid.NewGuid();
    var commandText
        = "UPDATE TOP(32) OrderProcessStatus "
        + "SET LockedBy = {0}, LockedUntil = {1}, "
        + "BatchId = {2} "
        + "WHERE ProcessStatus != 'processed' "
        + "AND (LockedUntil < {3} OR LockedBy IS NULL)";

    this.sqlCommandRetryPolicy.ExecuteAction(
        () => database.ExecuteStoreCommand(
            commandText,
            roleInstanceId,
            DateTime.UtcNow.AddSeconds(320),
            batchId,
            DateTime.UtcNow));

    return batchId;
  }
}

This two-stage approach means that only a maximum of 32 orders that are waiting to be processed will be locked and returned.
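The effect of that UPDATE statement can be modeled in memory: claim up to 32 rows that are not processed and not currently locked by a live worker, stamping each with the worker's ID, a fresh batch ID, and an expiry time. The sketch below illustrates only the locking rule, with hypothetical row and field names; in the real system this all happens inside SQL Azure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StatusRow
{
    public string ProcessStatus = "pending process";
    public string LockedBy;
    public DateTime? LockedUntil;
    public Guid? BatchId;
}

public static class LockSimulator
{
    // Mirrors the UPDATE TOP(32) ... WHERE clause: skip processed rows and
    // rows still locked by another live worker; claim everything else.
    public static Guid LockOrders(IEnumerable<StatusRow> rows,
        string roleInstanceId, DateTime utcNow)
    {
        var batchId = Guid.NewGuid();
        var candidates = rows
            // A null LockedUntil fails the '<' comparison, matching the
            // SQL NULL semantics; such rows need LockedBy == null to match.
            .Where(r => r.ProcessStatus != "processed"
                && (r.LockedUntil < utcNow || r.LockedBy == null))
            .Take(32);
        foreach (var row in candidates)
        {
            row.LockedBy = roleInstanceId;
            row.LockedUntil = utcNow.AddSeconds(320);
            row.BatchId = batchId;
        }
        return batchId;
    }
}
```

Because an expired LockedUntil also matches, a batch abandoned by a crashed worker is automatically re-claimable once its lock times out.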

The GetLockedOrders method in the ProcessStatusStore class then queries the OrderProcessStatus table to get a collection of the rows that were successfully locked by the LockOrders method.

C#
public IEnumerable<Models.OrderProcessStatus>
    GetLockedOrders(string roleInstanceId, Guid batchId)
{
  using (var database = TreyResearchModelFactory.CreateContext())
  {
    return this.sqlCommandRetryPolicy.ExecuteAction(
        () => database.OrderProcessStatus
            .Where(o => o.LockedBy.Equals(roleInstanceId,
                            StringComparison.OrdinalIgnoreCase)
                        && o.BatchId == batchId)
            .Select(op => new Models.OrderProcessStatus
            {
              LockedBy = op.LockedBy,
              LockedUntil = op.LockedUntil,
              OrderId = op.OrderId,
              ProcessStatus = op.ProcessStatus,
              Order = new Models.Order
              {
                OrderId = op.Order.OrderId,
                UserName = op.Order.UserName,
                OrderDate = op.Order.OrderDate,
                Address = op.Order.Address,
                City = op.Order.City,
                State = op.Order.State,
                PostalCode = op.Order.PostalCode,
                Country = op.Order.Country,
                Phone = op.Order.Phone,
                Email = op.Order.Email,
                Total = op.Order.Total
              }
            }).ToList());
  }
}

Processing the Locked Orders

The NewOrderJob class now iterates over the collection of locked orders to send a suitable message for each one to the Service Bus Topic. However, it must ignore any orders that have been locked where the lock has since expired; these are left for another Worker Role to process. The following code shows how Trey Research creates the messages containing details of each new order.

C#
foreach (var orderProcess in ordersToProcess)
{
  if (orderProcess.LockedUntil < DateTime.UtcNow)
  {
    // If this orderProcess expired, ignore it and let
    // another Worker Role process it.
    continue;
  }

  // Create the new order message
  var msg = new NewOrderMessage
  {
    MessageId = orderProcess.OrderId.ToString(),
    OrderId = orderProcess.OrderId,
    OrderDate = orderProcess.Order.OrderDate,
    ShippingAddress = orderProcess.Order.Address,
    Amount = Convert.ToDouble(orderProcess.Order.Total),
    CustomerName = orderProcess.Order.UserName,
    SessionId = orderProcess.Order.UserName
  };
  ...

Implementing the Business Logic using a Service Bus Topic and Filters

When a customer places an order in the Orders application, it saves the order to the database in SQL Azure and then posts one or more messages that contain the order details to a Service Bus Topic. A summary of every order must be sent to the appropriate transport partner to advise the requirement for delivery. Details of all orders with a total value over $10,000 are sent to an on-premises service to be stored in the audit log database. The Service Bus Topic that Trey Research uses in the Orders application is configured to filter messages based on the delivery location and the total value of the order. The NewOrderJob class (the class described in the preceding sections of this chapter) is responsible for sending these messages to the Service Bus Topic. Figure 11 illustrates the message flow, including the response that transport partners send at some point in the future when the goods have been delivered.
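Trey Research's actual location-to-partner rules live in its TransportPartnerStore class, shown shortly. The sketch below invents a trivial mapping, purely to illustrate the role such a class plays: turning the order recipient's location into the property value that the Topic's filters match on. Both the mapping and the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a store that decides which transport partner
// should deliver an order, based on the recipient's state.
public class FakeTransportPartnerStore
{
    private readonly IDictionary<string, string> partnerByState;
    private readonly string defaultPartner;

    public FakeTransportPartnerStore(
        IDictionary<string, string> partnerByState, string defaultPartner)
    {
        this.partnerByState = partnerByState;
        this.defaultPartner = defaultPartner;
    }

    public string GetTransportPartnerName(string state)
    {
        return this.partnerByState.TryGetValue(state, out var partner)
            ? partner
            : this.defaultPartner; // fall back for states with no local partner
    }
}
```

Keeping this decision in one class means the sender only ever stamps a property value; the Topic's filters, not the sender, decide which subscriber receives the message.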

Figure 11
Overview of the messages sent to and from transport partners and to the audit log service

The use of Filters in a Service Bus Topic allows you to implement rudimentary business logic. It also allows you to decouple senders and receivers by allowing a varying number of subscribers to listen to a Topic, each receiving only messages that are targeted to them by a Filter in the Topic, and even modify the properties of messages as they pass through the Topic. However, Topics and Filters do not provide a general purpose workflow mechanism, and you should not try to implement complex logic using Service Bus Topics.

The NewOrderJob class must, therefore, add the required properties to the message so that the Service Bus Topic can filter them and post them to the appropriate subscribers. The following code shows how the NewOrderJob class uses a separate class named TransportPartnerStore to get the appropriate transport partner name. The TransportPartnerStore class is used to decide which transport partner a message should be sent to; the choice is based on the location of the order recipient. The code then creates a BrokeredMessage instance that contains the message to send, and adds the required properties to the BrokeredMessage instance.

C#
...
// Set values that are used for filtering subscriptions
var transportPartnerName = this.transportPartnerStore
    .GetTransportPartnerName(orderProcess.Order.State);

var brokeredMessage = new BrokeredMessage(msg)
{
  CorrelationId = msg.OrderId.ToString(),
  Properties =
  {
    { "TransportPartnerName", transportPartnerName },
    { "ServiceBusNamespace", this.serviceBusNamespace },
    { "OrderAmount", orderProcess.Order.Total }
  },
  ReplyTo = this.replyQueueName
};
...

Notice that the code includes the Service Bus namespace as a property on the BrokeredMessage instance; this allows the receiver to discover which namespace was used to send the message. The code above also specifies the ReplyTo property of the message so that the receiver can send back the message that acknowledges receipt of this message to the appropriate Service Bus Queue.

Sending the New Order Messages

The NewOrderJob class is now ready to send the order to the Service Bus Topic. It does this by calling the Send method of the ServiceBusTopic instance it created and saved in the variable named newOrderMessageSender. The message will be sent asynchronously, so the NewOrderJob class must also assemble an object that can be passed as the state in the asynchronous method calls. This object must contain the order ID (so that the retry mechanism can update the correct order status rows in the OrderStatus table) and the name of the transport partner (so that it can be displayed when customers view their existing orders).

C#
...
var objectState = new Dictionary<string, object>();
objectState.Add("orderId", orderProcess.OrderId);
objectState.Add("transportPartner", transportPartnerName);
...

As you saw in the section "Sending Messages to a Service Bus Topic" earlier in this chapter, the Send method of the ServiceBusTopic class takes four parameters: the BrokeredMessage instance to send, the asynchronous state object, and two Action methods (one to execute after a message is sent and one to execute if sending fails). The NewOrderJob class defines lambda statements for the two actions, as shown in the following code.

processStatusStore.Connection. transportPartner).CreateContext()) { try { using (var t = new TransactionScope()) { // avoids the transaction being promoted. string transportPartner) { using (var database = TreyResearchModelFactory. and updating the ProcessStatus column value to “processed”. }.ExecuteAction( () => database.Dispose(). The second action extracts only the order ID from the asynchronous state object. (exception. after it sends a message. obj) => { var objState = (IDictionary<string. object>)obj. this.sqlConnectionRetryPolicy.SendComplete(orderId. this. It must update the Order table with the name of the transport partner that will deliver the order. with the correct timestamp. // Update the OrderProcessStatus table row var processStatus = . } The first action.Dispose(). Completing the Reliable Messaging Send Process The methods of the ProcessStatusStore class the NewOrderJob code calls. It must add a new row to the OrderStatus table to show the current status of the order (“Order sent to transport partner”). and passes it and the current Exception instance to the UpdateWithError method instance.= (string)objState["transportPartner"]. orderId). this. extracts the order ID and transport partner name from the asynchronous state object.Open()). and passes these to the SendComplete method of the ProcessStatusStore class instance that the NewOrderJob class is using.UpdateWithError( exception. }).processStatusStore. The SendComplete method is the more complex as it needs to update three tables:  It must update the matching row in the OrderProcessStatus table to unlock it by setting the values of the LockedBy and LockedUntilcolumns to null. brokeredMessage.   C# public void SendComplete(Guid orderId. which is executed after the message has been sent successfully. var orderId = (Guid)objState["orderId"]. brokeredMessage. are shown next.

// Add a new row to the OrderStatus table var status = new OrderStatus { OrderId = orderId. this. processStatus. Status = "Order sent to transport partner".ExecuteAction( () => database.SingleOrDefault( o => o.OrderId == orderId)).this.SingleOrDefault( o => o. It need only update the matching OrderProcessStatus table row with the value “error” and set the values of the LockedBy and LockedUntil columns to null. database.OrderProcessStatus .SaveChanges()).CreateContext()) { // Update the OrderProcessStatus table row var processStatus = this. } } } The UpdateWithError method is much simpler.TransportPartner = transportPartner.ExecuteAction( () => database.OrderStatus.SingleOrDefault( o => o.. ..OrderId == orderId)).SaveChanges()). Timestamp = DateTime.OrderId == orderId)). processStatus.ExecuteAction( () => database. order.sqlCommandRetryPolicy.sqlCommandRetryPolicy.UtcNow }.Order. Guid orderId) { using (var database = TreyResearchModelFactory.OrderProcessStatus.ExecuteAction( () => database.AddObject(status).ProcessStatus = "error".LockedBy = null.ProcessStatus = "processed".SaveChanges()).sqlCommandRetryPolicy.sqlCommandRetryPolicy.LockedUntil = null. } } catch (UpdateException ex) { . processStatus.ExecuteAction( () => database.Complete(). C# public void UpdateWithError(Exception exception. processStatus. this.sqlCommandRetryPolicy. this.ExecuteAction( () => database. t. // Update the Order table row var order = this.sqlCommandRetryPolicy.

This completes the process of sending the order message and updating the database tables. You saw parts of this mechanism in the previous sections of this chapter. When the transport partner sends back an acknowledgement message to indicate that the message was received, the retry mechanism can complete the cycle by updating the database tables again. You will see how this works in the remaining part of this chapter.

Subscribing to a Service Bus Topic

One of the major advantages of using Service Bus Topics to distribute messages is that it provides a level of decoupling between the sender and receivers; receivers must subscribe to receive messages. The sender can construct a message with additional properties that Filters within the Topic use to redirect the message to specific receivers. New receivers can subscribe to a Topic and receive messages that match the appropriate filtering criteria, and the number of subscribers is independent of the Topic. For example, Trey Research could add additional subscribers that listen for messages and pass them to auditing or other types of services. Trey Research can also add new transport partners and arrange to send messages to these new partners simply by editing the Filter criteria in the Topic.

In the Orders application, Trey Research uses a single Service Bus Topic for each deployed instance of the application (in other words, there is one Topic per datacenter). All of the transport partners and the on-premises audit log service subscribe to all of these Topics, and receive messages destined for them based on the Filter rules Trey Research establishes for the order value and choice of transport partner.

The ServiceBusSubscription and ServiceBusSubscriptionDescription Classes

Trey Research implemented two custom classes (located in the Communication folder of the Orders.Shared project) for creating subscriptions. The ServiceBusSubscriptionDescription class defines the properties for a subscription: the Service Bus namespace, the Topic name, subscription name, credential issuer name, and the default key. The constructor of the ServiceBusSubscription class accepts a populated instance of the ServiceBusSubscriptionDescription class and generates a subscription to the specified Topic, as shown in the following code. Notice how it also creates a MessageReceiver for the Topic subscription.

C#
public ServiceBusSubscription(
  ServiceBusSubscriptionDescription description)
{
  this.description = description;
  this.tokenProvider = TokenProvider.CreateSharedSecretTokenProvider(
    this.description.Issuer, this.description.DefaultKey);
  var runtimeUri = ServiceBusEnvironment.CreateServiceUri(
    "sb", this.description.Namespace, string.Empty);
  var messagingFactory = MessagingFactory.Create(
    runtimeUri, this.tokenProvider);
  this.receiver = messagingFactory.CreateMessageReceiver(
    this.description.TopicName.ToLowerInvariant()
      + "/subscriptions/"
      + this.description.SubscriptionName.ToLowerInvariant(),
    ReceiveMode.PeekLock);
}

The ServiceBusSubscription class is used by the transport partners and the on-premises audit log service to connect to the Service Bus Topic exposed by the Orders application. The constructor of the abstract OrderProcessor class (in the Connectivity folder of the TransportPartner project) creates and populates an instance of the ServiceBusSubscriptionDescription class. The OrderProcessor class must be able to receive messages from more than one Topic if there are multiple instances of the Orders application running in more than one datacenter. A list of the Service Bus namespaces for these datacenters is stored in configuration for each transport partner, and so the code must iterate over all of these, creating a subscription for each one as shown in the following extract. The code also calls the GetReceiver method of its ServiceBusSubscription instance, which simply returns a reference to the MessageReceiver object for that Topic.

C#
foreach (var serviceBusNamespace in serviceBusNamespaces)
{
  this.serviceBusSubscriptionDescription =
    new ServiceBusSubscriptionDescription
    {
      Namespace = serviceBusNamespace,
      TopicName = topicName,
      SubscriptionName = subscriptionName,
      Issuer = transportPartnerName,
      DefaultKey = key
    };
  var serviceBusSubscription = new ServiceBusSubscription(
    this.serviceBusSubscriptionDescription);
  var receiverHandler =
    new ServiceBusReceiverHandler<NewOrderMessage>(
      serviceBusSubscription.GetReceiver())
    {
      MessagePollingInterval = TimeSpan.FromSeconds(2)
    };
  ...
}
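The filter-based routing that these subscriptions rely on can be illustrated with a small in-memory simulation. This is not the Service Bus API — the subscription names, property names, and rules below are invented for illustration — but it shows the essential behavior: one published message is delivered to every subscription whose filter matches the message's properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory simulation of Topic filters: each subscription has a rule
// evaluated against the properties attached to a published message.
var subscriptions = new Dictionary<string, Func<IDictionary<string, object>, bool>>
{
    ["ContosoTransport"] = p => (string)p["TransportPartnerName"] == "Contoso",
    ["FabrikamTransport"] = p => (string)p["TransportPartnerName"] == "Fabrikam",
    ["AuditLog"] = p => (double)p["OrderAmount"] >= 10000.0
};

var inboxes = subscriptions.Keys.ToDictionary(name => name, _ => new List<string>());

void Publish(string body, IDictionary<string, object> properties)
{
    foreach (var (name, filter) in subscriptions)
        if (filter(properties))
            inboxes[name].Add(body);
}

// A large order shipped by Contoso matches two subscriptions at once.
Publish("order-1", new Dictionary<string, object>
{
    ["TransportPartnerName"] = "Contoso",
    ["OrderAmount"] = 12500.0
});

Console.WriteLine(
    $"Contoso={inboxes["ContosoTransport"].Count}, " +
    $"Fabrikam={inboxes["FabrikamTransport"].Count}, " +
    $"Audit={inboxes["AuditLog"].Count}");
// Contoso=1, Fabrikam=0, Audit=1
```

In the real system the rules are SQL-like filter expressions held by the Topic itself, so the sender needs no knowledge of how many subscribers exist or which rules they use.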

Receiving a Message from a Topic

To receive new order messages from each Topic, the OrderProcessor class calls the ProcessMessages method of the ReceiverHandler instance it just retrieved and passes to it a lambda expression that will be executed when a message is received. This expression executes the ProcessMessage method for each received message.

C#
receiverHandler.ProcessMessages(
  (message, queueDescription, token) =>
  {
    return Task.Factory.StartNew(
      () => this.ProcessMessage(message, queueDescription),
      this.tokenSource.Token,
      TaskCreationOptions.None,
      context);
  },
  this.tokenSource.Token);

The message polling interval you specify for receiving messages from a Queue or Topic must take into account variables specific to your environment (such as CPU processing power) and the expected volume of work (such as the number of orders to process and number of worker role instances).

The ProcessMessage method, defined in the OrderProcessor class, calls another routine named ProcessOrder that submits the order to the transport provider's system. In the real world, the code that processes received messages will interact with the system for that partner using the partner's own specific mechanism and interface. However, for simplicity, the example code simulates this by using Windows Forms applications; the methods simply display the order in the Windows Forms application and generate new random tracking numbers. Trey Research implemented two concrete classes named Connector and Adapter that inherit the abstract OrderProcessor class and override the ProcessOrder method. For a more detailed discussion of the architecture for communicating with external partners, see the section "Interacting with Partner-specific Services" in Chapter 4, "Deploying Functionality and Data to the Cloud."

The following code is the virtual method that sends the acknowledgement message by calling the SendOrderReceived method. It also sends the tracking ID generated by the transport provider's system back to the Orders application in the message that acknowledges receipt of the "new order" message. The SendOrderReceived method uses another method, named SendToUpdateStatusQueue, to send the message to a Service Bus Queue.

C#
protected virtual void ProcessMessage(
  NewOrderMessage message,
  ServiceBusQueueDescription queueDescription)
{
  var trackingId = this.ProcessOrder(message, queueDescription);
  if (trackingId != Guid.Empty)
  {
    var statusMessage = string.Format(
      "{0}: Order Received", this.TransportPartnerDisplayName);
    this.SendOrderReceived(
      message, queueDescription, statusMessage, trackingId);
  }
}

Completing the Reliable Messaging Retry Process

The message sent by the transport partner contains the tracking ID and the text "Order Received". When a Worker Role in the Orders application receives this message, it updates the Order table to include the message text, and adds a row to the OrderStatus table to indicate that the transport partner received the message. For more information about how the Orders application receives messages using a ReceiverHandler instance, see the section "Receiving an Order Message and Sending an Order Status Message in a Transport Partner" in Chapter 6, "Implementing Cross-Boundary Communication."

The audit log service in the on-premises management application also subscribes to the Service Bus Topic exposed by the Orders application. The way it does this is very similar to the approach you have already seen in this chapter for transport partners. You can examine the relevant code in the AuditController class (located in the Controllers folder of the HeadOffice project).

Summary

This chapter describes how to use Service Bus Topics and Subscriptions to intelligently route messages to services. An application can post messages to a Topic and include metadata that a filter can use to determine which Subscriptions to route the message through. Services listening on these Subscriptions then receive all matching messages. This simple but powerful mechanism enables you to address a variety of scenarios and easily construct elegant solutions for these scenarios.

As well as a description of the use cases for Service Bus Topics and Subscriptions, the challenges that they help you to overcome, and the cross-cutting concerns related to this technology, the chapter describes how Trey Research used Service Bus Topics and Subscriptions to add rudimentary business logic and message routing capabilities to their application, including implementing a custom retry mechanism to ensure reliable message transmission and processing of customer orders. Finally, the chapter provides guidelines on using Service Bus Topics and Subscriptions in your own applications.

More Information

"Topics, and Subscriptions" at http://msdn.
"Using Service Bus Topics and Subscriptions with WCF" at com/b/tomholl/archive/2011/10/09/
"Service Bus REST API Reference" at
"How to Simplify & Scale Inter-Role Communication Using Windows Azure Service Bus" at http://windowsazurecat.

For links and information about the technologies discussed in this chapter see Chapter 1, "Introduction" of this guide.

8 Replicating, Distributing, and Synchronizing Data

All but the most trivial of applications have a requirement to store and retrieve data. For many systems, this aspect employs a database to act as a reliable repository. Modern database management systems, such as Microsoft SQL Server, provide multi-user access capable of handling many thousands of concurrent connections if the appropriate hardware and network bandwidth is available. To support highly-scalable data storage that reduces the need to install and maintain expensive hardware within an organization's data center, solutions such as SQL Azure provide a cloud-based database management system that implements many of the same features.

Using SQL Azure, you can deploy a database to the same datacenter hosting the cloud-based applications and services that use it, which helps to minimize the network latency frequently associated with remote database access. However, in a hybrid system that spans applications running in a variety of distributed locations, using a single instance of SQL Azure in isolation may not be sufficient to ensure a good response time. Instead, an organization might decide to maintain a copy of the database at each location. In this scenario, it is necessary to ensure that all instances of the database contain the same data. This can be a non-trivial task, especially if the data is volatile.

This chapter examines the issues concerned with distributing and replicating data between services running in the cloud and across the cloud/on-premises divide by using Azure technologies. It describes some possible solutions that an organization can implement to keep the data sources synchronized. Additionally, you may decide to store data in a different repository; for example, you may choose to use a different database management system or implement a different form of data source. In these cases, you may need to implement your own custom strategies to synchronize the data that they contain.

Use Cases and Challenges

A primary motivation for replicating data in a hybrid cloud-based solution is to reduce the network latency of data access by keeping data close to the applications and services that use it, thereby improving the response time of these applications and services. As described previously, if a service running in a datacenter uses data held in a local database stored in the same datacenter then it has eliminated the latency and reliability issues associated with sending and receiving messages across the Internet. However, these benefits are not necessarily cost-free as you must copy data held in a local database to other datacenters, and ensure that any changes to this data are correctly synchronized across all datacenters.

You must also consider that replicating data can introduce inconsistencies. When you modify data, the same modification must be made to all other copies of that data and this process may take some time. Fully transactional systems implement procedures that lock all copies of a data item before changing them, only releasing this lock when the update has been successfully applied across all instances. However, in a globally distributed system such an approach is infeasible due to the inherent latency of the Internet, so most systems that implement replication update each site individually. After an update, different sites may see different data but the system becomes "eventually consistent" as the synchronization process ripples the data updates out across all sites.

Consequently, replication is best suited to situations where data changes relatively infrequently or the application logic can cope with out-of-date information as long as it is eventually updated, possibly minutes or even hours later. For example, in an application that enables customers to place orders for products, the application may display the current stock level of each product. The number displayed is likely to be inaccurate if a product is popular; other concurrent users may be placing orders for the same product. At some point, when a user actually submits an order, the stock level should be checked again, and if necessary the customer can be alerted that there may be a delay in shipping if there are none left.

Depending on the strategy you choose to implement, incorporating replication and managing eventual consistency is likely to introduce complexity into the design, implementation, and management of your system. When you are considering replicating data, there are two main issues that you need to focus on:

- Which replication topology should you use?
- Which synchronization strategy should you implement?

The selection of the replication topology depends on how and where the data is accessed, while the synchronization strategy is governed by the requirements for keeping data up-to-date across replicas. Choosing an appropriate replication topology can have a major impact on how you address these problems. The following sections describe some common use cases for replicating, distributing, and synchronizing data and summarize the key challenges that each use case presents.

Replicating Data across Data Sources in the Cloud and On-premises

Description: Data must be positioned close to the application logic that uses it, whether this logic is running at a datacenter in the cloud or on-premises.

Replication is the process of copying data, and the problems associated with replication are those of managing and maintaining multiple copies of the same information. In its simplest form, the implementation of this use case copies all data in a data source to all other instances of the same data source, whether these data sources are located in the cloud or on-premises. In this case, applications running on-premises and services running in the cloud may be able to query and modify any data. They connect to the most local instance of the data source, perform queries, and make any necessary updates. At some point, these updates must be transmitted to all other instances of the data source, and these updates must be applied in a consistent manner. Figure 1 illustrates this topology, referred to as "Topology A" throughout this chapter.
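The "eventually consistent" behavior described above can be modeled in a few lines. The site names and the single-step synchronization below are deliberate simplifications invented for illustration; the point is only that an update is visible immediately at the site where it is made, and at the other replicas only after the synchronization process has rippled it out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of eventual consistency: each site holds a replica of a stock
// level, updates are applied locally first, and a separate synchronization
// step propagates them to the other sites later.
var replicas = new Dictionary<string, int>
{
    ["US-datacenter"] = 100,
    ["EU-datacenter"] = 100,
    ["On-premises"] = 100
};
var pending = new List<(string Origin, int NewValue)>();

void UpdateLocally(string site, int newStock)
{
    replicas[site] = newStock;          // visible immediately at this site only
    pending.Add((site, newStock));      // queued for later propagation
}

void Synchronize()
{
    foreach (var update in pending)
        foreach (var site in replicas.Keys.ToList())
            replicas[site] = update.NewValue;
    pending.Clear();
}

UpdateLocally("US-datacenter", 97);     // a customer buys three items

// Before synchronization the sites disagree...
var divergent = replicas.Values.Distinct().Count() > 1;

// ...and after the ripple completes, every replica reports 97.
Synchronize();
var converged = replicas.Values.All(v => v == 97);

Console.WriteLine($"divergent before sync: {divergent}, converged after: {converged}");
// divergent before sync: True, converged after: True
```

A real synchronization process must also deal with conflicting updates made at different sites between synchronizations, which this sketch deliberately ignores.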

Figure 1
Topology A: Database located on premises with replicas in the cloud

If your principal reason for moving data services to the cloud is purely for scalability and availability reasons you might conclude that the data sources should just be removed from your on-premises servers, relocated to the cloud, and duplicated across all datacenters. Such a strategy might be useful if the bulk of the application logic that accesses the data has previously been migrated to the cloud. The same concerns surrounding updating data and propagating these changes consistently, as described in Topology A, still apply; the only difference is that there is no data source located on-premises. Figure 2 shows this scenario.

Figure 2
Topology B: Database replicas in the cloud but no database on-premises

Although Topology A and Topology B are simple to understand, such blanket strategies might not always be appropriate, especially if the data naturally partitions itself according to the location of the services that most use it. For example, consider a stock control system for an organization that maintains several warehouses at different geographical locations. A service running at a datacenter in the same geographical region as a warehouse might be responsible for managing only the data in that warehouse. In this case, it may be sensible to replicate just the data pertaining to that warehouse to the datacenter hosting the corresponding service instance, while retaining a copy of the entire database on-premises. The on-premises database effectively acts as a master repository for the entire system, while the databases running at each datacenter act as a cache holding just the local data for that datacenter.

This approach reduces the need to copy potentially large amounts of data that is rarely used by a datacenter, at the cost of developing the additional logic required in the code for the service to determine the location of the data. If a service should need to access data held elsewhere, it can query the on-premises database for this information. When a service modifies data in its local database, it can arrange to make the same change to the on-premises database. This approach also assumes that each item in a data source is managed exclusively by one and only one datacenter, otherwise there is a risk of losing data updates (services running at two datacenters might attempt to update the same item). However, if a service regularly requires query access to non-local data or makes a large number of updates then the advantages of adopting this strategy over simply accessing the on-premises data source for every request are reduced.

Chapter 9, "Maximizing Scalability, Availability, and Performance" describes additional ways to implement a cache by using the Windows Azure Caching service.
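The local-cache-with-master-fallback lookup described above can be sketched as follows. The warehouse names and key scheme are illustrative assumptions, not part of the sample application; the point is that a read is served from the same-datacenter replica when the data is local, and falls back to the on-premises master otherwise.

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch of the partitioned-cache idea: the on-premises master
// holds all warehouses, while a cloud replica holds only its own partition.
var master = new Dictionary<string, int>          // on-premises: everything
{
    ["seattle/widget"] = 120,
    ["london/widget"] = 80
};
var seattleReplica = new Dictionary<string, int>  // cloud: local partition only
{
    ["seattle/widget"] = 120
};

int GetStock(string key) =>
    seattleReplica.TryGetValue(key, out var local)
        ? local          // served from the local (same-datacenter) replica
        : master[key];   // non-local data requires a round-trip on-premises

Console.WriteLine($"{GetStock("seattle/widget")}, {GetStock("london/widget")}");
// 120, 80
```

The cost trade-off in the text is visible here: every miss is a remote call, so the scheme only pays off when most reads hit the local partition.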

Figure 3
Topology C: Local cloud databases, with the on-premises database acting as a master repository

In a variation on this scenario, shown in Figure 4, services running in the cloud primarily query the data, and refer updates to an application running on-premises. This approach is useful if each data item is not exclusively managed by a specific datacenter. If a service running in the cloud needs to modify data, it can send a request to the on-premises application, which is designed to listen for and handle such requests. The on-premises application can modify the information in the master data source, also hosted on-premises, and then arrange for the corresponding changes to be copied out to the appropriate databases running in the cloud. This scheme keeps the logic for the services in the cloud relatively straightforward (compared to Topology C) at the expense of providing an on-premises application to manage updates. This scheme also ensures that all data sources in the cloud are eventually consistent by virtue of the replication process from the master data source.

Figure 4
Topology D: Local cloud databases, with the on-premises database acting as a master repository accessed through an on-premises service

In a simple variant of Topology D, the application running on-premises updates the master database of its own volition rather than in response to requests from services in the cloud. For example, the application may be performing some data maintenance tasks under the direct control of a user at the host organization. In this case, the updated data is simply published to each of the databases in the cloud.

A final use case concerns an organization spread across multiple sites not in the cloud. The organization retains the data and the applications that use it on-premises, but replicates this data between sites through the cloud. In this case, the cloud simply acts as a conduit for passing data between sites. You can apply this technique to situations such as remote branch offices which may require either a complete copy of the data, or just the subset that relates to the branch office. In either case, applications running at each branch office access the local data source hosted at the same site, and any updates are propagated through the cloud. A copy of the data source in the cloud through which all updates pass can act as the replication hub which gathers all updates and redistributes them, as well as performing the role of a master repository if a branch office requires access to non-local data.

Figure 5
Topology E: Replicating head office and branch office databases through the cloud

Synchronizing Data across Data Sources

Description: Applications and services modify data, and these modifications must be propagated across all instances of the database.

Data changes; it is rarely entirely static. Applications inevitably insert, update, and delete records. In a replicated environment you must ensure that all such changes are propagated to all appropriate instances of a data source. Synchronizing data can be expensive in terms of network bandwidth requirements, and it may be necessary to implement the synchronization process as a periodic task that performs a batch of updates. Therefore you must be prepared to balance the requirements for data consistency against the costs of performing synchronization, and ensure that your business logic is designed with eventual rather than absolute consistency in mind.

In determining your synchronization strategy you should consider the following questions:

- What data do you need to synchronize and how will you handle synchronization conflicts? The answer to this question depends largely on when and where the data is updated, as described by the topologies listed in the previous section. For example, in Topology D the master data source is only modified by applications running on-premises; the on-premises master database holds the definitive copy of the data, so synchronization will be a matter of copying the changes made on-premises to each datacenter in the cloud, overwriting data held by datacenters in the cloud. This is a one-way operation (on-premises out to the cloud) with little or no possibility of synchronization conflicts. Where data is modified by services in the cloud but not by applications running on-premises (Topology C), if the data is partitioned by datacenter again there is no possibility of conflicts (services at two different datacenters will not update the same data) and the synchronization process is effectively one-way; in this case, the datacenters in the cloud hold the definitive data and overwrite the data held on-premises. Where applications and services may modify any data (Topologies A, B, and E) the synchronization process is multi-way as each database must be synchronized with every other database. The data is not partitioned, so there is a possibility that conflicting changes can occur, and you must define a strategy for handling this situation.

- When do you need to synchronize data? Does every instance of the database have to be fully up-to-date all of the time; does your system depend on complete transactional integrity all of the time? If so, then replication might not be the most appropriate solution, as synchronization will necessarily be a continuous and instant process, circulating all changes immediately that they occur and locking resources in each data source while it does so to prevent inconsistency from occurring. In this case, using a single centralized data source is a better choice than implementing replication. On the other hand, if your applications and services can live with potentially stale data, the latency associated with transmitting and applying large numbers of updates across a number of sites may mean that some of the data held in one or more databases is inconsistent until all the sites are synchronized.

- What are the expected synchronization data volumes? If there is a large amount of volatile data then replicating the effects of every insert, update, and delete operation may generate a lot of network traffic and consume considerable processing power, impacting the performance of each data source, and possibly nullifying the reason for replicating data in the first place. For example, if each service running in the cloud performs a large number of updates then maintaining a replica in the cloud becomes an overhead rather than a performance asset. To minimize the time taken and resources required to synchronize databases, you should carefully consider which data your applications and services need to replicate, whether your applications and services can live with potentially stale data, and whether any data should be read-only at each site.

Cross-Cutting Concerns

Effective data replication has a high level of dependency on the network in terms of security, reliability, and performance. System requirements and application design can also have a significant bearing on how well your chosen replication approach functions. Full multi-way synchronization between replicated relational databases can be a resource-intensive operation. The following sections provide a summary of the possible issues.

Data Access Security

Each data source, whether it is a SQL Azure database or some other repository, must protect the data that it contains to prevent unauthorized access. This requirement applies during the synchronization process as well as during the regular data access cycle. The network packets containing the data being replicated must also be protected, as a security breach at this point could easily propagate corrupted information to multiple instances of your data.

Integrity and Reliability

Even in a solution where immediate data consistency is not a critical requirement, the system must still update data in a reliable manner to preserve the integrity of the information the application presents. For example, in the orders application cited earlier, if the system accepts an order from a customer then that order should be fulfilled; the data comprising the order should not be lost and the order process must be completed. Therefore, any solution that replicates data between databases must implement a reliable mechanism for transporting and processing this data. If some aspect of the system handling the synchronization process fails, it should be possible to restart this process without losing or duplicating any data. Don't forget that the network is a critical component that impacts reliability; a reliable solution is resilient in the face of network failure.

Data Consistency and Application Responsiveness

Data consistency and application responsiveness are conflicting requirements that you must balance to address the needs of your users. If you require a high level of consistency across all replicated databases, then you must take steps to prevent competing applications from accessing data that may be in a state of flux somewhere in the system. This approach depends on application logic locking data items and their replicas before changing them, and then releasing the locks. While the data is locked, no other applications can access it, adversely affecting the responsiveness of the system from the users' perspective. As mentioned elsewhere in this chapter, in many distributed systems immediate and absolute consistency may not be as important as maintaining application responsiveness; users want to be able to use the system quickly, and as long as information is not lost and eventually becomes consistent then they should be satisfied.

Azure and Related Technologies

If you are implementing databases in the cloud using SQL Azure, you can configure replication and manage synchronization between these databases and SQL Server databases running on-premises by using SQL Azure Data Sync. This technology is a cloud-based synchronization service based on the Microsoft Sync Framework. Using the Azure Management Portal you can quickly configure synchronization for the most common scenarios between your on-premises SQL Server databases and SQL Azure databases running in the cloud. Additionally, SQL Azure Data Sync is compatible with the Microsoft Sync Framework 2.1, so you can use the Sync Framework SDK to implement a custom synchronization strategy and incorporate additional validation logic if necessary. The Sync Framework SDK is also useful for scenarios where you need to implement a custom synchronization approach that you cannot configure easily by using the Azure Management Portal. SQL Azure Data Sync is compatible with SQL Server 2005 Service Pack 2 and later.

Replicating and Synchronizing Data Using SQL Azure Data Sync

Using SQL Azure Data Sync to implement SQL Server synchronization provides many benefits, including:

- Elastic scalability. The SQL Azure Data Sync service runs in the cloud and scales automatically as the amount of data and number of sites participating in the synchronization process increases.

- Simple configuration. You can use the Azure Management Portal to define the synchronization topology. The portal provides wizards that step through the configuration process and enable you to specify the data to be synchronized, and filter it in a variety of ways. You can also indicate whether replication should be one-way or bi-directional. The portal provides a graphical view of your topology and its current health status through the Sync Group dashboard.

- Scheduled synchronization. You can specify how frequently the synchronization process occurs, and you can easily modify this frequency even after synchronization has been configured. Using the Azure Management Portal you can also force an immediate synchronization.

- Preconfigured conflict handling policies. SQL Azure Data Sync enables you to select how to resolve any conflicts detected during synchronization by selecting from a set of built-in conflict resolution policies.

- Comprehensive logging features. SQL Azure Data Sync logs all events and operations. The Azure Management Portal enables you to examine this information, enabling you to quickly determine the cause of any problems and take the necessary corrective action.

Alternatively, you can implement your own synchronization services if you need to synchronize data between databases located on-premises and mobile devices for roaming users. Another approach is to implement a custom mechanism that passes updates between databases using messaging, with sender and listener applications applying the logic to publish updates and synchronize them with the existing data. Service Bus Topics and Subscriptions provide an ideal infrastructure for implementing this scenario.

The following sections provide more information on how to use SQL Azure Data Sync, the Sync Framework SDK, and Service Bus Topics and Subscriptions for implementing replication in some common scenarios.

Guidelines for Configuring SQL Azure Data Sync

When you configure SQL Azure Data Sync, you must make a number of decisions concerning the definition of the data that you want to synchronize, and the location of the databases holding this data. This section provides guidance for defining the key elements of a synchronization architecture, and includes guidance on using SQL Azure Data Sync in a number of common scenarios.

Defining a Sync Group and Sync Dataset

SQL Azure Data Sync organizes the synchronization process by defining a Sync Group. A sync group is a collection of member databases that need to be synchronized, together with a hub database that acts as a central synchronization point. All member databases participating in a topology synchronize through the hub; they send local updates to the hub and receive updates made by other databases from the hub.
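Conceptually, the built-in conflict handling policies reduce to a choice of which version of a conflicting row survives. The sketch below illustrates that idea only — the real policies are configured through the Data Sync service rather than written in application code, and the policy names here are used descriptively:

```csharp
using System;
using System.Collections.Generic;

// When the same row has changed at both the hub and a member database
// since the last synchronization, a "hub wins" outcome keeps the hub's
// value and a "client wins" outcome keeps the member's value.
int Resolve(string policy, int hubValue, int memberValue) =>
    policy == "HubWins" ? hubValue : memberValue;

var hub = new Dictionary<int, int> { [1] = 500 };     // row 1 changed at the hub
var member = new Dictionary<int, int> { [1] = 450 };  // same row changed at a member

var hubWins = Resolve("HubWins", hub[1], member[1]);
var clientWins = Resolve("ClientWins", hub[1], member[1]);

Console.WriteLine($"HubWins -> {hubWins}, ClientWins -> {clientWins}");
// HubWins -> 500, ClientWins -> 450
```

Whichever outcome you choose, the losing update is silently discarded, which is why the chapter stresses designing business logic that can tolerate this.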

When you define a sync group, you also define a sync dataset specifying the tables, rows, and columns to synchronize. You do not have to select every table or column in a database, and you can define filters to restrict the rows that are synchronized. However, every table that participates in a sync dataset must have a primary key, otherwise synchronization will fail. It is also important to understand that SQL Azure Data Sync imposes some constraints on the column types in the tables that participate in the synchronization process; for example, you cannot synchronize columns based on user-defined data types, spatial data types, or CLR types. These constraints are due to the Sync Framework on which SQL Azure Data Sync is based. The Sync Framework is designed to operate with a variety of database management systems, not just SQL Server, and so the types it supports are limited to those common across the major database management systems. If you attempt to create a sync dataset that includes columns with unsupported types, these columns will be ignored and the data that they contain will not be replicated. For a full list of supported and unsupported types see “SQL Azure Data Sync – Supported SQL Azure Data Types” at http://social.technet.

You define the sync dataset when you add the first member database to the sync group. The same sync dataset applies globally across all member databases in the sync group, and the tables that underpin the sync dataset will be automatically added to subsequent member databases when they are enrolled in the sync group. However, once you have defined the sync dataset for a sync group you cannot modify the definition of this dataset; you must drop the sync group and build a new one with the new sync dataset.

When you deploy a sync group, SQL Azure Data Sync creates triggers on each table in the sync group. These triggers track the changes made to the data in each table. Note that these triggers can cause problems with some Object/Relational Management frameworks that check the number of rows affected after an operation is executed against the database. For more information about the triggers that SQL Azure Data Sync generates, see “Considerations for Using Azure Data Sync” at http://sqlcat.

Implementing the Database Schema for Member Databases

In a typical scenario, the schema of the data that you want to replicate may already exist in an on-premises or SQL Azure database. When you deploy the sync group, if the necessary tables do not already exist in the other member databases or the hub, then SQL Azure Data Sync will automatically create them based on the definition of the sync dataset. In this case, the deployment process will only generate the columns specified by the sync dataset, and will add an index to support the primary key of each table. While the deployment process does a reasonable job of replicating the schema for the sync dataset, due to the differences between SQL Azure and SQL Server it may not always be identical. For example, any indexes other than that for the primary key will not be generated, and this may have an impact on the performance of queries performed against a replicated database. Therefore, to ensure complete accuracy and avoid any unexpected results, it is good practice to create a SQL script containing the commands necessary to create each table to be replicated.
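As a concrete illustration of this practice, the sketch below runs one hand-written schema script against every member database before replication is provisioned. The table, column, and index names are hypothetical, and Python's built-in sqlite3 module stands in for SQL Server and SQL Azure purely to keep the example runnable; in practice the script would be T-SQL, created and executed with a tool such as SQL Server Management Studio.

```python
import sqlite3

# Hypothetical schema script for one replicated table. Note the
# secondary index: the sync deployment process would only create the
# primary key index, so the script carries the rest of the schema.
SCHEMA_SCRIPT = """
CREATE TABLE Customer (
    CustomerId TEXT PRIMARY KEY,   -- GUID-style key (see the conflict guidance)
    Name       TEXT NOT NULL,
    Region     TEXT
);
CREATE INDEX IX_Customer_Region ON Customer (Region);
"""

def provision_member(connection):
    """Run the same schema script against a member database so every
    member starts with an identical, fully indexed schema."""
    connection.executescript(SCHEMA_SCRIPT)
    connection.commit()

# Apply the script to each member database in turn (in-memory stand-ins).
members = [sqlite3.connect(":memory:") for _ in range(3)]
for db in members:
    provision_member(db)

# Every member now reports the same tables and indexes.
for db in members:
    objects = [row[0] for row in db.execute(
        "SELECT name FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name")]
    print(objects)
```

Running one script everywhere ensures that the indexes the deployment process would otherwise skip exist on every member before synchronization starts.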

Use a tool such as Microsoft SQL Server Management Studio to generate and edit the SQL scripts that create the tables, views, and stored procedures for each member database, together with the appropriate indexes. You can then run this script against each database in turn before provisioning replication; if you have the appropriate credentials, you can also use SQL Server Management Studio to connect to each member database and run these scripts. Additionally, you can define any views and stored procedures that each member database may require, as these cannot be replicated automatically.

SQL Azure Data Sync uses a custom provider (SqlAzureSyncProvider) to interact with the Microsoft Sync Framework synchronization engine. This provider has been written especially for SQL Azure and has been optimized to reduce the number of network round-trips that can occur when synchronizing data. The synchronization provider makes efficient use of network bandwidth by batching updates together; the default batch size is 5000 rows. Additionally, the provider cooperates with the SQL Azure throttling mechanism and will automatically reduce this batch size if necessary. Each batch of updates is applied as a transaction; either all the updates in a batch are applied successfully or they are rolled back. However, these batch transactions do not necessarily reflect the business transactions performed by your system. For example, a business transaction that modifies data in two tables may have these updates propagated by different batches when these changes are synchronized. Note also that SQL Azure Data Sync does not keep a log of every change made between synchronizations; if a row undergoes several updates between synchronizations, only the final update will be replicated.

Managing Synchronization Conflicts

During the synchronization process, SQL Azure Data Sync connects to each member database in turn to retrieve the updates performed in that database and applies them to the hub. Any updates previously applied to the hub from another member database are transmitted to the current database and applied. The hub is the focus for detecting and resolving conflicts. SQL Azure Data Sync enables you to select from two conflict resolution policies:

 Hub Wins. If the data at the hub has already been changed, then overwrite changes to this data made at the member database with the data at the hub. In effect, this means that the first member database to synchronize with the hub predominates.
 Client Wins. If the data has been changed at the member database, this change overwrites any previous changes to this data at the hub. In contrast to the Hub Wins policy, in this case the last member database to synchronize with the hub predominates.

Important: The synchronization process visits each member database in turn in a serial manner, and applies the necessary updates to synchronize that member database and the hub. Databases visited earlier will not incorporate the changes resulting from the synchronization with databases visited later; each synchronization applies only the changes in effect at that time to each database. For member databases to be fully synchronized with each other, you need to perform two synchronizations across the sync group.

You should give the conflict resolution policy careful thought, as the same policy applies across all databases in a sync group. Additionally, you specify this policy when you first create the sync group, and you cannot change it without dropping and recreating the sync group. Note also that although you can select the conflict resolution policy, you cannot currently influence the order in which databases are synchronized with the hub; the synchronization schedule for each sync group determines the order in which each member database is synchronized with the hub.

If you need to guarantee the effects of the conflict resolution policy, you can divide your replication topology into a series of sync groups, with each sync group containing the hub and a single member database. The sync group for a high priority member database, with updates that must always take precedence, can select the Client Wins conflict resolution policy so that these changes are always replicated; in this way, the changes made at the high priority database will always be replicated out to the other member databases. The policy for the other sync groups can be set to Hub Wins. You can implement many variations on this topology; for example, you can place several member databases into the Hub Wins sync group if none of these databases are likely to contain changes that conflict with each other.

Ideally, you should design your solution to minimize the chances of conflicts occurring. Remember that the primary purpose of replication is to propagate updates made at one site to all other sites, so that they all have the same view of the data. Conflict is typically a result of bi-directional synchronization; in a typical distributed scenario, applications running at different sites tend to manage their own subset of an organization's data, so the chances of conflict are reduced. To reduce the chances of conflicts occurring, you can configure one-way replication and specify the synchronization direction for each member database in a sync group, relative to the hub. For more information, see the section “Selecting the Synchronization Direction for a Database” later in this chapter.

Pay careful attention to the definition of the columns in replicated tables, as this can have a significant impact on the likelihood of conflict. For example, if you define the primary key column of a replicated table with the SQL Server IDENTITY attribute, SQL Server will automatically generate values for this column in a monotonically increasing sequence, typically starting at 1 and incrementing by 1 for each newly inserted row. If rows are added at multiple member databases in a sync group, several of these rows might be given the same primary key value, and they will collide when the tables are synchronized. Only one of these rows will win and the rest will be removed. The results could be disastrous if, for example, this data represented orders for different customers; you will have lost the details of all the orders except for the winner selected by the conflict resolution policy! To avoid situations such as this, do not use columns with automatically generated key values in replicated tables; instead, use a value that is guaranteed to be unique, such as a GUID.
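The difference between the two policies can be seen in a small simulation. This is not SQL Azure Data Sync code; it is a hypothetical model of a hub resolving one conflicting row, but it shows why Hub Wins favors the first member to synchronize in a cycle and Client Wins favors the last.

```python
def synchronize(hub, changed_keys, member, policy):
    """Apply one member's pending changes to the hub, resolving any
    conflict according to the sync group's policy."""
    for key, value in member.items():
        if key in changed_keys and hub.get(key) != value:
            # The hub row was already updated in this synchronization
            # cycle by an earlier member: this is a conflict.
            if policy == "client_wins":
                hub[key] = value   # the later member overwrites the hub
            # Under hub_wins, the value already at the hub is kept and
            # will overwrite the member's change instead.
        else:
            hub[key] = value
            changed_keys.add(key)

# Both members updated row 42 since the last synchronization.
results = {}
for policy in ("hub_wins", "client_wins"):
    hub, changed_keys = {}, set()
    synchronize(hub, changed_keys, {42: "from member A"}, policy)  # syncs first
    synchronize(hub, changed_keys, {42: "from member B"}, policy)  # syncs last
    results[policy] = hub[42]

print(results)
# {'hub_wins': 'from member A', 'client_wins': 'from member B'}
```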

Locating and Sizing the Hub Database

The hub must be a SQL Azure database. Apart from acting as the focus around which the synchronization process revolves, it holds the definitive and most up-to-date version of the data. The hub contains exactly the same data as any other SQL Azure member database in the sync group; you can insert, update, and delete data in this database, and these changes will be replicated throughout the sync group. You create the hub database manually, and it should be at least as big as the largest of the member databases. SQL Azure does not currently support automatic growth for databases, so if you make the hub database too small, synchronization could fail. Additionally, SQL Azure Data Sync creates additional metadata tables in your databases to track the changes made, and you must take these tables into account when sizing the hub database.

The location of this database is key to maintaining the performance of the synchronization process. Every other member database in the sync group will periodically synchronize with this database, whether these databases are located on-premises or in the cloud, and the time taken to complete the synchronization process will depend on the location of the hub database. Ideally, you should store it at a datacenter that is geographically closest to the most active member databases; this will help to reduce the network latency associated with transmitting potentially large amounts of data across the Internet. If your databases are distributed evenly around the world and the volume of database update and query traffic is roughly the same for each one, then you should position the hub at the datacenter closest to your highest priority sites.

In some situations, you can elect to use one of the SQL Azure databases originally envisaged as a member of the sync group as the hub. For example, you may opt to designate the SQL Azure database for the most active site as the hub. This strategy can help to minimize the network latency and thereby improve the performance of the synchronization process. However, the work involved in performing the synchronization operations may impact the performance of this database, so you must strike a balance between the overhead associated with a database being the hub of a sync group against the time required to synchronize this database with a hub located elsewhere.

As described previously, SQL Azure Data Sync uses a Data Sync Server to replicate and synchronize data between your databases through the hub. You can provision a single Data Sync Server for each Azure subscription that you own, and you can specify the region in which to run this server. Ideally, you should locate this server in the same region that you plan on using for hosting the hub database.

Specifying the Synchronization Schedule for a Sync Group

Synchronization is a periodic process rather than a continuous operation; you can specify a simple synchronization schedule for a sync group, and you can also force synchronization to occur manually by using the Azure Management Portal. If you set a synchronization schedule, you must select a synchronization frequency that is appropriate to your solution. If it is too long, then member databases may contain outdated information for an extended period, while if it is too short, a previous synchronization might not have completed and the attempt will fail. The time taken to complete a synchronization will also depend on the volume of data to be synchronized: the longer the interval between synchronizations, the more data will need to be synchronized and transmitted to and from the hub, especially if the tables in the sync dataset contain a complex collection of indexes. Additionally, as the synchronization period increases, the more likely it is that conflicts will occur, and the synchronization process will have to expend effort resolving these conflicts, which will increase the time taken still further. You may need to determine the most suitable synchronization period by experimentation.
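To make this trade-off concrete, the following back-of-the-envelope sketch estimates the load of a single synchronization for different intervals. The change rate and row size are hypothetical, the 5000-row figure is the default batch size mentioned earlier, and the sketch simplifies by assuming every changed row is distinct (repeated updates to one row are replicated only once).

```python
import math

ROWS_CHANGED_PER_HOUR = 12_000   # hypothetical workload
BYTES_PER_ROW = 200              # hypothetical average row size
BATCH_SIZE = 5_000               # default SQL Azure Data Sync batch size

def per_sync_load(interval_hours):
    """Rows, batches, and megabytes moved in one synchronization."""
    rows = ROWS_CHANGED_PER_HOUR * interval_hours
    batches = math.ceil(rows / BATCH_SIZE)
    megabytes = rows * BYTES_PER_ROW / 1_000_000
    return rows, batches, megabytes

for hours in (1, 6, 24):
    rows, batches, mb = per_sync_load(hours)
    print(f"every {hours:>2}h: {rows:>7} rows, {batches:>3} batches, {mb:.1f} MB")
```

A longer interval means fewer synchronization runs, but each run moves more data (and incurs larger SQL Azure data-transfer charges), takes longer, and is more likely to encounter conflicts.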

As with the conflict resolution policy, the synchronization schedule applies globally across all databases in the sync group. However, you can modify this schedule at any time, so you can observe the effects of synchronizing data at different intervals, select the period most appropriate to your requirements, and tune it accordingly as you establish the data timeliness requirements of your applications and services. You should also consider that SQL Azure charges are applied to data that is moved in and out of SQL Azure datacenters; therefore, you may decide that tables that are stable, or that possess data that doesn't need to be refreshed frequently, should only be synchronized on a daily or weekly schedule.

Selecting the Synchronization Direction for a Database

When you add a member database to a sync group, you specify the synchronization direction. Synchronization can be:

 Bi-directional. The member database can make changes and upload them to the hub, and it can also receive updates from the hub. This is likely to be the most common form of synchronization implemented by many organizations.
 To the hub. The member database can make changes and upload them to the hub, but it will not receive changes from the hub. This form of synchronization is useful for situations such as Topology D described earlier in this chapter, in which a service running in the cloud updates the local member database and also copies changes to the database running on-premises as they occur. The on-premises database can be configured to synchronize to the hub; it does not need to be synchronized from the hub, as it already contains all the updates.
 From the hub. The member database can receive changes from the hub, but it will not upload any local changes to the hub. Again, this form of synchronization is useful for implementing scenarios similar to Topology D. Any changes made locally will have already been made to the on-premises database by the services running in the cloud, so the only changes that need to be replicated to a member database are those originating from other services located elsewhere, which have also updated the on-premises database. The member databases can therefore be configured to synchronize from the hub; when synchronization occurs, the changes made by each service in the cloud can be propagated out to the member databases for the other services via the hub.

The synchronization direction is an attribute of each member database, and each database in a sync group can specify a different synchronization direction. Figure 6 depicts an updated version of Topology D with the hub database and Data Sync Service required by SQL Azure Data Sync. Finally, note that in the diagrams in this section the hub is shown as a separate SQL Azure database hosted in a separate datacenter. However, it is also possible that any of the SQL Azure member databases might take on the role of the hub, in which case a separate hub database is not necessary; see the section “Locating and Sizing the Hub Database” earlier in this chapter.

Figure 6
Specifying the synchronization direction for databases participating in Topology D

Avoiding Sync Loops

A member database can participate in more than one sync group. However, such a configuration can result in a sync loop. A sync loop occurs when the synchronization process in one sync group results in synchronization being required in another sync group, and when this second synchronization occurs, the configuration of this sync group results in synchronization being required again in the first group, which again may render synchronization necessary in the second group, and so on. Sync loops are self-perpetuating and can result in large amounts of data being repeatedly written, resulting in degraded performance and increased costs.

When you define a sync group, you must be careful to ensure that sync loops cannot exist. You can avoid sync loops by evaluating the role of any databases that participate in multiple sync groups, by using row filtering to prevent the same rows in a table participating in different sync groups, by selecting the conflict resolution policy for each sync group appropriately, and by setting the synchronization direction for each database carefully. For a more detailed description of sync loops and the circumstances under which they can occur, see “SQL Azure Data Sync – Synchronization Loops” at http://social.technet.
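One way to check a planned topology for loops before deploying it is to model the data flow of each sync group as directed edges and search for a cycle that needs more than one sync group to close. The sketch below uses this idea; it is a simplified, hypothetical model that ignores row filters and conflict resolution policies, both of which can also break a potential loop.

```python
from itertools import combinations

def has_cycle(edges):
    """Recursive DFS cycle detection (white/gray/black coloring) on a
    directed graph given as (source, destination) edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY or (state == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

def find_sync_loops(sync_groups):
    """Report pairs of sync groups whose combined data flow contains a
    cycle that neither group contains on its own."""
    loops = []
    for (a, edges_a), (b, edges_b) in combinations(sync_groups.items(), 2):
        if (has_cycle(edges_a + edges_b)
                and not has_cycle(edges_a) and not has_cycle(edges_b)):
            loops.append((a, b))
    return loops

# Hypothetical topology: database B belongs to two sync groups.
# Group 1 pushes A -> HubX -> B; group 2 pushes B -> HubY -> A, so an
# update can circulate through both groups indefinitely.
sync_groups = {
    "group1": [("A", "HubX"), ("HubX", "B")],
    "group2": [("B", "HubY"), ("HubY", "A")],
}
print(find_sync_loops(sync_groups))   # [('group1', 'group2')]
```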

Guidelines for Using SQL Azure Data Sync You can use SQL Azure Data Sync to implement the replication topologies described earlier in this chapter. You can apply these topologies with SQL Azure Data Sync to many common scenarios, as described in the following list. If you are using Azure Traffic Manager to route requests to a datacenter, be aware that services running at different datacenters may see different data if each datacenter has its own replica database. This is because the synchronization process may not have been completed at all sites, so updates visible in one datacenter might not have been propagated to other datacenters. For more guidance about using Azure Traffic Manager, see Chapter 9, “Maximizing Scalability, Availability, and Performance.”  Applications running on-premises access a SQL Server database also held on-premises. Services running in the cloud use a copy of the same data. Any changes made at any site must eventually be propagated to all other sites, although these updates do not have to occur immediately. This is possibly the most common scenario for using SQL Azure Data Sync, and describes the situation covered by Topology A. As an example of this scenario, consider a database holding customer and order information. An application running on-premises maintains customer information, while customers exercise a web application running in the cloud that creates new orders. The web application requires access to the customer information managed by the onpremises application, and the code running on-premises frequently queries the order details to update the status of orders when they are delivered and paid for. In this example, response time is important, but neither the application running on-premises nor the web application running in the cloud require access to completely up-to-date information; as long as the data is available at some near point in the future, that is good enough. 
Therefore, to minimize the effects of network latency and ensure that it remains responsive, the web application employs a SQL Azure database hosted in the same datacenter as the application, and the on-premises application uses a SQL Server database also located on-premises. SQL Azure Data Sync enables you to replicate and share the customer and order information between the on-premises application and the cloud by using bi-directional replication, as shown in Figure 7.

Figure 7 Sharing data between the applications running in the cloud and on-premises  You have relocated the logic for your business applications to services running in the cloud. The business applications previously used data held in a SQL Server database. The services have been distributed across different datacenters, and the SQL Server database has been migrated to SQL Azure. To minimize network latency each data center has a replica of the SQL Azure database. This is the scenario that equates to Topology B. In this example, the application logic that accesses the database has been completely relocated to the cloud, so the on-premises database has been eliminated. However, the cloud based applications all require access to the same data, and may modify this information, so each instance of the SQL Azure database must be periodically synchronized with the other instances. This replication will be bi-directional. Figure 8 shows the structure of a possible solution. In this example, any of the applications may query and modify any data. Consequently, the application logic might need to be amended to handle data that may be out of date until the next synchronization cycle.

Figure 8

Replicating data between data centers in the cloud

 You need to make your data held in an on-premises SQL Server database available to services running in the cloud. These services only query data and do not modify it; all modifications are performed by applications running on-premises. This is the simple variant of Topology D described earlier. In this scenario, the services that query the data execute remotely from your on-premises database. To minimize response times, you can replicate the data to one or more SQL Azure databases hosted in the same datacenters as each of the services. Using SQL Azure Data Sync, you can publish the data held on-premises and periodically synchronize any updates made by the on-premises applications with the databases in the cloud. Figure 9 shows an example. This configuration requires one-way replication, with the on-premises database synchronizing to a hub database in the cloud and each of the SQL Azure member databases synchronizing from the hub.

Figure 9 Publishing an on-premises database to the cloud  You have a number of applications and SQL Server databases running on-premises. However, you have migrated much of your business intelligence and reporting functionality to services running in the cloud. This functionality runs weekly, but to support your business operations it requires query access to your business data. In this scenario, all the data modifications are performed against a number of SQL Server databases hosted within the organization by applications running on-premises. These applications may be independent from each other and operate by using completely different databases. However, assume that the business intelligence functionality performs operations that span all of these databases, querying data held across them all, and generating the appropriate reports to enable a business manager to make the appropriate business decisions for the organization. Some of these

reports may involve performing intensive processing which is why these features have been moved to the cloud.

Figure 10
Aggregating and consolidating data in the cloud

You can use SQL Azure Data Sync to aggregate and consolidate data from the multiple on-premises databases into a single SQL Azure database in the cloud, possibly replicated to different datacenters as shown in Figure 10. The business intelligence service at each datacenter can then query this data locally. The synchronization process only needs to be one way, from the on-premises databases to the hub and then from the hub to each SQL Azure database; no data needs to be sent back to the on-premises database. Additionally, synchronization can be scheduled to occur weekly, starting a few hours before the business intelligence service needs to run (the exact schedule can be determined based on how much data is likely to be replicated and the time required for this replication to complete). You can use the same scheme to aggregate data from multiple offices to the cloud, the only difference being that the on-premises SQL Server databases are held at different locations.

 You have a number of services running in the cloud at different datacenters. The services at each datacenter maintain a separate, distinct subset of your organization’s data. However, each service may occasionally query any of the data, whether it is managed by services in that datacenter or any other datacenter. Additionally, applications running on-premises require access to all of your organization’s data. This situation occurs when the data is partitioned between different sites, as described in Topology C. In this scenario, a SQL Server database running on-premises holds a copy of the entire data for the organization, but each datacenter has a SQL Azure database holding just the subset of data required by the services running at that datacenter. This scheme allows the services running at a datacenter

to query and update its subset of the data, and periodically synchronize with the on-premises database. If a service needs to query data that it does not own, it can retrieve this data from the on-premises database. As described earlier, this mechanism necessitates implementing logic in each service to determine the location of the data, but if the bulk of the queries are performed against the local database in the same datacenter then the service should be responsive and maintain performance. Implementing this scheme through SQL Azure Data Sync requires defining a separate sync group for each SQL Azure database. This is because the sync dataset for each SQL Azure database will be different; the data will be partitioned by datacenter. The on-premises database will be a member common to each sync group. To simplify the scheme, you can specify that the SQL Azure database for each datacenter should act as its own synchronization hub. Figure 11 shows an implementation of this solution.
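The data-location logic that each service needs, querying the local replica for data the datacenter owns and falling back to the on-premises database otherwise, can be sketched as follows. The partition map and the database handles are hypothetical stand-ins; a real implementation would hold connections to the local SQL Azure replica and the on-premises database.

```python
# Hypothetical partition map: which datacenter owns which customer keys.
PARTITION = {
    "US-East": {"customers": range(0, 5000)},
    "EU-West": {"customers": range(5000, 10000)},
}

class DataLocator:
    """Route a query to the local SQL Azure replica when this
    datacenter owns the row, otherwise to the on-premises database."""

    def __init__(self, datacenter, local_db, on_premises_db):
        self.datacenter = datacenter
        self.local_db = local_db
        self.on_premises_db = on_premises_db

    def database_for(self, table, key):
        owned = PARTITION[self.datacenter].get(table, ())
        return self.local_db if key in owned else self.on_premises_db

locator = DataLocator("US-East", local_db="sqlazure-us-east",
                      on_premises_db="on-premises-master")
print(locator.database_for("customers", 42))     # local replica
print(locator.database_for("customers", 7001))   # on-premises fallback
```

Because most requests arriving at a datacenter concern the data it owns, the fallback path should be the exception rather than the rule.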

Figure 11 Using SQL Azure Data Sync to partition and replicate data in the cloud

Be careful to avoid introducing a sync loop if you follow this strategy; make sure that the different sync datasets do not overlap and include the same data.  Your organization comprises a head office and a number of remote branch offices. The applications running at head office require query access to all of the data managed by each of the branch offices, and may occasionally modify this data. Each branch office can query any data held in that branch office, any other branch office, or at head office, but can only modify data that relates to the branch office.
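The overlap check above can be automated at provisioning time. The sketch below models each sync dataset as a set of (table, row-filter) pairs, all hypothetical, and reports any pair of sync groups whose datasets select the same rows.

```python
from itertools import combinations

def overlapping_datasets(sync_datasets):
    """Return the pairs of sync groups whose row filters select at
    least one common (table, filter) combination."""
    overlaps = []
    for (a, rows_a), (b, rows_b) in combinations(sync_datasets.items(), 2):
        if rows_a & rows_b:
            overlaps.append((a, b))
    return overlaps

# Each sync dataset is modeled as a set of (table, branch) row filters.
sync_datasets = {
    "branch-london": {("orders", "london"), ("stock", "london")},
    "branch-paris":  {("orders", "paris"), ("stock", "paris")},
    "head-office":   {("orders", "london")},   # mistakenly overlaps
}
print(overlapping_datasets(sync_datasets))
# [('branch-london', 'head-office')]
```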

This is the scenario for Topology E. The data can be stored in a SQL Server database, and each branch office can retain a replica of this database. Other than the hub, no data is stored in the cloud. The location of the hub should be close to the most active office (possibly the head office). Synchronization should be bi-directional, and can be scheduled to occur with a suitable frequency depending on the requirement for other branch offices to see the most recent data for any other branch. If each branch only stores its own local subset of the data, you will need to create a separate sync group for each branch database with the appropriate sync dataset, as described in the previous scenario. If the sync datasets for each branch do not overlap, it is safe to use the same SQL Azure database as the synchronization hub for each sync group. Figure 12 shows a possible structure for this solution.

Figure 12
Using SQL Azure Data Sync to partition and replicate data across branch offices

 Many of your services running in the cloud perform a large number of online transaction processing (OLTP) operations, and the performance of these operations is crucial to the success of your business. To maintain throughput and minimize network latency you store the information used by these services in a SQL Azure database at the same datacenter hosting these services. Other services at the same site generate reports and analyze the information in these databases. Some of this reporting functionality involves performing very complex queries. However, you have found that performing these queries causes contention in the database that can severely impact the performance of the OLTP services. In this scenario, the solution is to replicate the database supporting the OLTP operations to another database intended for use by the reporting and analytical services, implementing a read scale-out strategy. The synchronization only needs to be performed one-way, from the OLTP database to the reporting database, and the schedule can be selected so as to synchronize data during off-peak hours. The OLTP database can perform the role of the hub. Additionally, the reporting database can

be optimized for query purposes; for example, the tables can be configured with indexes to speed the various data retrieval operations required by the analytical services. In contrast, the number of indexes in the OLTP database should be minimized to avoid the overhead associated with maintaining them during update-intensive operations. Figure 13 shows this solution.

Figure 13
Replicating a database to implement read scale-out

SQL Azure Data Sync Security Model

To use SQL Azure Data Sync to replicate and synchronize on-premises databases, you must download and install the Data Sync Client Agent on one of your on-premises servers. The SQL Azure Data Sync Server in the cloud connects to your databases hosted on-premises through the Client Agent. The Client Agent software consists of two elements:

 A Windows service that connects to the on-premises databases, and
 A graphical utility for configuring the agent key and registering on-premises databases with this service.

The communications between the SQL Azure Data Sync Server and the Client Agent are encrypted by using a key unique to each instance of the agent; you generate the agent key by using the Azure Management Portal. Additionally, all sensitive configuration information used by the Client Agent and SQL Azure Data Sync Server, including the credentials used to connect to each on-premises database and SQL Azure database, is encrypted by using the same key. The Client Agent uses an outbound HTTPS connection to communicate with the SQL Azure Data Sync Server.

The Client Agent service must be configured to run using a Windows account that has the appropriate rights to connect to each server hosting on-premises databases to be synchronized; these databases do not have to be located on the same server as the Client Agent service. When you register an on-premises database with the Client Agent you must provide the login details for accessing the SQL Server hosting the database, and this information is stored (encrypted) in the Client Agent configuration file. When the SQL Azure Data Sync Server synchronizes with an on-premises database, the Client Agent uses these details to connect to the database.

For additional security, if the Client Agent is running on a different server from your databases, you can configure the Client Agent service to encrypt the connection with each on-premises database by using SSL. This requires that you have the appropriate SSL certificates installed in each instance of SQL Server. For more information, see “Encrypting Connections to SQL Server” at

For more information about the SQL Azure Data Sync security model, see “SQL Azure Data Sync – Data Security” at

Implementing Custom Replication and Synchronization Using the Sync Framework SDK
The Azure Management Portal enables you to replicate and synchronize SQL Server and SQL Azure databases without writing any code, and it is suitable for configuring many common replication scenarios. However, there may be occasions when you require more control over the synchronization process; for example, a service may need to force a synchronization to occur at a specific time. For example, if you are caching data by using the Windows Azure .NET Libraries Caching service, using SQL Azure Data Sync may render any cached data invalid after synchronization occurs. You may need to more closely coordinate the lifetime of cached data with the synchronization frequency, perhaps arranging to flush data from the cache when synchronization occurs, to reduce the likelihood of this possibility. You might also need to implement a different conflict resolution mechanism from the policies provided by SQL Azure Data Sync, or replicate data from a source other than SQL Server. You can implement just such a customized approach to synchronization in your applications by using the Sync Framework SDK. Chapter 9 provides further details and guidance on using the Windows Azure .NET Libraries Caching service. The Sync Framework 2.1 SDK includes support for building applications that can synchronize with SQL Azure. Using this version of the Sync Framework SDK, you can write your own custom synchronization code and control the replication process directly. Using this approach, you can address a variety of scenarios that are not easy to implement by using SQL Azure Data Sync, such as building offline-capable/roaming applications. As an example, consider an application running on a mobile device such as a notebook computer used by a boiler repair man. At the start of each day, the repair man uses the application to connect to the local branch office and receive a work schedule with a list of customers’ addresses and boiler-repair jobs.
As the repairman completes each job, he records the details by using the application running on the mobile device, and this application stores the information in a database held on the same device. Between jobs, the repairman connects to the branch office again and uploads the details of the work completed so far, and the application also downloads any amendments to the work schedule for that day. In this way, the repairman can be directed to attend an urgent job prior to moving on to the previously scheduled engagement, for example. Every Friday afternoon, an administrator in the branch office generates a report detailing the work carried out by all repairmen reporting to that branch.

If the branch office database is implemented by using SQL Azure, the application running on the mobile device can use the Sync Framework SDK to connect to the datacenter hosting this database and synchronize the local database on the device with it. Figure 14 shows a simplified view of this architecture.

Figure 14 Using the Sync Framework SDK to implement custom synchronization

For more information about using the Sync Framework SDK with SQL Azure, see "SQL Server to SQL Azure Synchronization using Sync Framework 2.1" at
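The essence of such a custom synchronization session can be sketched as follows. This is a language-neutral Python illustration of the anchor-based change exchange that the Sync Framework automates for you; the store, the job keys, and the anchor bookkeeping are invented for the example and are not the SDK's actual API:

```python
# Sketch of bidirectional, anchor-based synchronization: each side sends
# only the rows changed since the last exchange, identified by a local
# change number (the "anchor"), and applies the rows received from its peer.

class Store:
    def __init__(self):
        self.clock = 0
        self.rows = {}   # key -> (change_number, value)

    def write(self, key, value):
        """A local change; stamp it with the next local change number."""
        self.clock += 1
        self.rows[key] = (self.clock, value)

    def changes_since(self, anchor):
        """Rows changed locally after the given anchor."""
        return {k: (n, v) for k, (n, v) in self.rows.items() if n > anchor}

    def apply(self, changes):
        """Apply peer rows without treating them as new local changes."""
        for key, (_, value) in changes.items():
            self.rows[key] = (0, value)   # 0 marks "already synchronized"

def synchronize(device, branch, anchors):
    device_anchor, branch_anchor = anchors
    upload = device.changes_since(device_anchor)
    download = branch.changes_since(branch_anchor)
    branch.apply(upload)
    device.apply(download)
    return device.clock, branch.clock   # anchors for the next session

# Morning: download the work schedule from the branch office.
device, branch = Store(), Store()
branch.write("job:17", "Repair boiler at 12 High St")
anchors = synchronize(device, branch, (0, 0))
assert device.rows["job:17"][1] == "Repair boiler at 12 High St"

# Between jobs: upload completed work, download schedule amendments.
device.write("job:17", "Completed 10:40")
branch.write("job:21", "URGENT: no heating at 7 Mill Lane")
anchors = synchronize(device, branch, anchors)
assert branch.rows["job:17"][1] == "Completed 10:40"
assert device.rows["job:21"][1] == "URGENT: no heating at 7 Mill Lane"
```

A real implementation must also detect conflicting updates made to the same row on both sides between sessions; the Sync Framework SDK provides this change tracking and conflict handling for you.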

Replicating and Synchronizing Data Using Service Bus Topics and Subscriptions
SQL Azure Data Sync provides an optimized mechanism for synchronizing SQL Server and SQL Azure databases, and is suitable for an environment where changes can be batched together, propagated as a group, and any conflicts resolved quickly. In a more dynamic environment, such batch processing may be inappropriate; it may be necessary to replicate changes as they are made, rather than batching them up and applying them at some regular interval. In these situations, you may need to implement a custom strategy. Fortunately, this is a well-researched area, and several common patterns are available. This section describes two generic scenarios that cover most eventualities, and summarizes how you might implement solutions for these scenarios by using Service Bus Topics and Subscriptions.

Guidelines for Using Service Bus Topics and Subscriptions

Chapter 7, "Implementing Business Logic and Message Routing", describes how you can use Service Bus Topics and Subscriptions to implement a reliable infrastructure for routing messages between sender and receiver applications. You can exploit this infrastructure as the basis for constructing a highly customizable mechanism for synchronizing data updates made across a distributed collection of data sources, as described by the following scenarios:

- Concurrent instances of applications or services running as part of your system require read and write access to a set of distributed resources. To maintain responsiveness, read operations should be performed quickly, and so the resources are replicated to reduce network latency. Write operations can occur at any time, so you must implement controlled, coordinated write access to each replica to reduce the possibility of any updates being lost.

As an example of this scenario, consider a distributed system that comprises multiple instances of an application accessing a database. The instances run in various datacenters in the cloud, and to minimize network latency a copy of the database is maintained at each datacenter. If an application instance at datacenter A needs to modify an item in the database at datacenter A, the same change must be propagated to all copies of the database residing at other datacenters. If this does not happen in a controlled manner, application instances running in different datacenters might update the local copy of the same data to different values, resulting in a conflict, as shown in Figure 15.

Figure 15 Conflict caused by uncontrolled updates to replicas of a database

In the classic distributed transaction model, you can address this problem by implementing transaction managers that coordinate with each other by using the Two-Phase Commit (2PC) protocol. However, although 2PC guarantees consistency, it does not scale well, and in a global environment based on unreliable networks, such as the cloud, it could lead to data being locked for excessively long periods, reducing the responsiveness of applications that depend on this data. Therefore you must be prepared to make some compromises between consistency and availability.

One way to approach this problem is to implement update operations as BASE transactions. BASE is an acronym for Basic Availability, Soft-state, and Eventual consistency, and is an alternative viewpoint to traditional ACID (Atomic, Consistent, Isolated, and Durable) transactions. With BASE transactions, rather than requiring complete and total consistency all of the time, it is considered sufficient for the database to be consistent eventually, as long as no changes are lost in the meantime. What this means in practice, in the example shown in Figure 15, is that an application instance running at Datacenter A can update the database in the same datacenter, and this update must be performed in such a way that it is replicated to the database at Datacenter B. If an application instance running at Datacenter B updates the same data, it must likewise be propagated to Datacenter A. The key point is that after both of the updates are complete, the result should be consistent, and both databases should reflect the most recent update.
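The following Python sketch illustrates this eventual-consistency behavior in the abstract; it is a simulation of the pattern with a last-writer-wins resolution policy, not Azure-specific code, and the item name and values are invented for the example:

```python
# Simulation of eventual consistency with last-writer-wins resolution.
# Each update carries a timestamp; a replica keeps the newest value it has seen.

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Apply an update only if it is newer than the stored one."""
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

datacenter_a = Replica("A")
datacenter_b = Replica("B")

# Concurrent updates to the same item at each datacenter.
updates = [("stock_level", 1, 42), ("stock_level", 2, 40)]

# Locally, each datacenter first sees only its own update...
datacenter_a.apply(*updates[0])
datacenter_b.apply(*updates[1])
assert datacenter_a.data != datacenter_b.data  # temporarily inconsistent

# ...then the update messages are exchanged (in any order).
for update in updates:
    datacenter_a.apply(*update)
    datacenter_b.apply(*update)

# Both replicas now agree on the most recent write.
assert datacenter_a.data == datacenter_b.data == {"stock_level": (2, 40)}
```

The intermediate assertion is the essence of "soft state": the replicas are allowed to disagree while update messages are in flight, provided they converge once every message has been delivered.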
There may be a period during which the modification made at Datacenter A has yet to be copied to Datacenter B, and during this time an application instance running at Datacenter B may see old, stale data. The application therefore has to be designed with this possibility in mind; consider, as an example, the orders processing system cited earlier in this chapter that displays the stock levels of products to customers.

Service Bus Topics and Subscriptions provide one solution for implementing controlled updates in this scenario. Application instances can query data in the local database directly, but all updates should be formatted as messages and posted to a Service Bus Topic. A receiving application located in each datacenter has its own Subscription for this Topic, with a filter that simply passes all messages through to the Subscription. Each receiver therefore receives a copy of every message, and it uses these messages to update the local database. Figure 16 shows the basic structure of this solution.

Figure 16 Routing update messages through a Service Bus Topic and Subscriptions

You can use the Enterprise Library Transient Fault Handling Block to provide a structure for posting messages reliably to a Topic and handling any transient errors that may occur. As an additional safeguard, you should configure the Topic with duplicate detection enabled, so that if the same update message is posted twice, any duplicates will be discarded. The receiver should retrieve messages by using the PeekLock receive mode. If the update operation succeeds, the receive operation can be completed; otherwise it will be abandoned, and the update message will reappear on the Subscription. The receiver can then retrieve the message again and retry the operation.
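The peek-lock protocol and duplicate detection described above can be sketched as follows. This is a language-neutral Python illustration of the complete/abandon behavior, not the actual Service Bus client API; the class, method names, and message contents are invented for the example:

```python
# Sketch of the PeekLock receive pattern with duplicate detection.
# Messages are locked while being processed; completing removes them for
# good, abandoning makes them visible again for a retry.

class Subscription:
    def __init__(self):
        self.pending = []       # messages awaiting delivery
        self.locked = {}        # message_id -> message currently locked
        self.seen_ids = set()   # duplicate detection on the Topic side

    def post(self, message_id, body):
        """Topic-side send; duplicate message ids are discarded."""
        if message_id in self.seen_ids:
            return
        self.seen_ids.add(message_id)
        self.pending.append((message_id, body))

    def peek_lock(self):
        """Receive the next message without removing it permanently."""
        if not self.pending:
            return None
        message = self.pending.pop(0)
        self.locked[message[0]] = message
        return message

    def complete(self, message_id):
        """Processing succeeded; remove the message for good."""
        del self.locked[message_id]

    def abandon(self, message_id):
        """Processing failed; make the message available again."""
        self.pending.append(self.locked.pop(message_id))

subscription = Subscription()
subscription.post("msg-1", {"product": 123, "stock": 40})
subscription.post("msg-1", {"product": 123, "stock": 40})  # duplicate, discarded

message = subscription.peek_lock()
subscription.abandon(message[0])       # simulate a failed database update
message = subscription.peek_lock()     # the message reappears for a retry
subscription.complete(message[0])      # update succeeded this time
assert subscription.pending == [] and subscription.locked == {}
```

Because completing only happens after the local database update succeeds, a receiver crash between receiving and completing simply causes redelivery, which the duplicate-safe update logic must tolerate.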

- Instances of a long-running service executing in the cloud access a single remote data source, but to maintain response times each instance has a cache of the data that it uses. The volume of data is reasonably small and is held in-memory. The data acting as the source of the cache may be updated, and when this happens each service instance should be notified so that it can maintain its cached copy.

In this scenario, the cache for each instance of the long-running service is seeded with data from the data source when it starts up. If the data source is modified, the process performing the modification can also post a message to a Service Bus Topic with the details of the change. Each service instance has a Subscription to this Topic, and uses the messages posted to it to update its local copy of the cached data, as shown in Figure 17.

Figure 17 Implementing update notifications by using a Service Bus Topic and Subscriptions
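A minimal sketch of this notification scheme follows. It is illustrative Python with a simple in-process stand-in for the Topic and Subscriptions; the key names and values are invented for the example:

```python
# Sketch of cache refresh driven by update notifications.
# Each service instance subscribes to a topic; an update to the data
# source is broadcast so every instance can refresh its in-memory cache.

class Topic:
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)

class ServiceInstance:
    def __init__(self, topic, source):
        self.cache = dict(source)          # seed the cache at start-up
        topic.subscribe(self.on_update)    # listen for change notifications

    def on_update(self, message):
        key, value = message
        self.cache[key] = value            # keep the cached copy current

data_source = {"price:123": 10.0}
topic = Topic()
instances = [ServiceInstance(topic, data_source) for _ in range(3)]

# The process that modifies the data source also posts a notification.
data_source["price:123"] = 12.5
topic.publish(("price:123", 12.5))

assert all(inst.cache["price:123"] == 12.5 for inst in instances)
```

In the real architecture the publish step travels through a Service Bus Topic, so instances in different datacenters receive the notification without the updater needing to know how many caches exist.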

How Trey Research Replicates, Distributes, and Synchronizes Data
This section is provided for information only. The Trey Research example solution is deployed to a single datacenter, and therefore is not configured to replicate and synchronize data across multiple datacenters.

Trey Research stores information about products, customers, and the orders that these customers have placed. As described in Chapter 4, "Deploying Functionality and Data to the Cloud", the orders data is maintained exclusively in the cloud by the Orders application, customer data is queried and updated both in the cloud and on-premises, and the product details are only updated on-premises, although they are queried by the Orders application in the cloud. Trey Research uses a SQL Server database to hold the customer and product details, while the Orders application accesses information held in a local SQL Azure database at each datacenter. Figure 18 highlights these elements of the Trey Research solution and the replication requirements of each type of data.

Figure 18 Replication requirements of the Trey Research solution

In this solution, the Customers data is synchronized bidirectionally, following the principles described in Topology A earlier. The Products data is synchronized one-way, from the on-premises database to the cloud; this approach implements the mechanism described by Topology D. Finally, the Orders data is replicated bidirectionally between datacenters, as described by Topology B. Figure 19 shows one possible physical implementation of these schemes based on SQL Azure Data Sync. This implementation uses three sync groups; each sync group defines a sync dataset and conflict resolution policy for each type of data. Notice that, because the Trey Research head office is located in Illinois, the SQL Azure databases located in the US North Data Center also act as the synchronization hubs; this is the nearest datacenter to the head office, so selecting this location helps to reduce the network latency when synchronizing with the on-premises database.

Figure 19 Physical implementation of data synchronization for Trey Research

This chapter has described possible approaches to replicating and synchronizing data across the different sites participating in a hybrid solution. If you are using SQL Server and SQL Azure to store information, then using SQL Azure Data Sync is a simple and natural way to implement this feature. However, if you require more control over the synchronization process or you are using a data source other than SQL Server, you can use the Sync Framework SDK to implement a custom mechanism. SQL Azure Data Sync and the Sync Framework are oriented towards performing synchronization as a scheduled task, applying the changes made since the last synchronization event occurred as a batch. Depending on the synchronization interval, this batch may include a large number of updates and an application may have been using outdated information in the intervening period. If you require a more immediate approach that propagates changes as soon as they occur, you can communicate these changes through Service Bus Topics and Subscriptions. This approach is more complex and requires you to write more code, but the resulting solutions can be much more flexible and adaptable to changing circumstances.

More Information

- "SQL Azure Data Sync Overview" at
- "Windows Azure Platform FAQs - SQL Azure Data Sync" at
- "SQL Azure Data Sync - Synchronize Data across On-Premise and Cloud (E2C)" at
- "SQL Server to SQL Azure Synchronization using Sync Framework 2.1" at

For links and information about the technologies discussed in this chapter see Chapter 1, "Introduction" of this guide.

Maximizing Scalability, Availability, and Performance
A key feature of the Windows Azure architecture is the robustness that the platform provides. A typical Azure solution is implemented as a collection of one or more roles, where each role is optimized for performing a specific category of tasks. For example, a web role is primarily useful for implementing the web front-end that provides the user interface of an application, while a worker role typically performs the underlying business logic, such as any data processing required, interacting with a database, orchestrating requests to and from other services, and so on. If a role fails, Azure can transparently start a new instance and the application can resume. However, irrespective of how robust an application is, it must also perform and respond quickly. Azure supports highly scalable services through the ability to dynamically start and stop instances of an application, enabling an Azure solution to handle an influx of requests at peak times while scaling back as demand lowers, reducing the resources consumed and the associated costs.

If you are building a commercial system, you may have a contractual obligation to provide a certain level of performance to your customers. This obligation might be specified in a service level agreement (SLA) that guarantees the response time or throughput. In this environment, it is critical that you understand the architecture of your application, the resources that it utilizes, and the tools that Windows Azure provides for building and maintaining an efficient system.

However, scalability is not the only issue that affects performance and response times. If an application running in the cloud accesses resources and databases held on your on-premises servers, bear in mind that these items are no longer directly available over your local high-speed network. Instead, the application must retrieve this data across the Internet, with its lower bandwidth, higher latency, and inherent unpredictability concerning traffic speed and throughput. This can result in increased response times for users running your applications, or reduced throughput for your services.

Of course, if your application or service is now running remotely from your organization, it will also be running remotely from your users. This might not seem so much of an issue if you are building a public-facing web site or service, because the users would have been remote prior to you moving functionality to the cloud, but this change may impact the performance for users inside your organization who were previously accessing your solution over a local area network. Additionally, the location of an application or service can affect its perceived availability if the path from the user traverses network elements that are heavily congested, and network traffic times-out as a result. Finally, in the event of a catastrophic regional outage of the Internet, or a failure at the datacenter hosting your applications and services, your users will be unable to connect.

This chapter considers issues associated with maintaining performance, reducing application response times, and ensuring that users can always access your application when you relocate functionality to the cloud. It describes solutions and good practice for addressing these concerns by using Windows Azure technologies.

Requirements and Challenges

The primary causes of extended response times and poor availability in a distributed environment are lack of resources for running applications, and network latency. Scaling can help to ensure that sufficient resources are available, but no matter how much effort you put into tuning and refining your applications, users will perceive that your system has poor performance if these applications cannot receive requests or send responses in a timely manner because the network is slow. A crucial task, therefore, is to organize your solution to minimize this network latency by making optimal use of the available bandwidth and utilizing resources as close as possible to the code and users that need them. The following sections identify some common requirements concerning scalability, availability, and performance, summarizing many of the challenges you will face when you implement solutions to meet these requirements.

Managing Elasticity in the Cloud

Description: Your system must support a varying workload in a cost-effective manner.

Many commercial systems must support a workload that can vary considerably over time. For much of the time the load may be steady, with a regular volume of requests of a predictable nature. However, there may be occasions when the load dramatically and quickly increases. These peaks may arise at expected times; for example, an accounting system may receive a large number of requests as the end of each month approaches, when users generate their month-end reports, and it may experience an especially large spike in usage towards the end of the financial year. In other types of application the load may surge unexpectedly; for example, requests to a news service may flood in if some dramatic event occurs.

The cloud is a highly scalable environment, and you can start new instances of a service to meet demand as the volume of requests increases. However, the more instances of a service you run, the more resources they occupy, and the costs associated with running your system rise accordingly. Therefore it makes economic sense to scale back the number of service instances and resources as demand for your system decreases.

How can you achieve this? One solution is to monitor the solution and start up more service instances as the number of requests arriving in a given period of time exceeds a specified threshold value. If the load increases further, you can define additional thresholds and start yet more instances. If the volume of requests later falls below these threshold values, you can terminate the extra instances. In quiescent periods, it might only be necessary to have a minimal number of service instances. However, there are a couple of problems with this solution:

- You must automate the process that starts and stops service instances in response to changes in system load and the number of requests. It is unlikely to be feasible to perform these tasks manually, as peaks and troughs in the workload may occur at any time.

- The number of requests that occur in a given interval might not be the only measure of the workload; for example, a small number of requests that each incur intensive processing might also impact performance. Consequently, the process that predicts performance and determines the necessary thresholds may need to perform calculations that measure the use of a complex mix of resources.

- Remember that starting and stopping service instances is not an instantaneous operation. It may take several minutes for Windows Azure to perform these tasks, so any performance measurements should include a predictive element based on trends over time, and initiate new service instances so that they are ready when required.

Reducing Network Latency for Accessing Cloud Applications

Description: Users should be connected to the closest available instance of your application running in the cloud to minimize network latency and reduce response times.

A cloud application may be hosted in a datacenter in one part of the world, whilst a user connecting to the application may be located in another, perhaps in a different continent. The distance between users and the applications and services they access can have a significant bearing on the response time of the system. You should adopt a strategy that minimizes this distance and reduces the associated network latency for users accessing your system. If your users are geographically dispersed, you could consider replicating your cloud applications and hosting them in datacenters that are similarly dispersed. Users could then connect to the closest available instance of the application. The question that you need to address in this scenario is how do you direct a user to the most local instance of an application?

Maximizing Availability for Cloud Applications

Description: Users should always be able to connect to the application running in the cloud.

How do you ensure that your application is always running in the cloud and that users can connect to it? Replicating the application across datacenters may be part of the solution, but consider the following issues:

- What happens if the instance of an application most local to a user fails, or no network connection can be established?

- The instance of an application most local to a user may be heavily loaded compared to a more distant instance; for example, in the afternoon in Europe, traffic to datacenters in European locations may be a lot heavier than traffic in the Far East or West Coast America. How can you balance the cost of connecting to an instance of an application running on a heavily loaded server against that of connecting to an instance running more remotely but on a lightly-loaded server?
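The threshold-based scaling approach described under "Managing Elasticity in the Cloud" above can be sketched as follows. This is an illustrative Python monitor, not Azure code; the thresholds and instance limits are made-up values, and a real monitor must additionally allow for start-up delay and trend prediction, as noted in the bullets:

```python
# Sketch of threshold-based scaling: map the observed request rate to a
# target number of service instances, within fixed minimum/maximum bounds.

MIN_INSTANCES = 2
MAX_INSTANCES = 8

# (requests-per-minute threshold, instances to run at or above it),
# checked from the highest threshold down.
THRESHOLDS = [(1000, 8), (500, 6), (200, 4)]

def target_instances(requests_per_minute):
    """Return how many instances should be running for the given load."""
    for threshold, instances in THRESHOLDS:
        if requests_per_minute >= threshold:
            return min(instances, MAX_INSTANCES)
    return MIN_INSTANCES  # quiescent period: keep a minimal number running

assert target_instances(50) == 2      # quiet: scale back to the minimum
assert target_instances(350) == 4     # first threshold exceeded
assert target_instances(5000) == 8    # peak load: run the maximum
```

Because instances take minutes to start, a production version of this loop would act on a smoothed, forward-projected request rate rather than the instantaneous one.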

Optimizing the Response Time and Throughput for Cloud Applications

Description: The response time for services running in the cloud should be as low as possible, and the throughput should be maximized.

Windows Azure is a highly scalable platform that offers high performance for applications, but available computing power alone does not guarantee that an application will be responsive. An application that is designed to function in a serial manner will not make best use of this platform, and may spend a significant period blocked waiting for slower, dependent operations to complete. The solution is to perform these operations asynchronously, and this approach has been described throughout this guide.

Aside from the design and implementation of the application logic, the key factor that governs the response time and throughput of a service is the speed with which it can access the resources it needs. Some or all of these resources might be located remotely in other datacenters or on-premises servers. Operations that access remote resources may require a connection across the Internet. To mitigate the effects of network latency and unpredictability, you can cache these resources locally to the service, but this approach leads to two obvious questions:

- What happens if a resource is updated remotely? The cached copy used by the service will be out-of-date, so how should the service detect and handle this situation?

- What happens if the service itself needs to update a resource? In this case, the cached copy used by other instances of this or other services may now be out of date.

Caching is also a useful strategy for reducing contention to shared resources, and can improve the response time for an application even if the resources that it utilizes are located locally. However, the issues associated with caching remain the same; specifically, if a local resource is modified, the cached data is now out of date. The cloud is not a magic remedy for speeding up applications that are not designed with performance and scalability in mind.

Azure and Related Technologies

Windows Azure provides a number of technologies that can help you to address the challenges presented by each of the requirements in this chapter:

- Enterprise Library Auto-scaling Application Block. You can use this application block to define performance indicators, measure performance against these indicators, and start and stop instances of services to maintain performance within acceptable parameters.

- Windows Azure Traffic Manager. You can use this service to reduce network latency by directing users to the nearest instance of an application running in the cloud. Traffic Manager can also detect whether an instance of a service has failed or is unreachable, automatically directing user requests to the next available service instance.

- Windows Azure Caching. You can use this service to cache data in the cloud and provide scalable, reliable, and shared access for multiple applications. Azure caching is primarily useful for improving the performance of web applications and services running in the cloud. However, users will frequently be invoking these web applications and services from their desktop, either by using a custom application that connects to them or by using a web browser. The data returned from a web application or service may be of a considerable size, and if the user is very distant it may take a significant time for this data to arrive at the user's desktop.

- Content Delivery Network (CDN). You can use this service to improve the response time of web applications by caching frequently accessed BLOB data closer to the users that request them. CDN enables you to cache frequently queried data at a variety of locations around the world. When a user makes a request, the data can be served from the most optimal location based on the current volume of traffic at the various Internet nodes through which the requests are routed. CDN has been described in detail in Chapter 3, "Accessing the Surveys Application", in the guide "Developing Applications for the Cloud on the Microsoft Windows Azure Platform", and is not discussed further in this book.

The following sections describe the Enterprise Library Auto-scaling Application Block, Azure Traffic Manager, and Azure caching, and provide guidance on how to use them in a number of scenarios.

Managing Elasticity in the Cloud by Using the Microsoft Enterprise Library Auto-scaling Application Block

It is possible to implement a custom solution that manages the number of deployed instances of the Web and Worker roles your application uses. However, this is far from a simple task, and so it makes sense to consider using a pre-built framework that is sufficiently flexible and configurable to meet your requirements. The Enterprise Library Auto-scaling Application Block provides such a solution. It is part of the Microsoft Enterprise Library 5.0 Integration Pack for Windows Azure, and can automatically scale your Windows Azure application or service based on rules that you define specifically for that application or service. You can use these rules to help your application or service maintain its throughput in response to changes in its workload, while at the same time minimizing and controlling hosting costs. Scaling operations typically alter the number of role instances in your application, but the block also enables you to use other scaling actions, such as throttling certain functionality within your application. This means that there are opportunities to achieve very subtle control of behavior based on a range of pre-defined and dynamically discovered conditions.

The Auto-scaling Application Block enables you to specify the following types of rules:

- Constraint rules, which enable you to set minimum and maximum values for the number of instances of a role or set of roles based on a timetable.

- Reactive rules, which allow you to adjust the number of instances of a role or set of roles based on aggregate values derived from data points collected from your Windows Azure environment or application. Throttling auto-scaling rules are a specific implementation of Reactive rules, which allow an application to modify its behavior and change its resource utilization by, for example, switching off non-essential features or gracefully degrading its UI as loading and demand increases.

For more information, visit http://msdn...aspx.
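As an illustration, a rules definition for the block combines these two rule types in a single XML document along the following lines. This sketch is indicative only: the element and attribute names follow the general shape of the Auto-scaling Application Block rules schema, but the role name (RoleA), the operand alias, and the threshold values are invented for the example, so check the block's documentation for the exact schema before using it:

```xml
<rules xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules">
  <constraintRules>
    <!-- Always keep between two and five instances of RoleA. -->
    <rule name="Default" enabled="true" rank="1">
      <actions>
        <range min="2" max="5" target="RoleA"/>
      </actions>
    </rule>
  </constraintRules>
  <reactiveRules>
    <!-- Add an instance when average CPU over the sample window exceeds 80%. -->
    <rule name="ScaleUpOnHighCpu" enabled="true">
      <when>
        <greater operand="Avg_CPU_RoleA" than="80"/>
      </when>
      <actions>
        <scale target="RoleA" by="1"/>
      </actions>
    </rule>
  </reactiveRules>
  <operands>
    <performanceCounter alias="Avg_CPU_RoleA"
                        performanceCounterName="\Processor(_Total)\% Processor Time"
                        source="RoleA"
                        timespan="00:00:45"
                        aggregate="Average"/>
  </operands>
</rules>
```

The separation into constraint rules, reactive rules, and named operands mirrors the rule types described above: the operand aggregates raw data points, the reactive rule reacts to the aggregate, and the constraint rule always bounds the result.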

How the Auto-scaling Application Block Manages Role Instances

The Auto-scaling Application Block can monitor key performance indicators in your application roles, and automatically deploy or remove instances in response. By applying a combination of Constraint and Reactive rules you can ensure that your application or service will meet demand and load requirements, even during the busiest periods, to conform to SLAs, minimize response times, and ensure availability, while still minimizing operating costs. Rules are defined in XML format in a similar way to typical .NET configuration files, and can be stored in Windows Azure BLOB storage, in a file, or in a custom store that you create.

Figure 1 shows how the number of instances of a role may change over time within the boundaries defined for the minimum and maximum number of instances.

Figure 1 Chart showing how the Auto-scaling Block can adjust the number of instances of a role

The behavior shown in Figure 1 was the result of the following configuration of the Auto-scaling Application Block:

- A default Constraint rule that is always active, with the range set to a minimum of two and a maximum of five instances.

- A Constraint rule that is active every day from 8:00AM for two hours, with the range set to a minimum of four and a maximum of six instances.

- An Operand named Avg_CPU_RoleA bound to the average value over the previous 45 seconds of the Windows performance counter \Processor(_Total)\% Processor Time.

- A Reactive rule that increases the number of deployed role instances by one when the value of the Avg_CPU_RoleA operand is greater than 80.

- A Reactive rule that decreases the number of deployed role instances by one when the value of the Avg_CPU_RoleA operand falls below 20.

The chart shows how, at point A, the block deploys a new instance of the role at 8:00AM. At point B in the chart, the active Constraint rule prevents the block from deploying any additional instances, even if the load on the application justifies this. At point C in the chart the block has reduced the number of roles to three as processor load has decreased, and at point D in the chart the block increases the number of roles to four and then to five as processor load increases.

By specifying the appropriate set of rules for the Auto-scaling Application Block you can configure automatic scaling of the number of instances of the roles in your application to meet known demand peaks and to respond automatically to dynamic changes in load and demand.

Constraint Rules

Constraint rules are used to specify the schedule during which the block will monitor your application. There is a comprehensive set of options for specifying the range of times for a Constraint rule, including fixed periods and fixed durations; daily, weekly, monthly, and yearly recurrence; and relative recurring events such as the last Friday of each month. Constraint rules do not take into account daylight saving times; they simply use the UTC offset that you specify at all times. You must define a constraint rule for each monitored role.

Reactive Rules and Actions

Reactive rules specify the conditions and actions that change the number of deployed role instances or the behavior of the application. Each rule consists of one or more operands that define how the block matches the data from monitoring points with values you specify, and one or more actions that the block will execute when the operands match the monitored values. For example, a Reactive rule might decrease the number of deployed role instances by one when the value of the Avg_CPU_RoleA operand falls below 20; at point C in the chart the block has reduced the number of roles to three as processor load has decreased.

Operands that define the data points for monitoring activity of a role can use any of the Windows performance counters, or you can create a custom operand that is specific to your own requirements, such as the number of unprocessed orders or the number of messages in a queue. The Auto-scaling Application Block reads performance counter data from the Windows Azure Diagnostics table named WADPerformanceCountersTable in Windows Azure storage. Azure does not populate this table with data from the Azure diagnostics monitor by default; you must run code in your role when it starts that configures Azure diagnostics to collect the required information and then starts the diagnostics monitor.

Reactive rule operands can use a wide range of comparison functions to define the conditions under which the related actions will occur. These functions include the typical greater than, greater than or equal, less than, less than or equal, and equal tests. You can also negate the tests using the not function, and nest functions using AND and OR logical combinations.

The Auto-scaling Application Block provides the following types of actions that you can specify in a Reactive rule:

 The scale action specifies that the block should increase or decrease the number of deployed role instances. You specify the target role using its name, or you can define a scale group in the configuration of the block that includes the names of more than one role and then target the group so that the block scales all of the roles defined in the group.

 The changeSetting action allows you to specify a new value for a setting in the application's service configuration file. The block changes this setting and the application responds immediately by reading the new setting. Code in the application can use this setting to change its behavior.

 The capability to execute a custom action that you create and deploy as an assembly. The code in the assembly can perform any appropriate action, such as sending an email notification or running a script to modify a SQL Azure database.

You can use the Auto-scaling Application Block to force your application to change its behavior automatically to meet changes in load and demand, as well as, or instead of, scaling the role if required. For example, it may switch off non-essential features or gracefully degrade its UI to better meet increased demand and load. The block can change the settings in the service configuration file to achieve this. You can also configure several aspects of the way that the block works, such as the scheduler that controls the monitoring and scaling activities, and the enforced “cool down” delay between changes to the number of deployed instances. The Auto-scaling Application Block logs events that relate to scaling actions and can send notification emails in response to the scaling of a role. For more information about the Microsoft Enterprise Library 5.0 Integration Pack for Windows Azure, see http://entlib.codeplex.com.

Guidelines for Using the Auto-scaling Application Block

The following guidelines will help you understand how you can obtain the most benefit from using the Auto-scaling Application Block:

 If you are hosting the Auto-scaling Application Block in a worker or web role in the cloud, you must make sure that only a single instance of the block is active at any one time; multiple instances may interfere and conflict with each other. The Auto-scaling Application Block provides a mechanism based on BLOB execution leases to control activation and ensure that only one instance of the block is running at a time. You enable BLOB execution leases in the block's configuration by setting the UseBlobExecutionLease option to true.

 Consider using Windows Azure BLOB Storage to hold your rules and service information. This makes it easy to update the rules and data when managing the application. Alternatively, if you want to implement special functionality for loading and updating rules, consider creating a custom rule store.

 Use the ranking for each Constraint or Reactive rule you define to control the priority where conditions overlap.

 Use scaling groups to define a set of roles that you target as one action to simplify the rules. This also makes it easy to add and remove roles from an action without needing to edit every rule.

 When using operands that read performance counter data, consider using average times of half or one hour to even out the values and provide more consistent results.
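The interaction between an operand, a comparison function, and a scale action can be sketched in pseudocode form. This is an illustrative simulation only, not the block's real API; the rule dictionary stands in for an entry in the block's XML rule store, and the operand name follows the Avg_CPU_RoleA example from the text.

```python
# Illustrative sketch (not the block's real API): evaluating a Reactive rule
# such as "scale RoleA down by one when Avg_CPU_RoleA falls below 20".

def evaluate_reactive_rule(operand_values, rule):
    """Return the instance-count delta requested by the rule's scale action."""
    value = operand_values[rule["operand"]]      # monitored data point
    matched = {
        "less":    value < rule["than"],         # analogous to the block's
        "greater": value > rule["than"],         # less/greater comparisons
    }[rule["comparison"]]
    return rule["scale_by"] if matched else 0

# Decrease the deployed instance count by one when average CPU drops below 20.
scale_down = {"operand": "Avg_CPU_RoleA", "comparison": "less",
              "than": 20, "scale_by": -1}

print(evaluate_reactive_rule({"Avg_CPU_RoleA": 15}, scale_down))  # -1
print(evaluate_reactive_rule({"Avg_CPU_RoleA": 45}, scale_down))  # 0
```

In the real block, the result of a matched rule is also bounded by the Constraint rules and subject to the configured “cool down” delay before another change is applied.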

 Remember that you must write code that initializes the Azure Diagnostics mechanism when your role starts and copies the performance counter data to the WADPerformanceCountersTable in Windows Azure storage.

 The role that hosts the Auto-scaling Application Block does not have to be the one to which the rule actions apply. In fact, the service that hosts the target roles and the service that hosts the Auto-scaling Application Block do not have to be in the same Windows Azure subscription. To allow the block to access applications, you must specify the ID of the Windows Azure subscription that hosts the target applications, and a management certificate that it uses to connect to the subscription.

 Consider enabling and disabling rules instead of deleting them from the configuration when setting up the block and when temporary changes are made to the application.

 Consider using the throttling behavior mechanism as well as scaling the number of roles. This can provide more fine-grained control of the way that the application responds to changes in load and demand, because the application can react to the changed settings to reduce its demand on the underlying infrastructure. Remember that it can take several minutes for newly deployed role instances to start handling requests, whereas changes to throttling behavior occur almost immediately.

 Regularly analyze the information that the block logs about its activities to evaluate how well the rules are meeting your initial requirements, keeping the application running within the required constraints, and meeting any SLA commitments on availability and response times.

Reducing Network Latency for Accessing Cloud Applications with Azure Traffic Manager

Windows Azure Traffic Manager is a Windows Azure service that enables you to set up request routing and load balancing based on predefined policies and configurable rules. It provides a mechanism for routing requests to multiple instances of your Azure-hosted applications and services, irrespective of the datacenter location. The applications or services could be deployed in one or more datacenters.

Traffic Manager monitors the availability and response time of each application or service you configure in a policy, on any HTTP or HTTPS port. If it detects that a service instance is offline it will not route any requests to it. However, it continues to monitor the instance at predetermined intervals (the default is 30 seconds) and will start to route requests to it again, based on the configured load balancing policy, if it again becomes available. You can change the interval between the monitoring checks. When setting the interval, be aware that Traffic Manager does not mark an instance as offline until it has failed to respond three times in succession; this means that the total time between a failure and that instance being marked as offline is three times the monitoring interval you specify.

How Traffic Manager Routes Requests

Traffic Manager is effectively a DNS resolver. When you use Traffic Manager, web browsers and services accessing your application will perform a DNS query to Traffic Manager to resolve the IP address of the endpoint to which they will connect, just as they would when connecting to any other website or to any hosted application or service.
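The failure-detection timing described above is simple arithmetic, sketched here for clarity (the three-failures threshold and 30-second default come from the text):

```python
def time_to_mark_offline(monitor_interval_s, failures_required=3):
    # Traffic Manager only marks an instance offline after it fails to
    # respond three times in succession, so the worst-case detection time
    # is the monitoring interval multiplied by that threshold.
    return monitor_interval_s * failures_required

print(time_to_mark_offline(30))  # 90 seconds with the default interval
```

Lengthening the monitoring interval therefore lengthens the window during which requests may still be routed to a failed instance.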

Traffic Manager does not perform HTTP redirection or use any other browser-based redirection technique, because this would not work with other types of requests, such as those from smart clients accessing web services exposed by your application. Instead, it acts as a DNS resolver that the client queries to obtain the IP address of the appropriate application endpoint. Traffic Manager uses the subdomain part of the URL in the query to identify the policy to apply, and returns an IP address resulting from evaluating the rules and configuration settings for that policy. Traffic Manager returns the IP address of the application instance that best satisfies the configured policy and rules. The user's web browser or the requesting service then connects to that IP address, effectively being routed based on the policy you select and the rules you define.

This means that you can offer users a single URL that is aliased to the address of your Traffic Manager instance. For example, you could map the URL you want to expose to users of your application, such as http://store.treyresearch.net, in your own or your ISP's DNS to the entry point and policy of your Traffic Manager instance. If you have named your Traffic Manager instance as treyresearch and have a policy for the Orders application named ordersapp, you would map the URL in your DNS to http://ordersapp.treyresearch.trafficmgr.com. All DNS queries for store.treyresearch.net will be passed to Traffic Manager, which will perform the required routing by returning the IP address of the appropriate target service instance. Figure 2 illustrates this scenario.

Figure 2
How Traffic Manager Performs Routing and Redirection

The default Time-to-Live (TTL) value for the DNS responses that Traffic Manager will return to clients is 300 seconds (five minutes). When this interval expires, any requests made by a client application may need to be resolved again.
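The aliasing chain above can be pictured as a toy resolver. The mapping names come from the Trey Research example in the text; the IP address is a made-up documentation address, and real Traffic Manager of course answers at the DNS protocol level rather than through an API like this:

```python
# Toy illustration of the CNAME aliasing described above: the public URL is
# aliased in your DNS to the Traffic Manager policy name, and Traffic
# Manager answers with the IP of the instance its policy selects.
cname = {"store.treyresearch.net": "ordersapp.treyresearch.trafficmgr.com"}
policy_answer = {"ordersapp.treyresearch.trafficmgr.com": "203.0.113.17"}  # example IP

def resolve(host):
    while host in cname:          # follow the alias chain
        host = cname[host]
    return policy_answer[host]    # Traffic Manager's policy-based answer

print(resolve("store.treyresearch.net"))  # 203.0.113.17
```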

However, client applications often cache the DNS entries they obtain and so will not be redirected to a different application or service instance until the cached entries expire and the new address that results can be used to connect to the service. Remember that there may also be intermediate DNS servers between clients and Traffic Manager that are likely to cache the DNS record for the time you specify. This means that the actual time before your application sees an updated DNS result can be several times the TTL value you specify. A few organizations or ISPs may implement a longer caching policy on their DNS servers, which will extend the time before the new DNS data is applied; this expiry time is typically 30 minutes. Global experiments undertaken by the team that develops Traffic Manager indicate that DNS updates typically propagate to within ten minutes in 97% of cases. You can check the global propagation of DNS entries using a site such as http://www.just-dnslookup.com.

For testing purposes you may want to reduce the TTL value, but you should use the default or longer in a production scenario.

If Traffic Manager detects that every service defined for a policy is offline, it will act as though they were all online, and continue to hand out IP addresses based on the type of policy you specify. This ensures that clients will still receive an IP address in response to a DNS query, even if the service is unreachable.

Using Monitoring Endpoints

When you configure a policy in Traffic Manager you specify the port and relative path and name for the endpoint that Traffic Manager will access to test if the application is responding. By default this is port 80 and “/” so that Traffic Manager tests the root path of the application. As long as it receives an HTTP “200 OK” response within five seconds, Traffic Manager will assume that the hosted service is online.

You can specify a different value for the relative path and name of the monitoring endpoint if required. For example, if you have a page that performs a test of all functions in the application, you can specify this as the monitoring endpoint. Hosted applications and services can be included in more than one policy in Traffic Manager, so it is a good idea to have a consistent name and location for the monitoring endpoints in all your applications and services so that the relative path and name is the same and can be used in any policy.

If you implement special monitoring pages in your applications, ensure that they can always respond within five seconds so that Traffic Manager does not mark them as being offline. Also consider the impact on the overall operation of the application of the processes you execute in the monitoring page.

Traffic Manager Policies

At the time of writing Traffic Manager offers the following three routing and load balancing policies, though more may be added in the future:

 The Performance policy redirects requests from users to the instance in the closest data center. This may not be the instance in the data center that is closest in purely geographical terms, but instead the one that provides the best average response time; this means that it takes into account the performance of the network routes between the customer and the data center. Traffic Manager also detects failed instances and does not route to these, instead choosing the next closest working instance.
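The stacking of caching delays described above can be illustrated with a deliberately simplified worst-case calculation (real DNS caches refresh independently, so this is an upper-bound sketch, not a propagation model):

```python
def worst_case_refresh_s(ttl_s, cache_layers):
    # Each caching layer (client resolver, ISP server, intermediate DNS
    # server) may serve the stale record for up to a full TTL after it
    # last fetched it, so in the worst case the delays stack -- hence the
    # text's warning that the effective delay can be "several times" the TTL.
    return ttl_s * cache_layers

print(worst_case_refresh_s(300, 3))  # 900 s for three caching layers
```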

 The Round Robin policy routes requests to each instance in turn. This policy evens out the loading on each instance, though it detects failed instances and does not route to these. However, it may not provide users with the best possible response times, because it ignores the relative locations of the user and data center.

 The Failover policy allows you to configure a prioritized list of instances, and Traffic Manager will route requests to the first one in the list that it detects is responding to requests. If that instance fails, Traffic Manager will route requests to the next instance in the list, and so on. The Failover policy is useful if you want to provide backup for an application where the backup instance(s) are not designed or configured to be in use all of the time, but only when the main instance(s) are unavailable. The Failover policy also provides an opportunity for staging and testing applications before release, during maintenance cycles, or when upgrading to a new version. You can deploy different versions of the application, such as restricted or alternative capability versions, for backup or failover use.

The following sections describe the Performance policy. The other policies are described in the section “Maximizing Availability for Cloud Applications with Azure Traffic Manager” later in this chapter.

To minimize network latency and maximize performance you will typically use the Performance policy to redirect all requests from all users to the instance in the closest data center. Traffic Manager bases its selection of the target application on availability and average response time, taking into account the geographical location of the originator of requests and the geographical location of each configured application in the policy (Traffic Manager periodically runs its own internal tests across the Internet between specific locations worldwide and each datacenter). This means that the closest instance may not always be the geographically nearest, although this will usually be the case. Keep in mind that, when using the Performance policy, Traffic Manager may select a location that is not the geographically nearest if the application in the geographically nearest datacenter has failed to respond to requests, or if global Internet traffic patterns indicate that applications running in another location will provide better responses.

General Guidelines for Using Azure Traffic Manager

The following list contains general guidelines for using Traffic Manager:

 When you name your hosted services, consider using a naming pattern that makes them easy to find and identify in the Traffic Manager list of services. Use a naming pattern that makes it easier to search for related services using part of the name, and include the datacenter name in the service name so that it is easy to identify the datacenter in which the service is hosted.

 Ensure that Traffic Manager can correctly monitor your hosted applications or services. If you specify a monitoring page instead of the default “/” root path, ensure that the page always responds with an HTTP “200 OK” status, accurately detects the state of the application, and responds well within the five seconds limit.

 To simplify management and administration, use the facility to enable and disable policies instead of adding and removing policies. Create as many policies as you need and enable only those that are currently applicable.
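The selection behavior of the three policies can be contrasted in a hypothetical sketch. Real Traffic Manager decides at the DNS level using its own latency measurements; the instance names and response times here are invented for illustration:

```python
import itertools

# Hypothetical sketch of how the three policies pick a target from the
# instances that are currently online.

def performance(instances):
    # Best average response time among online instances.
    online = [i for i in instances if i["online"]]
    return min(online, key=lambda i: i["avg_response_ms"])["name"]

def failover(instances):
    # First responding instance in the configured priority order.
    for inst in instances:
        if inst["online"]:
            return inst["name"]

def round_robin(instances):
    # Each successive request goes to the next online instance in turn.
    online = itertools.cycle([i for i in instances if i["online"]])
    return lambda: next(online)["name"]

fleet = [
    {"name": "us-north",  "online": False, "avg_response_ms": 40},
    {"name": "eu-west",   "online": True,  "avg_response_ms": 55},
    {"name": "asia-east", "online": True,  "avg_response_ms": 90},
]

print(performance(fleet))      # eu-west: fastest responding online instance
print(failover(fleet))         # eu-west: first online entry in the list
pick = round_robin(fleet)
print(pick(), pick(), pick())  # eu-west asia-east eu-west
```

Note how the failed us-north instance is skipped by all three policies, matching the failure-detection behavior described earlier.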

 Disable and enable individual services within a policy instead of adding and removing services.

 Consider using Traffic Manager as a rudimentary monitoring solution, even if you do not deploy your application in multiple datacenters or require routing to different instances. You can set up a policy that includes all of your application and service instances (including different applications) by using “/” as the monitoring endpoint. However, in this case you do not direct client requests to Traffic Manager for DNS resolution; instead, clients connect to the individual application and service instances using the specific URLs you map for each one in your DNS. You can then use the Traffic Manager Web portal to see which instances of all of the applications and services are online and offline.

Guidelines for Using Azure Traffic Manager to Reduce Network Latency

The following list contains guidelines for using Traffic Manager to reduce network latency:

 Choose the Performance policy so that users are automatically redirected to the datacenter and application instance that should provide the best response times.

 Ensure that sufficient role instances are deployed in each application to ensure adequate performance, and consider using a mechanism such as that implemented by the Auto-scaling Application Block, described earlier in this chapter, to automatically deploy additional instances when demand increases.

 Consider whether the patterns of demand in each datacenter are cyclical or time-dependent. You may be able to deploy fewer role instances at some times to minimize runtime cost. Again, consider using a mechanism such as that described earlier in this chapter to automatically deploy and remove instances when demand changes.

 Consider changing the default regional or locale-related features of the applications in each datacenter to reflect the needs of the typical user; for example, consider changing the default language or culture settings to suit the location of the datacenter, and the lists of related links or other features that are not applicable in all parts of the world. However, remember that failure in another datacenter may reroute users in other locations to this one, and so there must be the capability for users to select a language and culture.

General Limitations of Using Azure Traffic Manager

The following list identifies some of the limitations you should be aware of when using Traffic Manager:

 All of the hosted applications or services you add to a Traffic Manager policy must exist within the same Windows Azure subscription, although they can be in different namespaces.

 If all of the hosted applications or services in a Performance policy are offline or unavailable, Traffic Manager will act as though all were online and route requests based on its internal measurements of global response times relative to the location of the client making the request. This means that clients will be able to access the application as soon as it comes back online, without the delay while Traffic Manager detects this and starts redirecting users based on measured response times.

 There will always be some delay before rerouting takes place when a failure occurs or when global Internet connectivity and latency changes. Traffic Manager does not take into account real-time changes in network configuration or load. Due to the delay in DNS propagation, routing for new clients (when the old DNS data is not cached by the client) will usually occur within ten minutes; however, rerouting of a client currently using the application to a different instance may not occur for up to one hour, depending on the DNS TTL value you specify in Traffic Manager, the number of intermediate DNS servers caching the DNS record, and the type of client application in use. Clients will not be redirected to a different application or service instance until the cached DNS entry expires.

 All of the hosted applications or services must expose the same operations and use HTTP or HTTPS through the same ports so that Traffic Manager can route requests to any of them. If you expose a specific page as a monitoring endpoint, it must exist at the same location in every hosted application or service instance defined in the policy.

 Traffic Manager does not test the application for correct operation; it only tests for an HTTP “200 OK” response from the monitoring endpoint within five seconds. If you want to perform more thorough tests to confirm correct operation, you should expose a specific monitoring endpoint and specify this in the Traffic Manager policy. However, ensure that the monitoring request (which occurs by default every 30 seconds) does not unduly affect the operation of your application or service.

 You cannot add hosted applications or services that are staged; they must be running in the production environment. However, you can perform a Virtual IP Address (VIP) Swap to move hosted applications or services into production without affecting an existing Traffic Manager policy.

 Traffic Manager can affect how data management and synchronization works if you deploy data in more than one datacenter. Users may be routed to a datacenter where the data the application uses may not be fully consistent with that in another datacenter. Ensure that cross-datacenter synchronization occurs at suitable intervals, choose a synchronization pattern that minimizes the chances of data collision, and ensure that applications are designed to gracefully handle any inconsistencies. See Chapter 8, “Replicating, Distributing, and Synchronizing Data,” for more information about data synchronization patterns.

 Traffic Manager can also affect how caching works. Users may be routed to a different datacenter where the data in the Azure Cache, or in a custom repository or cache you created, may not be fully consistent with that in the other datacenter. Do not assume cached data is available, and gracefully handle any inconsistencies. The section “Optimizing the Response Time and Throughput for Cloud Applications by Using Azure Caching” later in this chapter discusses caching policies for Azure-hosted applications and services.

Maximizing Availability for Cloud Applications with Azure Traffic Manager

Traffic Manager provides two policies that you can use to maximize availability of your applications. You can use the Round Robin policy to distribute requests to all instances of the application that are currently responding to requests (instances that have not failed). Alternatively, you can use the Failover policy to ensure that a backup instance of the application will receive requests should the primary instance fail.

These two policies provide opportunities for two very different approaches to maximizing availability:

 The Round Robin policy enables you to scale out your application across datacenters to achieve maximum availability. Requests will go to an instance in a datacenter that is online, and the more instances you configure the lower the average load on each one will be. You can maximize availability and scale out simply by adding more role instances. There is little reason to use the Round Robin policy if you only deploy your application to one datacenter. You are charged for each instance in every datacenter, so you should consider carefully how many role instances to deploy in each datacenter.

 The Failover policy enables you to deploy reserve or backup instances of your application that only receive client requests when all of the instances higher in the priority list are offline. Unlike the Performance and Round Robin policies, the Failover policy is suitable for use when you deploy to only one datacenter as well as when deploying the application to multiple datacenters, because it allows you to define reserve or backup instances of your application. When the primary instance fails, requests go to an instance that is online, which may be different from the main, highest-priority instance. Again, you are charged for each instance in every datacenter, and you should consider carefully how many role instances to deploy in each datacenter.

A typical scenario for using the Failover policy is to configure an appropriate priority order for one or more instances of the same or different versions of the application so that the maximum number of features and the widest set of capabilities are always available, even if services and systems that the application depends on should fail. For example, you may deploy a backup version that can still accept customer orders when the order processing system is unavailable, but stores them securely and informs the customer of a delay. By arranging the priority order so that reserve instances in different datacenters are used as well as reduced-functionality backup versions, you can offer the maximum availability and functionality at all times. Figure 3 shows an example of this approach.

Figure 3
Using the Failover policy to achieve maximum availability and functionality

Guidelines for Using Azure Traffic Manager to Maximize Availability

The following list contains guidelines for using Traffic Manager to maximize availability. Also see the sections “General Guidelines for Using Azure Traffic Manager” and “General Limitations of Using Azure Traffic Manager” earlier in this chapter.

 Choose the Round Robin policy if you want to distribute requests evenly between all instances of the application. This policy is typically not suitable when you deploy the application in datacenters that are geographically widely separated, as it will cause undue traffic across longer distances. It may also cause problems if you are synchronizing data between datacenters, because the data in every datacenter may not be consistent between requests from the same client. If you use the Round Robin policy, ensure that all of the hosted application or service instances are identical so that users have the same experience irrespective of the instance to which they are routed.

 Choose the Failover policy if you want requests to go to one instance of your application or service, and only change to another instance if the first one fails. Traffic Manager chooses the application nearest the top of the list you configure that is online. This policy is typically suited to scenarios where you want to provide backup applications or services, and it is also useful for taking services offline during maintenance, testing, and upgrade periods. If you use the Failover policy, consider including hosted application or service instances that provide limited or different functionality that will work when services or systems the application depends on are unavailable, in order to maximize the user's experience as far as possible.

 Consider using the Failover or Round Robin policy when you want to perform maintenance tasks, update hosted instances, and perform testing of deployed instances. You can enable and disable individual hosted applications and services within the policy as required so that requests are directed only to those that are enabled.

 Because a number of instances of the application will be lightly loaded or not servicing client requests (depending on the policy you choose), consider using a mechanism such as that provided by the Auto-scaling Application Block, described earlier in this chapter, to manage the number of role instances of each application deployed in each datacenter and so minimize runtime cost.

 If all of the hosted applications or services in a Round Robin policy are offline or unavailable, Traffic Manager will act as though all were online and continue to route requests to each configured instance in turn. If all of the hosted applications or services in a Failover policy are offline or unavailable, Traffic Manager will act as though the first one in the configured list is online and route all requests to this instance.

For more information, see “Windows Azure Traffic Manager” on MSDN.

Optimizing the Response Time and Throughput for Cloud Applications by Using Azure Caching

Azure caching provides a scalable, reliable mechanism that enables you to retain frequently-used data physically close to your applications and services. Azure caching runs in the cloud, and to gain maximum benefit you should implement Azure caching in the same datacenter that hosts your code. Azure caching is primarily intended for code running in the cloud, such as web and worker roles. Do not use it for code that executes on-premises, as it will not improve the performance of your applications in this environment; in fact, it will likely slow your system down due to the network latency involved in connecting to the cache in the cloud. If you deploy services to more than one datacenter, you should create a separate cache in each datacenter, and each service should access only the co-located cache.

By caching data in the same datacenter that hosts your code, you can reduce the overhead associated with repeatedly accessing remote data, eliminate the network latency associated with remote data access, and improve the response times for applications referencing this data.

However, caching does not come without cost. Caching data means creating one or more copies of that data, and as soon as you make these copies you have concerns about what happens if you modify the data. Any updates have to be replicated across all copies, but it can take time for these updates to ripple around the system; this is especially true on the Internet, where you also have to consider the possibility of network errors causing updates to fail to propagate quickly. So, although caching can improve the response time for many operations, it can also lead to issues of consistency if two instances of an item of data are not identical. Consequently, applications that use caching effectively should be designed to cope with data that may be stale but that eventually becomes consistent.

Provisioning and Sizing an Azure Cache

Azure caching is a service that is maintained and managed by Microsoft; you do not have to install any additional software or implement any infrastructure within your organization to use it. An administrator can easily provision an instance of the caching service by using the Azure Management Portal. The portal enables an administrator to select the location of the Caching service and specify the resources available to the cache. You indicate the resources to provision by selecting the size of the cache. Azure caching supports a number of predefined cache sizes, ranging from 128MB up to 4GB. Note that the bigger the cache size, the greater the monthly charge.

The size of the cache also determines a number of other quotas. The purpose of these quotas is to ensure fair usage of resources, and the bigger the cache, the more of these resources are available. The cache size imposes limits on the number of cache reads and writes per hour, the available bandwidth per hour, and the number of concurrent connections. For example, if you select a 128MB cache, you can currently perform up to 40,000 cache reads and writes per hour, occupying up to 1,400MB of bandwidth, spanning up to 10 concurrent connections. If you select a 4GB cache, you can perform up to 12,800,000 reads and writes per hour, occupying 44,800MB of bandwidth, and supporting 160 concurrent connections. The values specified here are correct at the time of writing, but these quotas are constantly under review and may be revised in the future. You can find information about the current production quota limits and prices in the Caching FAQ on MSDN.

You can create as many caches as your applications require, and they can be of different sizes. However, for maximum cost-effectiveness you should carefully estimate the amount of cache memory your applications will require and the volume of activity that they will generate. To assess the amount of memory needed, follow these steps for each type of object that you will be storing:

1. Measure the size in bytes of a typical instance of the object (serialize objects by using the NetDataContractSerializer class and write them to a file).
2. Add a small overhead (approximately 1%) to allow for the metadata that the Caching service associates with each object.
3. Round this value up to the next nearest multiple of 1024 (the cache is allocated to objects in 1KB chunks).
4. Multiply this value by the maximum number of instances that you anticipate caching.

Sum the results for each type of object to obtain the required cache size.

You can change the size of a cache after you have created it without stopping and restarting any of your services. However, the change is not immediate, and you can only request to resize the cache once a day. You can increase the size of a cache without losing objects from the cache, but if you reduce the cache size some objects may be evicted. Note that the Azure Management Portal enables you to monitor the current and peak sizes of the cache.

You should also consider the lifetime of objects in the cache. By default, objects expire after 48 hours and will then be removed. You cannot change this expiration period for the cache as a whole, although you can override it on an object-by-object basis when you store objects in the cache. However, be aware that the longer an object resides in the cache, the more likely it is to become inconsistent with the original data source (referred to as the “authoritative” source) from which it was populated.
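The four sizing steps can be expressed as a small calculator. The object sizes and instance counts below are example figures, not from the guide:

```python
import math

# Sketch of the sizing steps from the text: measure, add ~1% metadata
# overhead, round up to the 1KB allocation unit, multiply by the expected
# number of cached instances, then sum across all object types.

def required_cache_bytes(object_types):
    """object_types: list of (size_in_bytes, max_cached_instances) pairs."""
    total = 0
    for size_bytes, max_instances in object_types:
        with_overhead = size_bytes * 1.01                     # step 2
        per_object = math.ceil(with_overhead / 1024) * 1024   # step 3: 1KB chunks
        total += per_object * max_instances                   # step 4
    return total                                              # summed result

# e.g. 2,500-byte customer objects (10,000 cached) and 700-byte order
# summaries (50,000 cached) -- hypothetical workload.
print(required_cache_bytes([(2500, 10_000), (700, 50_000)]))  # 81920000
```

A result of roughly 82MB here would comfortably fit the smallest 128MB cache, but remember to check the read/write, bandwidth, and connection quotas as well before settling on a size.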

You should also carefully consider the other elements of the cache quota, and if necessary select a bigger cache size even if you do not require the volume of memory indicated. For example, if you exceed the number of cache reads and writes permitted in an hour, applications will receive an exception the next time they attempt to access the cache. Similarly, if you exceed the bandwidth quota, any subsequent read and write operations will fail with an exception. If you reach the connection limit, your applications will not be able to establish any new connections until one or more existing connections are closed.

Azure caching enables an application to pool connections. When connection pooling is configured, the same pool of connections is shared for a single application instance. Using connection pooling can improve the performance of applications that use the caching service, but you should consider how this affects your total connection requirements based on the number of instances of your application that may be running concurrently. For more information, see “Understanding and Managing Connections (Windows Azure AppFabric Caching)” at http://msdn...aspx.

Implementing Services that Share Data by Using Azure Caching

The Azure caching service implements an in-memory cache, located on a cache server in an Azure datacenter, which can be shared by multiple concurrent services. It is ideal for holding immutable or slowly changing data, such as a product catalog or a list of customer addresses. Copying this data from a database into a shared cache can help to reduce the load on the database as well as improving the response time of the applications that use this data. It also assists you in building highly scalable and resilient services that exhibit reduced affinity with the applications that invoke them.

For example, an application may call an operation in a service implemented as an Azure web role to retrieve information about a specific customer. If this information is copied to a shared cache, the same application can make subsequent requests to query and maintain this customer information without depending on these requests being directed to the same instance of the Azure web role. If the number of client requests increases over time, new instances of the web role can be started up to handle them, and the system scales easily. Figure 4 illustrates this architecture, where an on-premises application employs the services exposed by instances of a web role. The on-premises application can be directed to any instance of the web role, and the same cached data is still available.

Each instance of the Azure caching service belongs to a Service Bus namespace, and you can create multiple namespaces, each with its own cache, in the same datacenter. You are not restricted to using a single cache in an application; each cache can have a different size, so you can partition your data according to a cache profile. For example, small objects that are accessed infrequently can be held in a 128MB cache, while larger objects that are accessed constantly by a large number of concurrent instances of your applications can be held in a 2GB or 4GB cache.

Figure 4
Using Azure Caching to provide scalability

Web applications access a shared cache by using the Azure caching APIs. These APIs are optimized to support the “Cache-aside” pattern: a web application can query the cache to find an object, and if the object is present it can be retrieved. If the object is not currently stored in the cache, the web application can retrieve the data for the object from the authoritative store (such as a SQL Azure database), construct the object using this data, and then store it in the cache.

By default, the Azure caching APIs store data in the cache by using the NetDataContractSerializer class. Therefore, to store an object in the cache, it must be serializable. However, you can also implement your own custom serializer if you need to optimize the serialized form of your objects in the cache. For more information, see “How to: Use a Custom Serializer with Windows Azure AppFabric Caching” at http://msdn...aspx.

You can specify which cache to connect to either programmatically or by providing the connection information in a dataCacheClient section in the web application configuration file. You can generate the necessary client configuration information from the Azure Management portal and then copy this information directly into the configuration file.
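The cache-aside pattern described above can be sketched in language-neutral form. This is a minimal illustration, not the actual Azure caching API; the `FakeCache` class and function names below are hypothetical stand-ins.

```python
# Illustrative sketch of the cache-aside pattern (not the Azure caching API).
# "cache" stands in for a shared cache client; "load_from_store" stands in
# for a query against the authoritative store (such as a SQL Azure database).

def get_with_cache_aside(cache, key, load_from_store):
    """Return the object for 'key', consulting the cache first."""
    obj = cache.get(key)          # 1. Query the cache for the object.
    if obj is not None:
        return obj                # 2. Present in the cache: use it directly.
    obj = load_from_store(key)    # 3. Cache miss: fetch from the store.
    cache.put(key, obj)           # 4. Store the constructed object in cache.
    return obj

# A trivial in-memory stand-in for the shared cache, for demonstration only.
class FakeCache:
    def __init__(self):
        self._items = {}
    def get(self, key):
        return self._items.get(key)
    def put(self, key, value):
        self._items[key] = value

cache = FakeCache()
loads = []
def load_customer(key):
    loads.append(key)             # record how often the store is queried
    return {"id": key, "name": "Customer " + key}

get_with_cache_aside(cache, "42", load_customer)  # miss: hits the store
get_with_cache_aside(cache, "42", load_customer)  # hit: served from cache
print(len(loads))  # the authoritative store was queried only once
```

The same shape applies whichever client library you use: the cache is consulted first, and the authoritative store is only touched on a miss.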

For more information about configuring web applications to use Azure caching, see “How to: Configure a Cache Client using the Application Configuration file (Windows Azure AppFabric)” at http://msdn...aspx. For detailed information on using the Azure caching APIs, see “Developing Cache Clients (Windows Azure AppFabric Caching)” at http://msdn...aspx.

As described in the section “Provisioning and Sizing an Azure Cache,” an administrator specifies the resources available for caching data when the cache is created. If memory starts to run short, the Azure caching service will evict data on a least-recently-used basis. Additionally, cached objects can have their own independent lifetimes; a developer can specify a period for caching an object when it is stored, and when this time expires the object is removed and its resources reclaimed. Note that your applications are not notified when an object is evicted from the cache or expires, so be warned.

Updating Cached Data

Web applications can modify the objects held in the cache. However, if the cache is being shared, be aware that more than one instance of an application might attempt to update the same information; this is identical to the update problem that you meet in any shared data scenario. To assist with this situation, the Azure caching APIs support two modes for updating cached data:

• Optimistic, with versioning. All cached objects can have an associated version number. When a web application updates the data for an object it has retrieved from the cache, it can check the version number of the object in the cache prior to storing the changes. If the version number is the same, it can store the data; otherwise, the web application should assume that another instance has already modified this object. It can then fetch the new data and resolve the conflict using whatever logic is appropriate to the business (maybe present the user with both versions of the data and ask which one to save). When an object is updated, it should be assigned a new unique version number when it is returned to the cache. The optimistic approach is primarily useful if the chances of a collision are small, and although simple in theory the implementation inevitably involves a degree of complexity to handle the possible race conditions that can occur.

• Pessimistic, with locking. The pessimistic approach takes the counter view; it assumes that more than one instance of a web application is highly likely to try and simultaneously modify the same data, so it locks the data when it is retrieved from the cache to prevent this situation from arising. When the object is updated and returned to the cache, the lock is released. If a web application attempts to retrieve and lock an object that is already locked by another instance, it will fail (it will not be blocked). The web application can then back off for a short period and try again. Although this approach guarantees the consistency of the cached data, ideally any update operations should be very quick and the corresponding locks of a very short duration, to minimize the possibility of collisions and to avoid web applications having to wait for extended periods, as this can impact the response time and throughput of the application.
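The two update modes can be illustrated with a simulated cache. This is a sketch of the concepts only; the real Azure caching APIs expose them through version-aware Get/Put overloads and GetAndLock/PutAndUnlock methods, and the Python class below is entirely hypothetical.

```python
import itertools
import time

class VersionedCache:
    """Simulated shared cache supporting optimistic (versioned) and
    pessimistic (locked) updates. Illustrative only; not the Azure API."""
    def __init__(self):
        self._items = {}     # key -> (value, version)
        self._locks = {}     # key -> lock expiry time
        self._versions = itertools.count(1)

    # --- Optimistic, with versioning ---
    def get(self, key):
        return self._items[key]          # returns (value, version)

    def put_if_unchanged(self, key, value, expected_version):
        _, current = self._items[key]
        if current != expected_version:
            return False                 # another instance updated it first
        # Assign a new unique version number on every successful update.
        self._items[key] = (value, next(self._versions))
        return True

    def put(self, key, value):
        self._items[key] = (value, next(self._versions))

    # --- Pessimistic, with locking ---
    def get_and_lock(self, key, lock_duration):
        expiry = self._locks.get(key)
        if expiry is not None and expiry > time.monotonic():
            raise RuntimeError("object is locked")  # fails, does not block
        self._locks[key] = time.monotonic() + lock_duration
        return self._items[key][0]

    def put_and_unlock(self, key, value):
        self._items[key] = (value, next(self._versions))
        self._locks.pop(key, None)       # releasing the lock

cache = VersionedCache()
cache.put("stock", 100)

# Optimistic: two readers; the second writer detects the conflict.
v1, ver_a = cache.get("stock")
v2, ver_b = cache.get("stock")
assert cache.put_if_unchanged("stock", v1 - 1, ver_a) is True
assert cache.put_if_unchanged("stock", v2 - 5, ver_b) is False  # conflict

# Pessimistic: a second locker fails immediately instead of blocking.
value = cache.get_and_lock("stock", lock_duration=5.0)
try:
    cache.get_and_lock("stock", lock_duration=5.0)
except RuntimeError:
    pass  # back off for a short period and try again
cache.put_and_unlock("stock", value - 1)
```

Note how the failed optimistic write and the failed lock attempt are both immediate; it is the application's responsibility to retry or to resolve the conflict.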

An application specifies a duration for the lock when it retrieves data. You should stipulate a period that will give your application sufficient time to perform the update operation, but not so long as to cause other instances to wait for access to this data for an excessive time. If the application does not release the lock within this period, the lock is released by the Azure caching service. This feature is intended to prevent an application that has crashed from locking data indefinitely.

Both approaches are illustrated later in this chapter, in the section “Guidelines for Using Azure Caching.”

In many cases it may be preferable to update the authoritative data source directly and remove the data from the cache in the same datacenter as the web application. The next time this data is required, a consistent version of the data will be read from the authoritative data store and copied to the cache. The purpose of removing the data from cache rather than simply updating it is to reduce the chance of losing changes made by other instances of the web application at other sites, or of introducing inconsistencies if the update to the authoritative data store is unsuccessful.

If you are hosting multiple instances of the Azure caching service across different datacenters, the update problem becomes even more acute, as you may need to synchronize a cache not only with the authoritative data source but also with the other caches located at different sites. Synchronization necessarily generates network traffic, which in turn is subject to the vagaries and latency issues of the Internet. Here, too, it may be preferable to update the authoritative data source directly, remove the data from the cache in the same datacenter as the web application, and let the cached data at each remaining site expire naturally, when it can be repopulated from the authoritative data source. The logic that updates the authoritative data source should be composed in such a way as to minimize the chances of overwriting a modification made by another instance of the application, perhaps by including version information in the data and verifying that this version number has not changed when the update is performed. If you require a more immediate update across sites, you can implement a custom solution by using Service Bus Topics, using a variation on the patterns described in the section “Replicating and Synchronizing Data Using Service Bus Topics and Subscriptions” in Chapter 8, “Replicating, Distributing, and Synchronizing Data.”

Incorporating Azure caching into a web application must be a conscious design decision, as it directly affects the update logic of the application. To some extent you can hide this complexity and aid reusability by building the caching layer as a library and abstracting the code that retrieves and updates cached data, but you must still implement this logic somewhere. The nature of the Azure caching service means that it is essential that you incorporate comprehensive exception-handling and recovery logic into your web applications. For example:

• A race condition exists in the simple implementation of the cache-aside pattern which can cause two instances of a web application to attempt to add the same data to the cache. Depending on how you implement the logic that stores data in the cache, this can cause one instance to overwrite the data previously added by another (if you use the Put method of the cache), or it can cause the instance to fail with a DataCacheException exception (if you use the Add method of the cache). For more information, see the topic “Add an Object to a Cache” at http://msdn...aspx.

• Be prepared to catch exceptions when attempting to retrieve locked data, and implement an appropriate mechanism to retry the read operation after an appropriate interval, perhaps by using the Transient Fault Handling Application Block. You should treat a failure to retrieve data from the Azure caching service as a cache miss and allow the web application to retrieve the item from the authoritative data source instead.

• If your application exceeds the quotas associated with the cache size, your application may no longer be able to connect to the cache. You should log these exceptions, and if they occur frequently an administrator should consider increasing the size of the cache.

Implementing a Local Cache

As well as the shared cache, you can configure a web application to create its own local cache. The purpose of a local cache is to optimize repeated read requests to cached data. A local cache resides in the memory of the application, and as such it is faster to access. It operates in tandem with the shared cache. If a local cache is enabled, when an application requests an object, the caching client library first checks to see whether this object is available locally. If it is, a reference to this object is returned immediately without contacting the shared cache. If the object is not found in the local cache, the caching client library fetches the data from the shared cache and then stores a copy of this object in the local cache. The application then references the object from the local cache. If the object is not found in the shared cache either, then the application must retrieve the object from the authoritative data source instead.

Once an item has been cached locally, the application can access it in the same manner as a local object, reading its values and modifying the data that it contains. However, any changes made are applied only to this local copy and are not automatically propagated back to the shared cache; if you need to modify this data, make the changes to the shared cache as well as the local copy. Similarly, if the item is modified in the shared cache, the application will not see these changes until the local version of the item is removed from the local cache; after an item has been copied to the local cache, the local version of this item will continue to be used until it expires or is evicted from the cache.

A local cache is not subject to the same resource quotas as the shared cache managed by the Azure caching service. You specify the size of the cache when it is created, and the storage for the cache is allocated directly from the memory available to the application.

Although using a local cache can dramatically improve the response time for an application, the local cache can very quickly become inconsistent if the information in the shared cache changes. For this reason you should configure the local cache to only store objects for a short time before refreshing them. Ideally, you should only use the local cache as a mechanism to optimize repeated reads to the same data. If the data held in a shared cache is highly dynamic and consistency is important, you may find it preferable to use the shared cache rather than a local cache.

You enable local caching by populating the LocalCacheProperties member of the DataCacheFactoryConfiguration object that you use to manage your cache client configuration. You can perform this task programmatically, or declaratively in the application configuration file. You can specify the size of the cache and the default expiration period for cached items. For more information, see the topic “Enable Windows Server AppFabric Local Cache (XML)” at http://msdn...aspx.
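The local cache behavior described above can be sketched as a small two-level lookup, where the short local lifetime bounds how stale a locally cached copy can become. This is illustrative only; the real client is configured through LocalCacheProperties rather than code like this, and the class below and its fake clock are hypothetical.

```python
import time

class LocalCacheClient:
    """Sketch of a cache client with a local (in-memory) cache in front of
    a shared cache. Illustrative only; not the Azure caching library."""
    def __init__(self, shared_cache, local_ttl_seconds, clock=time.monotonic):
        self._shared = shared_cache    # dict standing in for the shared cache
        self._local = {}               # key -> (object, local expiry time)
        self._ttl = local_ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # served locally, no remote call
        obj = self._shared.get(key)    # fetch from the shared cache
        if obj is not None:
            # Keep a local copy, but only for a short time, so that changes
            # made to the shared cache are eventually picked up.
            self._local[key] = (obj, self._clock() + self._ttl)
        return obj

# Demonstration with a fake clock so expiry is deterministic.
now = [0.0]
shared = {"price": 10}
client = LocalCacheClient(shared, local_ttl_seconds=300, clock=lambda: now[0])

client.get("price")          # fetched from the shared cache, copied locally
shared["price"] = 12         # the shared cache is updated elsewhere
stale = client.get("price")  # still 10: the local copy has not expired yet
now[0] = 301.0               # the local lifetime elapses
fresh = client.get("price")  # 12: refreshed from the shared cache
print(stale, fresh)
```

The window in which `stale` is returned is exactly the local expiration period, which is why a short lifetime is recommended for dynamic data.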

Caching Web Application Session State

The Azure caching service enables you to use the DistributedCacheSessionStateStoreProvider session state provider for ASP.NET web applications and services. With this provider, you can store session state in an Azure cache, and this cache can be shared by multiple instances of an application. Using an Azure cache to hold session state gives you several advantages:

• It can share session state among different instances of applications, providing improved scalability.
• It supports concurrent access to the same session state data for multiple readers and a single writer.
• It can use compression to save memory and bandwidth.

This provider is transparent to the web application; you access session state programmatically through the Session object, employing the same code as an ordinary web application, and you do not need to use the Azure caching APIs directly. You can configure this provider either through code or by using the application configuration file; you can generate the configuration information by using the Azure Management portal and copy this information directly into the configuration file. For more information, see “How to: Configure the AppFabric Session State Provider for ASP.NET” at http://msdn...aspx.

Caching HTML Output

The DistributedCacheOutputCacheProvider class available for the Azure caching service implements output caching for web applications. Using this provider, you can build scalable web applications that take advantage of the Azure caching service for caching the HTTP responses that they generate for web pages returned to client applications. This provider has several advantages over the regular per-process output cache, including:

• You can cache larger amounts of output data.
• The output cache is stored externally from the worker process running the web application, and it is not lost if the web application is restarted.
• It can use compression to save memory and bandwidth.

Like the DistributedCacheSessionStateStoreProvider class, the DistributedCacheOutputCacheProvider class is completely transparent; once the provider is configured, if your application previously employed output caching you do not have to make any changes to your code. Again, you can generate the configuration information by using the Azure Management portal and copy this information directly into the application configuration file. For more information, see “How to: Configure the AppFabric Output Cache Provider for ASP.NET” at http://msdn...aspx.

Guidelines for Using Azure Caching

The following sections describe some common scenarios for using Azure caching:

• Web applications and services running in the cloud require fast access to data. This data is queried frequently, but rarely modified. This is the ideal case for using Azure caching.

In this simple scenario, a series of web applications, implemented as web roles hosted in different datacenters, require access to customer addresses held in a SQL Server database located on-premises within an organization. The web applications only query customer addresses, although other applications running on-premises may make the occasional modification. To reduce the network latency associated with making repeated requests for the same data across the Internet, the information used by the web applications is cached by using the Azure caching service. You can configure the Azure caching service running in the same datacenter that hosts the web applications and services (implemented as web or worker roles). Each datacenter contains a separate instance of the caching service, and web applications only access the cache located in the same datacenter. Each web application or service can implement the cache-aside pattern when it needs a data item; it can attempt to retrieve the item from cache, and if it is not found then it can be retrieved from the authoritative data store and copied to cache. The same data may be required by all instances of the web applications and services. The expiration period for each item in the cache is set to 24 hours, so any changes made to this data will eventually be visible to the web applications. Figure 5 shows a possible structure for this solution.

To take best advantage of Azure caching, only cache data that is unlikely to change frequently. If the data is static, and the cache is configured with sufficient memory, you can specify a long expiration period for each item as it is cached. Objects representing data that might change in the authoritative data store should be cached with a shorter expiration time; the period should reflect the frequency with which the data may be modified and the urgency of the application to access the most recently updated information.
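The effect of the per-item expiration period in this scenario can be sketched as follows. This is illustrative only; with the Azure APIs you would pass a timeout when storing the object, and here a hypothetical cache class and a fake clock stand in for the service and the passage of time.

```python
import time

class ExpiringCache:
    """Minimal sketch of a cache with per-item expiration periods."""
    def __init__(self, clock=time.monotonic):
        self._items = {}     # key -> (value, expiry time)
        self._clock = clock

    def put(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[1] <= self._clock():
            self._items.pop(key, None)
            return None      # expired items are treated as a cache miss
        return entry[0]

DAY = 24 * 60 * 60
now = [0.0]
cache = ExpiringCache(clock=lambda: now[0])

# Customer addresses change rarely, so cache them for 24 hours; occasional
# on-premises modifications eventually become visible to the web roles.
cache.put("address:42", "1 Main Street", ttl_seconds=DAY)
assert cache.get("address:42") == "1 Main Street"

now[0] = DAY + 1             # a day passes; the item has expired
assert cache.get("address:42") is None  # the next read repopulates it
```

Choosing the lifetime is the trade-off described above: a longer period reduces traffic to the authoritative store, a shorter one reduces the window in which stale data is served.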

Figure 5
Caching static data to reduce network latency in web applications

• Web applications and services running in the cloud require fast access to shared data, and they may frequently modify this data.

This scenario is a potentially complex extension of the previous case, depending on the location of the data, the distribution of the web applications and services, the frequency of the updates, and the urgency with which the updates must be visible to these web applications and services.

In this example, the on-premises Customer database is the authoritative data store. In the most straightforward case, when a web application needs to update an object, it retrieves the item from cache (first fetching it from the authoritative data store if necessary), modifies this item in cache, and makes the corresponding change to the authoritative data store. However, this is a two-step process, and to minimize the chances of a race condition occurring all updates must follow the same order in which they perform these steps. Depending on the likelihood of a conflicting update being made by a concurrent instance of the application, you can implement either the optimistic or pessimistic strategy for updating the cache, as described in the earlier section “Updating Cached Data”. Figure 6 depicts this process.

Figure 6
Updating data in the cache and the authoritative data store

The approach just described is suitable for a solution contained within a single datacenter. However, if your web applications and services span multiple sites, you should implement a cache at each datacenter. Now updates have to be carefully synchronized and coordinated across datacenters, and all copies of the cached data modified. As described in the section “Updating Cached Data”, you have at least two options available for tackling this problem:

◦ Only update the authoritative data store and remove the item from the cache in the datacenter hosting the web application. The data cached at each other datacenter will eventually expire and be removed from cache. The next time this data is required, it will be retrieved from the authoritative store and used to repopulate the cache.

◦ Implement a custom solution by using Service Bus Topics similar to that described in the section “Replicating and Synchronizing Data Using Service Bus Topics and Subscriptions” in Chapter 8.

The first option is clearly the simpler of the two, but the various caches may be inconsistent with each other and the authoritative data source for some time, depending on the expiration period applied to the cached data.

Additionally, the web applications and services may employ a local SQL Azure database rather than accessing an on-premises installation of SQL Server. These SQL Azure databases can be replicated and synchronized in each datacenter as described in Chapter 8. This strategy reduces the network latency associated with retrieving the data when populating the cache, at the cost of yet more complexity if web applications modify this data; they update the local SQL Azure database, and these updates must be synchronized with the SQL Azure databases at the other datacenters. Depending on how frequently this synchronization occurs, cached data at the other datacenters could be out of date for some considerable time: not only does the data have to expire in the cache, it also has to wait for the database synchronization to occur. Tuning the interval between database synchronization events, as well as setting the expiration period of cached data, is crucial if a web application must minimize the amount of time it is prepared to handle stale information. Figure 7 shows an example of this solution with replicated instances of SQL Azure acting as the authoritative data store.

Figure 7
Propagating updates between Azure caches and replicated data stores

Implementing a custom solution based on Service Bus Topics and Subscriptions is more complex, but results in the updates being synchronized more quickly across datacenters. In this scenario, a web application retrieves and caches data in the prescribed manner. Performing a data update involves the following sequence of tasks:

a. The web application updates the authoritative data store (the on-premises database).
b. If the database update was successful, the web application duplicates this modification to the data held in-cache.
c. The web application posts an update message to a Service Bus Topic.
d. Receiver applications running at each datacenter subscribe to this topic and retrieve the update messages.
e. The receiver application applies the update to the local cache (if the data is currently cached locally).

The receiver at the datacenter hosting the web application that initiated the update will also receive the update message. You might include additional metadata in the update message with the details of the web application that posted the message; the receiver can then include logic to prevent it updating the cache unnecessarily. Figure 8 illustrates one possible implementation of this approach.
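The update sequence (a–e) can be sketched as a simple publish/subscribe simulation. This is illustrative only; a real implementation would use Service Bus Topics and Subscriptions as described in Chapter 8, and the class and field names here are hypothetical.

```python
class Topic:
    """Stand-in for a Service Bus Topic with one subscriber per datacenter."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, receiver):
        self.subscribers.append(receiver)
    def post(self, message):
        for receiver in self.subscribers:
            receiver.on_message(message)

class Receiver:
    """Receiver application running in one datacenter."""
    def __init__(self, datacenter, local_cache):
        self.datacenter = datacenter
        self.cache = local_cache
    def on_message(self, message):
        # Metadata identifies the datacenter that posted the update, so the
        # receiver can avoid re-applying an update its own web application
        # has already made in step b.
        if message["origin"] == self.datacenter:
            return
        if message["key"] in self.cache:   # only if currently cached locally
            self.cache[message["key"]] = message["value"]

def update(topic, datacenter, store, cache, key, value):
    store[key] = value    # a. update the authoritative data store
    cache[key] = value    # b. on success, duplicate the change in-cache
    topic.post({"origin": datacenter, "key": key, "value": value})  # c.

store = {"price": 10}
us_cache, eu_cache = {"price": 10}, {"price": 10}
topic = Topic()
topic.subscribe(Receiver("US", us_cache))  # d. receivers subscribe
topic.subscribe(Receiver("EU", eu_cache))  # e. happens in on_message

update(topic, "US", store, us_cache, "price", 12)
print(store["price"], us_cache["price"], eu_cache["price"])  # 12 12 12
```

In a real deployment the post and the receive are asynchronous, so the remote caches converge shortly after the update rather than instantaneously as in this in-process simulation.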

Note that in this example, the authoritative data source is located on-premises, but this model can be extended to use replicated instances of SQL Azure at each datacenter. In this case, each receiver application could update the local instance of SQL Azure as well as modifying the data in-cache.

Figure 8
Propagating updates between Azure caches and an authoritative data store

• A web application requires fast access to the data that it uses, but this data is not shared by other instances of the web application.

In this scenario, the data is effectively private to an instance of the web application and can be cached in-memory in the application itself rather than by using the Azure caching service. You can implement this solution in many ways, but the most convenient and extensible approach is probably to use the Azure caching APIs, and to configure the application as an Azure cache client with the local cache properties enabled. This configuration was described in the section “Implementing a Local Cache” earlier in this chapter. As the data is not shared, updates are straightforward; the application can simply modify the data in the authoritative data source and, if successful, apply the same changes to the cached data in-memory. This approach also enables you to quickly switch to using a shared cache without modifying your code; you simply reconfigure the data cache client settings.

In a variant on this scenario, two or more instances of a web application cache data locally, and the data referenced by each instance overlaps. It does not matter whether the web application instances are located in the same or different datacenters; caching the data in-memory in each instance makes them independent from each other for query purposes. Some updates may occur, however, and in this case, if one instance modifies the data and writes the changes to the authoritative data store, the cached data at the other instance is now out-of-date. This is essentially the same problem addressed earlier with multiple shared caches. Therefore the cached objects should be configured with a suitable expiration period, to enable them to be refreshed appropriately and prevent them from becoming too stale.

Data in the local cache expires in a manner similar to that of a shared cache, except that the default expiration period is much shorter—5 minutes. Unlike a shared cache, you can modify the default expiration time for a local cache, and you can still override this period as you cache each object; but beware of attempting to retain objects in a local cache for a lengthy period, as they might become stale very quickly.

If applications can handle stale data for a short while, then using a local cache configured with a suitable lifetime for cached objects may be appropriate. If immediate synchronization between instances of the web application is important, then caching data in-memory is not the most suitable approach and it is best to stick with a shared cache.

Caching data in-memory in the web application speeds access to this data, but as described earlier it can reduce the consistency of your solution and also introduce vulnerabilities; if the application fails, the cached data is lost. You should also be aware of the increased memory requirements of your applications, and the potential charges associated with hosting applications with an increased memory footprint, especially if they attempt to cache large amounts of data. You should carefully balance these benefits and concerns against the requirements of your application. Figure 9 shows an example of this scenario with several instances of a web application using a local cache to optimize access to data held in an on-premises database.

Figure 9
Implementing local in-memory caching

• You have built a web application hosted by using an Azure web role. The web application needs to cache session state information, but this information must not be pinned to a particular instance of the web role.

One of the primary reasons for using Azure to host web applications is the scalability that this platform provides. As the load on a web application increases, you can use a technology such as the Enterprise Library Auto-scaling Application Block to automatically start new instances and distribute the work more evenly (for more information, see the section “Managing Elasticity in the Cloud by Using the Enterprise Library Auto-scaling Application Block” earlier in this chapter). Additionally, the reliability features of Azure ensure that an application can be restarted if it should fail for some reason.

However, these scalability and reliability features assume that a client using the web application can connect to any instance of the web application. If the web application uses sessions and stores session state information, then you must avoid tying this state information to a specific instance of the application. For example, if you are using ASP.NET to build a web application, session state is stored in-memory within the web application by default. In this model, a client connecting to different instances of the web application at different times may see different session state information each time it connects. This phenomenon is probably undesirable in a scalable web application; furthermore, the session state information must not be lost if the web application fails and the web role is restarted. The DistributedCacheSessionStateStoreProvider session state provider enables you to configure a web application to store session state out-of-process, using the Azure caching service as the storage mechanism.

Different instances of the web application can then access the same session state information. This provider is transparent to the web application, which can continue to use the same ASP.NET syntax to access session state information. For more information, refer to the section “Caching Web Application Session State” earlier in this chapter.

Note that while the DistributedCacheSessionStateStoreProvider session state provider enables instances of web applications running in the same datacenter to share session data, each datacenter should be configured with its own shared cache. This may have an impact on your solution if you are using a technology such as Traffic Manager to direct client requests to web applications. For example, the Traffic Manager Round Robin policy, and some edge-cases of the Performance policy, may redirect a client to a different datacenter holding different session state for some requests, as shown in Figure 10.

Figure 10
Client requests obtaining different session state from different datacenters

• You have built a web application that performs complex processing and rendering of results based on a series of query parameters. You need to improve the response time of various pages served by this application, and avoid repeatedly performing the same processing when different clients request pages.

This is the classic scenario for implementing output caching. The output generated by an ASP.NET web page can be cached at the server hosting the web application, and subsequent requests to access the same page with the same query parameters can be satisfied by responding with the cached version of the page rather than generating a new response. For more information about how ASP.NET output caching works and how to use it, see “Caching ASP.NET Pages” at http://msdn...aspx.

However, the default output cache provider supplied with ASP.NET operates on a per-server basis. In the Azure environment a web server equates to an instance of a web role, so using the default output cache provider causes each web role instance to generate its own cached output. If the volume of cached output is large and each cached page is the result of heavy, processor-intensive work, each web role instance may end up duplicating this processing and storing a copy of the same data. The DistributedCacheOutputCacheProvider class enables web roles to store the output cache in a shared Azure cache, removing this duplication of space and effort. As with the session cache, you should create and use a separate shared cache for caching output data at each datacenter. For more information, see the section "Caching HTML Output" earlier in this chapter.

Limitations of Azure Caching
The features provided by the Azure caching service are very similar to those of Windows Server AppFabric Caching; they share the same programmatic APIs and configuration methods. However, the Azure implementation provides only a subset of the features available to the Windows Server version. Currently, the Azure caching service has the following limitations compared to Windows Server AppFabric Caching:

- It does not support notifications.
- You cannot change the default expiration period for a shared cache. Objects expire in the shared cache after 48 hours, and you cannot modify this setting for the cache as a whole. However, you can override this value on an object-by-object basis as you store them in the cache. (Note that you can modify the default expiration period for a local cache; the default duration is 5 minutes.)
- You cannot disable the eviction policy. If there is insufficient space in the cache for a new object, older objects will be evicted following the least-recently-used principle.
- You cannot partition cached data. A cache implemented by using the Azure caching service cannot contain user-defined named regions.
- You cannot add tags to cached data to assist with object searches.
- Your applications are not informed if an object expires or is evicted from the cache.

For more information about the differences between Azure caching and Windows Server AppFabric Caching, see the topic "Differences Between Caching On-Premises and in the Cloud" at http://msdn.

You should also note that an Azure cache automatically benefits from the reliability and scalability features of Azure; you do not have to manage these aspects yourself. Consequently, the high availability features of Windows Server AppFabric Caching are not available because they are not required in the Azure environment.

Guidelines for Securing Azure Caching
You access an Azure cache through an instance of the Azure caching service. You generate an instance of the Azure caching service by using the Azure Management portal and specifying a Service Bus namespace. The Azure caching service is deployed to a datacenter in the cloud, and has endpoints with URLs that are based on the name of the Service Bus namespace with the suffix ".cache.windows.net". Your applications connect to the caching service using these URLs.
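Although you cannot change the 48 hour default expiration period for the shared cache as a whole, the object-by-object override described above is straightforward to apply when you store an item. The following sketch assumes that a DataCacheFactory has already been configured for your cache, and that product is an object you wish to cache:

```csharp
// Sketch: the cache factory is assumed to be configured elsewhere
// (for example, from settings in the service configuration file).
DataCacheFactory cacheFactory = new DataCacheFactory();
DataCache cache = cacheFactory.GetDefaultCache();

// The overload of Put that takes a TimeSpan overrides the default
// expiration period: this object expires 30 minutes after storage.
cache.Put("ProductStore/Product/99", product, TimeSpan.FromMinutes(30));
```

If you omit the TimeSpan argument, the object is subject to the cache-wide default expiration period instead.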

The caching service exposes endpoints that support basic HTTP connectivity (via port 22233) as well as SSL (via port 22243). As with other items in the Service Bus, all connection requests from an application to the Azure caching service are authenticated and authorized by using ACS, as described in Chapter 5, "Authenticating Users and Authorizing Requests." To connect to the caching service, an application must provide the appropriate authentication token. Regardless of whether you are accessing the cache using a basic HTTP connection or an SSL connection, the caching service always enforces message level security, and the authentication token is also used to encrypt and sign messages.

Utilizing an Azure cache from code running externally to the datacenter provides little benefit and is not a supported scenario. Only web applications and services running in the cloud need to be provided with the authentication token for connecting to the Azure caching service, as these are the only items that should connect to the cache.

How Trey Research Maximizes Scalability, Availability, and Performance
The Trey Research example application that is available in conjunction with this guide demonstrates many of the scenarios and implementations described in this chapter. You can download the example code by following the links at http://wag.codeplex.com. If you have not already done so, now would be a good time to download the example so that you can examine the code it contains as you read through this section of the guide.

Trey Research makes use of the Microsoft Enterprise Library Auto-scaling Application Block to start and stop instances of web and worker roles as the load from users dictates. The Trey Research solution is capable of spanning multiple datacenters, and customers placing orders may be located anywhere in the world. Trey Research uses Traffic Manager to help reduce the network latency of user requests by directing these requests at the datacenter closest to the users' locations. Traffic Manager is also employed to detect the status of the web and worker roles running at each datacenter, and if necessary transparently redirect users' requests to a different datacenter if a role becomes unresponsive or unavailable. To keep the system responsive, the solution also uses Azure caching at each datacenter to reduce the time required to access frequently-queried data, such as the Products catalog. The following sections describe how Trey Research implements these technologies to optimize the performance of the solution.

Note: The Trey Research example solution is deployed to a single datacenter and is not configured to support auto-scaling or redirection. The sections "Configuring Scalability to Match Customer Demand" and "Directing Customer Requests to the Nearest Datacenter" are provided for information only.

Configuring Scalability to Match Customer Demand
Initially, Trey Research deployed the Orders application and made it available to a small but statistically significant number of users, evenly spread across an area representing the geographical location of their expected customers. By gathering statistical usage information based on this deployment, Trey Research identified how the system functions at specific times in the working day, and identified periods during which performance suffered due to an increased load from customers. Trey Research noticed that:
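To illustrate how an application presents the authentication token when it connects, the following sketch builds a DataCacheFactoryConfiguration for a hypothetical cache. The host name, port, and key values are placeholders, and the SecureString conversion reflects the pattern commonly used with the DataCacheSecurity class:

```csharp
// Sketch only: the endpoint and key values are placeholders.
string hostName = "yourcachenamespace.cache.windows.net";
int cachePort = 22233;  // basic HTTP endpoint; use 22243 for SSL
string acsKey = "...";  // authentication token from the portal

// The caching API accepts the ACS token as a SecureString.
var secureAcsKey = new SecureString();
foreach (char c in acsKey)
{
    secureAcsKey.AppendChar(c);
}
secureAcsKey.MakeReadOnly();

var factoryConfig = new DataCacheFactoryConfiguration
{
    Servers = new[]
        { new DataCacheServerEndpoint(hostName, cachePort) },
    SecurityProperties = new DataCacheSecurity(secureAcsKey)
};

var cacheFactory = new DataCacheFactory(factoryConfig);
DataCache cache = cacheFactory.GetDefaultCache();
```

The same token is used by the caching service to encrypt and sign messages, so it must be kept out of any code or configuration that is distributed outside the cloud.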

- Many customers tended to place their orders towards the end of the working day (between 15:00 and 20:00 Central time, allowing for the spread of users across the United States), with an especially large number placed after 17:00.
- On Fridays, the peak loading tended to start and finish 2 hours earlier (13:00 to 18:00 Central time).
- At weekends, very few customers placed orders.

To cater for these peaks and troughs in demand, Trey Research decided to implement the Enterprise Library Auto-scaling Application Block as follows:

- The developers configured constraint rules for the application to start an additional 3 instances of the web and worker roles at each datacenter at 14:45 Central time (it can take 10 or 15 minutes for the new instances to become available), and to shut these instances down at 20:00 between Monday and Thursday. At 16:45, the application was configured to start a further two instances of each role. On Fridays, the times at which the extra instances start and stop are two hours earlier, so the additional instances are shut down at 18:00. At the weekend, the system is constrained to allow a maximum of 4 instances of each role at each datacenter, and any additional instances above this number are shut down to reduce running costs.
- To handle unexpected demand, Trey Research also configured reactive rules to monitor the number of customer requests, and start additional instances if the average CPU usage for a web role exceeds 85% for 10 or more minutes, up to a maximum of 12 instances per datacenter. When the CPU usage drops below 50%, instances are shut down, subject to a minimum of 2 instances per datacenter.
- When the system is quiescent or lightly loaded, the system returns to its baseline configuration comprising two instances of each role per datacenter.

Hosting the Auto-scaling Application Block
The Auto-scaling Application Block monitors the performance of one or more roles, starting and stopping these roles as specified by the various constraint rules and reactive rules. To perform this work, the Auto-scaling Application Block uses an Autoscaler object (defined in the Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling namespace), and you must arrange for this object to start running when your application executes. The Trey Research solution performs this task in the Run method in the WorkerRole class, and stops the Autoscaler in the OnStop method.

The Auto-scaling Application Block also generates diagnostic information and captures data points indicating the work that it has performed, and this information can be written to BLOB storage for later download and analysis. For more information about the information collected, see "Autoscaling Application Block Logging" at http://msdn.

C#
public class WorkerRole : RoleEntryPoint
{
    private Autoscaler autoscaler;

    public override void Run()
    {
        this.autoscaler = EnterpriseLibraryContainer.Current.
            GetInstance<Autoscaler>();
        this.autoscaler.Start();
        ...
    }

    public override void OnStop()
    {
        this.autoscaler.Stop();
        ...
    }
    ...
}

The information about which roles to monitor, the storage account to use for storing diagnostic data, and the location of the rules defining the behavior of the Autoscaler object are specified in the <serviceModel> section of the service information store file. This file was uploaded to BLOB storage and stored in the BLOB specified by the <serviceInformationStores> section of the app.config file for the worker role. For more information and an example of defining the service model for the Auto-scaling Application Block, see Chapter 5, "Making Tailspin Surveys More Elastic," of the Developer's Guide to the Enterprise Library 5.0 Integration Pack for Windows Azure.

Defining the Auto-scaling Rules
The Trey Research solution implements a combination of constraint rules and reactive rules. The constraint rules specify the schedule at which the Autoscaler object should arrange for instances of the web and worker roles to be created, according to the criteria specified earlier. The reactive rules start further instances of the web role (and stop them), according to the load on the CPU exhibited by the web role. As an initial starting point, Trey Research defined the following set of rules:

XML
<?xml version="1.0" encoding="utf-8" ?>
<rules xmlns=
    "http://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules">
  <constraintRules>
    <rule name="Weekday" enabled="true" rank="10">
      <timetable startTime="00:00:00" duration="23:59:59"
                 utcOffset="-06:00">
        <weekly days=
          "Monday Tuesday Wednesday Thursday Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="2" max="12"/>
        <range target="Orders.Website" min="2" max="12"/>
      </actions>
    </rule>
    <rule name="Weekend" enabled="true" rank="10">
      <timetable startTime="00:00:00" duration="23:59:59"
                 utcOffset="-06:00">
        <weekly days="Sunday Saturday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="2" max="4"/>
        <range target="Orders.Website" min="2" max="4"/>
      </actions>
    </rule>
    <rule name="MondayToThursday" enabled="true" rank="2">
      <timetable startTime="14:45:00" duration="05:15:00"
                 utcOffset="-06:00">
        <weekly days="Monday Tuesday Wednesday Thursday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="5" max="5"/>
        <range target="Orders.Website" min="5" max="5"/>
      </actions>
    </rule>
    <rule name="MondayToThursdayPeak" enabled="true" rank="3">
      <timetable startTime="16:45:00" duration="03:15:00"
                 utcOffset="-06:00">
        <weekly days="Monday Tuesday Wednesday Thursday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="7" max="7"/>
        <range target="Orders.Website" min="7" max="7"/>
      </actions>
    </rule>
    <rule name="Friday" enabled="true" rank="2">
      <timetable startTime="12:45:00" duration="05:15:00"
                 utcOffset="-06:00">
        <weekly days="Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="5" max="5"/>
        <range target="Orders.Website" min="5" max="5"/>
      </actions>
    </rule>
    <rule name="FridayPeak" enabled="true" rank="3">
      <timetable startTime="14:45:00" duration="03:15:00"
                 utcOffset="-06:00">
        <weekly days="Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="7" max="7"/>
        <range target="Orders.Website" min="7" max="7"/>
      </actions>
    </rule>
  </constraintRules>
  <reactiveRules>
    <rule name="HotCPU" enabled="true" rank="4">
      <when>
        <greater operand="CPU" than="85" />
      </when>
      <actions>
        <scale target="Orders.Website" by="1"/>
      </actions>
    </rule>
    <rule name="CoolCPU" enabled="true" rank="4">
      <when>
        <less operand="CPU" than="50" />
      </when>
      <actions>
        <scale target="Orders.Website" by="-1"/>
      </actions>
    </rule>
  </reactiveRules>
  <operands>
    <performanceCounter alias="CPU"
      source="AccidentReporting_WebRole"
      performanceCounterName=
        "\Processor(_Total)\% Processor Time"
      timespan="00:10:00" aggregate="Average"/>
  </operands>
</rules>

The rules were uploaded to BLOB storage, to the BLOB specified by the <rulesStores> section of the app.config file for the worker role. The CPU operand referenced by the reactive rules calculates the average processor utilization over a 10 minute period, using the \Processor(_Total)\% Processor Time performance counter. The web role was modified to collect this information by using the following code shown in bold in the StartDiagnostics method (called from the OnStart method) in the file WebRole.cs:

C#
public class WebRole : RoleEntryPoint
{
    ...
    private static void StartDiagnostics()
    {
        var config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();
        ...
        config.PerformanceCounters.DataSources.Add(
            new PerformanceCounterConfiguration
            {
                CounterSpecifier =
                    @"\Processor(_Total)\% Processor Time",
                SampleRate = TimeSpan.FromSeconds(30)
            });
        config.PerformanceCounters.ScheduledTransferPeriod =
            TimeSpan.FromMinutes(1);

        DiagnosticMonitor.Start(
            "DiagnosticsConnectionString", config);
    }
    ...
}

The performance counter data is written to the WADPerformanceCountersTable table in Azure Table storage. The web role must be running in full trust mode to successfully write to this table.

Using automatic scaling is an iterative process. The configuration defined by the team at Trey Research is kept constantly under review, and the performance of the solution is continuously monitored as the pattern of use by customers evolves. The operators may modify the configuration and auto-scaling rules to change the times and circumstances under which additional instances are created and destroyed in the future.

Directing Customer Requests to the Nearest Datacenter
Using the Auto-scaling Application Block helps to ensure that sufficient instances of the Orders application are running to service the volume of customers at any given point in time. With this in mind, the operations staff at Trey Research decided to use Traffic Manager simply to direct customers' requests to the nearest responsive datacenter, by implementing the Performance policy.

The selection of the Performance policy was very easy. Indeed, the solution spans multiple datacenters, so the Failover policy is not really suitable. Additionally, implementing the Round Robin policy may be detrimental to customers, as they might be redirected to a more distant datacenter, incurring additional network latency and impacting the response time of the application; the Round Robin policy may conceivably direct two consecutive requests from the same customer to different datacenters, possibly leading to confusion if the data cached at each datacenter is not completely consistent. Selecting the Performance policy brings the advantages of reducing the network latency while ensuring that requests from the same customer are much more likely to be directed towards the same datacenter, so the Round Robin policy is unnecessary. In this way, Traffic Manager directs customers to the nearest datacenter, and the Auto-scaling Application Block ensures that an appropriate number of instances of the Orders application will be available at each datacenter to facilitate good performance.

The operations staff configured the policy to include the DNS addresses of the Orders application deployed to the US North and US South datacenters, monitoring the Home page of the web application to determine availability.

The operations staff selected the DNS prefix ordersapp, and mapped the resulting Traffic Manager address to the public address used by customers, store.treyresearch.net. In this scheme, a customer connecting to the URL http://store.treyresearch.net is transparently redirected via Traffic Manager to the Orders application running in the US North datacenter or the US South datacenter.

Note: The Trey Research example application provided in the download is only deployed to a single datacenter.

Caching Data to Reduce the Response Time of Customer Requests
The Orders application uses several types of data, including product information, order details, and customer information. Product information is updated very infrequently; it is also reasonably static, although the product catalog can comprise a large number of items. However, the product catalog is queried by every instance of the Orders application. Order information is relatively dynamic, and customer details are accessed infrequently compared to other data (only when the customer logs in). Furthermore, the same customer and order information tends not to be required by concurrent instances of the Orders application. For these reasons, the developers at Trey Research elected to cache the product catalog by using a shared Azure cache in each datacenter, while they reasoned that caching order and customer details would bring few benefits.

Defining and Configuring the Azure Cache
A cache called TreyResearchCacheSuffix was created at each datacenter, where Suffix identifies the name of the datacenter hosting the cache. For example, the cache in the US North datacenter is called TreyResearchCacheUSN, while the cache in the US South datacenter is named TreyResearchCacheUSS. This scheme ensures that each cache is given a unique and systematic name. The developers estimated that a 128MB cache would be sufficient.

The web application, implemented in the Orders.Website project, provides the configuration parameters for accessing the cache in the service configuration file for the solution (ServiceConfiguration.cscfg):

XML
<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="Orders.Azure" ...>
  ...
  <Role name="Orders.Website">
    ...
    <ConfigurationSettings>
      ...
      <Setting name="CacheHost"
               value="TreyResearchCache.cache.windows.net" />
      <Setting name="CachePort" value="22233" />
      <Setting name="CacheAcsKey" value="<data omitted>" />
      <Setting name="IsLocalCacheEnabled" value="false" />
      <Setting name="LocalCacheObjectCount" value="1000" />
      <Setting name="LocalCacheTtlValue" value="60" />
      <Setting name="LocalCacheSync" value="TimeoutBased" />
      ...
    </ConfigurationSettings>
    ...
  </Role>
</ServiceConfiguration>

Retrieving and Managing Data in the Orders Application
The Orders application contains a set of classes for storing and retrieving each of the types of information it references. These classes are located in the DataStores folder. For example, the ProductStore class in the ProductStore.cs file provides methods for querying products. These methods are defined by the IProductStore interface:

C#
public interface IProductStore
{
    IEnumerable<Product> FindAll();
    Product FindOne(int productId);
}

The FindAll method returns a list of all available products from the SQL Azure database, and the FindOne method fetches the product with the specified product id. In a similar vein, the OrderStore class implements the IOrdersStore interface, which defines methods for retrieving and managing orders. None of these classes implement any form of caching.

Implementing Caching Functionality for the Products Catalog
The Orders application contains a generic library of classes for caching data in the DataStores\Caching folder. This library is capable of caching any of the data items defined by the types in the DataStores folder, but for the reasons described earlier caching is only implemented for the ProductStore class. The DataStores\Caching folder contains the following types:

- ICachingStrategy. This is an interface that exposes a property named DefaultTimeout and a method called Get, as follows:

C#
public interface ICachingStrategy
{
    TimeSpan DefaultTimeout { get; set; }

    object Get<T>(string key, Func<T> fallbackAction,
        TimeSpan? timeout) where T : class;
}

This interface abstracts the caching functionality implemented by the library. The key parameter of the Get method specifies the unique identifier of the object to retrieve from the cache. If the object is not currently cached, the fallbackAction parameter specifies a delegate for a method to run to retrieve the corresponding data, and the timeout parameter specifies the lifetime of the object if it is added to the cache. If the timeout parameter is null, an implementation of this interface should set the lifetime of the object to the value specified by the DefaultTimeout property.

- CachingStrategy. This class implements the ICachingStrategy interface. The constructor for this class uses the Azure Caching APIs to connect to the Azure cache using the values provided as parameters (the web application retrieves these values from the service configuration file, and invokes the constructor by using the Unity framework as described later in this chapter, in the section "Instantiating and Using a ProductsStoreWithCache Object"). The Get method of the CachingStrategy class queries the cache using the specified key, and if the object is found it is returned. If the object is not found, the method invokes the delegate to retrieve the missing data and adds it to the cache, specifying either the timeout value provided as the parameter to the Get method (if it is not null), or the default timeout value for the CachingStrategy object. The following code sample shows the important elements of this class:

C#
public class CachingStrategy : ICachingStrategy, IDisposable
{
    private readonly RetryPolicy cacheRetryPolicy;
    private DataCacheFactory cacheFactory;
    private TimeSpan defaultTimeout = TimeSpan.FromMinutes(10);

    public CachingStrategy(string host, int port,
        string key, bool isLocalCacheEnabled,
        long objectCount, int ttlValue, string sync)
    {
        ...
        DataCacheFactoryConfiguration factoryConfig =
            new DataCacheFactoryConfiguration();

        // Populate the factoryConfig object with
        // configuration data specified by the parameters
        // to the constructor
        ...
        this.cacheFactory = new DataCacheFactory(factoryConfig);
        this.cacheRetryPolicy = RetryPolicyFactory.
            GetDefaultAzureCachingRetryPolicy();
        ...
    }

    public TimeSpan DefaultTimeout
    {
        get { return this.defaultTimeout; }
        set { this.defaultTimeout = value; }
    }

    public virtual object Get<T>(string key,
        Func<T> fallbackAction, TimeSpan? timeout)
        where T : class
    {
        ...
        try
        {
            var dataCache = this.cacheFactory.GetDefaultCache();
            var cachedObject = this.cacheRetryPolicy.ExecuteAction(
                () => dataCache.Get(key));
            if (cachedObject != null)
            {
                ...
                return cachedObject;
            }

            var objectToBeCached = fallbackAction();
            if (objectToBeCached != null)
            {
                try
                {
                    dataCache.Put(key, objectToBeCached,
                        timeout != null ?
                            timeout.Value : this.DefaultTimeout);
                    return objectToBeCached;
                }
                ...
            }
            ...
        }
        ...
    }
    ...
}

Notice that this class traps transient errors that may occur when fetching an item from the cache, by using the Transient Fault Handling Application Block. The static GetDefaultAzureCachingRetryPolicy method of the RetryPolicyFactory class referenced in the constructor returns the default policy for determining whether a caching exception is transient, and provides a construct for indicating how such an exception should be handled. The Get method of the CachingStrategy class invokes the ExecuteAction method of the retry policy object, passing it a delegate that attempts to read the requested data from the cache (this is the code that may exhibit a transient error, and if necessary it will be retried following the settings defined by the retry policy object). The default policy implements the "Fixed Interval Retry Strategy" defined by the Transient Fault Handling Block, and the web.config file configures this strategy to retry the failing operation up to 6 times with a 5 second delay between attempts.
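The retry strategy itself is defined in configuration rather than in code. The following web.config fragment is an illustrative sketch of a fixed interval strategy matching the 6 retries and 5 second delay described above; the exact section and attribute names should be checked against your version of the Transient Fault Handling Application Block:

```xml
<!-- Sketch only: verify section and attribute names against your
     version of the Transient Fault Handling Application Block. -->
<RetryPolicyConfiguration
    defaultAzureCachingRetryStrategy="Fixed Interval Retry Strategy">
  <fixedInterval name="Fixed Interval Retry Strategy"
                 maxRetryCount="6"
                 retryInterval="00:00:05" />
</RetryPolicyConfiguration>
```

Keeping the retry count and interval in configuration allows operations staff to tune the behavior without recompiling the application.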

If a non-transient error occurs, or an attempt to read the cache fails after 6 attempts, the exception handling strategy in the Get method (omitted from the code above) is to return the value from the underlying store, retrieved by calling the fallbackAction delegate. For more information about how the Trey Research solution uses the Transient Fault Handling Block, see the section "Implementing Reliable Data Access with the Transient Fault Handling Block" in Chapter 5 of this guide.

- ProductStoreWithCache. This type provides the caching version of the ProductStore class. It implements the IProductStore interface, but internally employs an ICachingStrategy object to fetch data in the FindAll and FindOne methods, as highlighted by the following code sample:

C#
public class ProductStoreWithCache : IProductStore
{
    private readonly IProductStore productStore;
    private readonly ICachingStrategy cachingStrategy;

    public ProductStoreWithCache(
        IProductStore productStore,
        ICachingStrategy cachingStrategy)
    {
        this.productStore = productStore;
        this.cachingStrategy = cachingStrategy;
    }

    public IEnumerable<Product> FindAll()
    {
        ...
        return (IEnumerable<Product>)this.cachingStrategy.Get(
            "ProductStore/FindAll",
            () => this.productStore.FindAll(),
            TimeSpan.FromMinutes(10));
    }

    public Product FindOne(int productId)
    {
        ...
        return (Product)this.cachingStrategy.Get(
            string.Format(
                "ProductStore/Product/{0}", productId),
            () => this.productStore.FindOne(productId),
            TimeSpan.FromMinutes(10));
    }
}

Instantiating and Using a ProductsStoreWithCache Object
The Orders application creates a ProductStoreWithCache object by using the Unity Application Block. The static ContainerBootstrapper class contains the following code:

C#
public static class ContainerBootstrapper
{
    public static void RegisterTypes(
        IUnityContainer container)
    {
        ...
        var host = CloudConfiguration.
            GetConfigurationSetting("CacheHost", null);
        var port = Convert.ToInt32(CloudConfiguration.
            GetConfigurationSetting("CachePort", null));
        var cacheAcsKey = CloudConfiguration.
            GetConfigurationSetting("CacheAcsKey", null);
        var isLocalCacheEnabled = Convert.ToBoolean(
            CloudConfiguration.GetConfigurationSetting(
                "IsLocalCacheEnabled", null));
        var localCacheObjectCount = Convert.ToInt64(
            CloudConfiguration.GetConfigurationSetting(
                "LocalCacheObjectCount", null));
        var localCacheTtlValue = Convert.ToInt32(
            CloudConfiguration.GetConfigurationSetting(
                "LocalCacheTtlValue", null));
        var localCacheSync = CloudConfiguration.
            GetConfigurationSetting("LocalCacheSync", null);

        // To change the caching strategy, replace the
        // CachingStrategy class with the strategy that
        // you want to use instead.
        container.RegisterType<ICachingStrategy,
            CachingStrategy>(
            new ContainerControlledLifetimeManager(),
            new InjectionConstructor(host, port, cacheAcsKey,
                isLocalCacheEnabled, localCacheObjectCount,
                localCacheTtlValue, localCacheSync));

        container.RegisterType<ProductStore>();
        container.RegisterType<IProductStore,
            ProductStoreWithCache>(
            new InjectionConstructor(
                new ResolvedParameter<ProductStore>(),
                new ResolvedParameter<ICachingStrategy>()));
    }
}

These statements register the ProductStore and CachingStrategy objects, and the Unity Application Block uses them to create a ProductStoreWithCache object whenever the application instantiates an IProductStore object. Notice that the CachingStrategy class is configured to use the ContainerControlledLifetimeManager class of the Unity framework. This effectively ensures that the CachingStrategy object used by the application is created as a singleton that spans the life of the application. This is useful as the DataCacheFactory object that the CachingStrategy class encapsulates is very expensive and time-consuming to create, so it is best to create a single instance of this class that is available throughout the duration of the application. Additionally, the parameters for the constructor for the CachingStrategy object are read from the configuration file and are passed to the CachingStrategy class by using a Unity InjectionConstructor object.

The RegisterTypes method of the ContainerBootstrapper class is called when the Orders application starts running, from the SetupDependencies method. The SetupDependencies method also assigns the dependency resolver for the Orders application to the Unity container that registered these types. For more information about using the Unity Application Block, see "Unity Application Block" at http://msdn.

The StoreController class calls the FindAll method of the ProductStoreWithCache object when it needs to fetch and display the entire product catalog, and the FindOne method when it needs to retrieve the details for a single product:

C#
public class StoreController : Controller
{
    private readonly IProductStore productStore;

    public StoreController(IProductStore productStore)
    {
        ...
        this.productStore = productStore;
    }

    public ActionResult Index()
    {
        var products = this.productStore.FindAll();
        return View(products);
    }

    public ActionResult Details(int id)
    {
        var p = this.productStore.FindOne(id);
        return View(p);
    }
}

This code transparently accesses the Azure cache, populating it if the requested data is not currently available in the cache. You can change the caching configuration, and even elect not to cache data if caching is found to have no benefit, without modifying the business logic for the Orders application; all you need to do is switch the type registered for the IProductStore interface in the ContainerBootstrapper class, as highlighted in the following code example:

C#
public static class ContainerBootstrapper
{
    public static void RegisterTypes(
        IUnityContainer container)
    {
        ...
        container.RegisterType<
            IProductStore, ProductStore>();
        ...
    }
}

at http://msdn.. you have learned about the Azure technologies that you can use to improve the and if not. The Azure Traffic Manager can play an important role in reducing the network latency associated with sending requests to a web application by transparently directing these requests to the most appropriate instance of the web application relative to the location of the client submitting these requests. Azure caching is an essential element in improving the responsiveness of your web applications and services. This technique removes much of the network latency associated with remote data “Caching Service (Windows Azure AppFabric)” at http://msdn.codeplex. “Enterprise Library Auto-scaling Application Block” at http://entlib. ProductStore>(). It enables you to cache data locally to these applications. The Azure Traffic Manager can also help to maximize availability by intelligently detecting whether an instance of a web application is redirecting requests to a different instance of the application. More Information      “Windows Azure Traffic Manager” at } } Summary In this chapter. you must be prepared to balance this improvement in performance against the possible complexity introduced by maintaining multiple copies of data. Azure provides a highly scalable environment for hosting web applications and services.aspx “Architecting Applications to use Windows Azure AppFabric Caching” at http://www.aspx . and the Enterprise Library Auto-scaling Application Block implements a mechanism that can take full advantage of this scalability by monitoring your web applications and automatically starting and stopping instances as the demand from clients requires.. However.. .public static void RegisterTypes( IUnityContainer container) { . Finally.. and performance of a hybrid in the same “Developing Applications for the Cloud on the Microsoft Windows Azure Platform”.RegisterType< availability.

For links and information about the technologies discussed in this chapter, see Chapter 1, "Introduction," of this guide.

10 Monitoring and Managing Hybrid Applications

A typical hybrid application comprises a number of components, built using a variety of technologies, distributed across a range of sites and connected by networks of varying bandwidth and reliability. With this complexity, it is very important to be able to monitor how well the system is functioning, and quickly take any necessary restorative action in the event of failure. However, monitoring a complex system is itself a complex task, requiring tools that can quickly gather performance data to help you analyze throughput and pinpoint the causes of any errors, failures, or other shortcomings in the system. The range of problems can vary significantly, from simple failures caused by application errors in a service running in the cloud, through issues with the environment hosting individual elements, to complete systemic failure and loss of connectivity between components, whether they are running on-premises or in the cloud. Once you have been alerted to a problem, you must be able to take the appropriate steps to correct it and keep the system functioning. The earlier you can detect issues and the more quickly you can fix them, the less impact they will have on the operations that your business depends on, and on the customers using your system.

It is important to follow a systematic approach, not only when managing hybrid applications but also when deploying and updating the components of your systems. You should endeavor to minimize the performance impact of the monitoring and management process, and you should seek to avoid making the entire system unavailable if you need to update specific elements.

Collecting diagnostic information about the way in which your system operates is also a fundamentally important part in determining the capacity of your solution, and this in turn can affect any service level agreement (SLA) that you offer users of your system. By monitoring your system you can determine how it is consuming resources as the volume of requests increases or decreases, and this in turn can assist in assessing the resources required, and the running costs associated with maintaining an agreed level of performance.

This chapter describes the technologies and tools that Azure provides to help you monitor and manage your solutions in a proactive manner, to keep them running well and fulfilling your obligations to your customers, as well as assisting you in determining the capacity and running costs of your systems.

Use Cases and Challenges

Monitoring and managing a hybrid application is a non-trivial task due to the number, location, and variety of the various moving parts that comprise a typical business solution. Many of the issues associated with effectively monitoring and managing a system also arise when you are hosting the solution on-premises using your organization's servers. However, when you relocate services and functionality to the cloud the situation can become much more complicated for a variety of reasons, including:

- The servers are no longer running locally and may be starting up and shutting down automatically as your solution scales elastically.
- There may be many instances of your services running across these servers.
- Applications in the cloud may be multi-tenanted.
- Communications may be brittle and sporadic.
- Operations may run asynchronously.

This is clearly a much more challenging environment than the on-premises scenario. You must consider not just how to gather the statistics and performance data from each instance of each service running on each server, but also how to consolidate this information into a meaningful view that an administrator can quickly use to determine the health of your system and ascertain the cause of any performance problems or failures. In turn, this requires you to establish an infrastructure that can unobtrusively collect the necessary information from your services and servers running in the cloud, and persist and analyze this data to identify any trends that can highlight scope for potential failure.

Customers care about the quality of service that your system provides, not how well the network or the cloud environment is functioning. In a commercial environment, customers paying to use your service expect to receive a quality of service and level of performance defined by their SLA. Even authorized non-paying visitors to your Web sites and services will anticipate that their experience will be trouble-free; nothing is more off-putting to potential customers than running up against poor performance or errors caused by a lack of resources. You should ensure that your system is designed and optimized for the cloud, as this will help you set realistic expectations for your users.

The following sections describe some common use cases for monitoring and maintaining a hybrid solution, and summarize many of the challenges you will encounter while performing the associated tasks.

Measuring and Adjusting the Capacity of Your System

Description: You need to determine the capacity of your system so that you can identify where additional resources may be required, and the running costs associated with these resources.

One way to measure the capacity of your system is to monitor its performance under real-world conditions. Gathering accurate and timely metrics is key to measuring the capacity of your solution and monitoring the health of the system. You may also be required to collect routine auditing information about how your system is used, and by whom.

Such trends might include excessive request queue lengths, processing bottlenecks, poor response times, and so on. You can then take steps to address these trends, perhaps by starting additional service instances, deploying services to more datacenters in the cloud, or modifying the configuration of the system. In some cases you may also determine that elements of your system need to be redesigned to better handle the throughput required. For example, a service processing requests synchronously may need to be reworked to perform its work asynchronously, or a different communications mechanism might be required to send requests to the service more reliably.

Recovering from Failure Quickly

Description: You need to handle failure systematically and restore functionality quickly.

In an ideal situation, software never crashes and everything always works. This is unrealistic in the real world, but you should aim to give users the impression that your system is always running perfectly; ideally, they should not be aware of any glitches that might occur. Remember that testing can only prove the presence and not the absence of bugs; unless you have spent considerable time and money performing a complete and formal mathematical verification of the components in your solution, you cannot guarantee that they are free of bugs. Additionally, no matter how well-tested a system is, there will be factors outside your control that can affect how your system functions, the network being a prime example. In a live environment spanning many computers and supporting many thousands of users, you must expect failures, diagnose their cause, and either repair these failures quickly or reconfigure the system to transparently redirect customer requests around them.

Once you have determined the cause of a failure, the next task is to recover the failed components or reconfigure the system. You must perform this task in a thoroughly systematic, documented, and repeatable manner, and you should seek to minimize any disruption to service. Ideally, the steps that you take should be scripted so that you can automate them, and they must be robust enough to record and handle any errors that occur while these steps are being performed.

Monitoring Services to Detect Performance Problems and Failures Early

Description: You need to maintain a guaranteed level of service.

The key to maintaining a good quality of service is to detect problems before your customers do. If designed carefully, the techniques and infrastructure you employ to monitor and measure the capacity of your system can also be used to detect failures. It is important that the infrastructure flags such errors early so that operations staff can take the proper corrective actions rapidly and efficiently, and if necessary dynamically reconfigure the system to add or remove resources as appropriate to cater for any change in demand. The information gathered should also be sufficient to enable operations staff to spot any trends.

Logging Activity and Auditing Operations

Description: You need to record all operations performed by every instance of a service in every datacenter.

You may be required to maintain logs of all the operations performed by users accessing each instance of your service, and these logs serve several purposes.

A performance log should provide sufficient data to help monitor and measure the health of the elements that comprise your system. The information gathered should enable operations staff to spot trends, such as a SQL Azure database nearing its configured capacity, performance variations, and so on. Analytical tools should be available to identify trends that may cause a subsequent failure, enabling the operations staff to perform the actions necessary to prevent such a failure from actually occurring.

An error log should provide a date and time-stamped list of all the errors and other significant events that occur inside your system, such as exceptions raised by failing components and infrastructure, warnings concerning unusual activity such as failed logins, and other runtime occurrences.

An audit log should include a record of all operations performed by the operations staff, such as service shutdowns and restarts, deployments, and reconfigurations. If you are charging customers for accessing your system, the audit log should also contain information about the operations requested by your customers and the resources consumed to perform these operations. Logging may be a regulatory requirement of your system, but even if it is not, you may still need to track the resources accessed by each user for billing purposes. These logs should be a complete, permanent, and secure record of events.

Deploying and Updating Components

Description: You need to deploy and update components in a controlled, repeatable, and reliable manner whilst maintaining availability of the system.

As in the use case for recovering from system failure, all component deployment and update operations should be performed in a controlled and documented manner, enabling any changes to be quickly rolled back in the event of deployment failure; the system should never be left in an indeterminate state. You should implement procedures that apply updates in a manner that minimizes any possible disruption for your customers; ideally, the system should remain available while any updates occur. In addition, all updates must be thoroughly tested in the cloud environment before they are made available to live customers.

Cross-Cutting Concerns

This section summarizes the major cross-cutting concerns that you may need to address when implementing a strategy for monitoring and managing a hybrid solution.

Performance

Monitoring a system and gathering diagnostic and auditing information will have an effect on the performance of the system. The solution you use should minimize this impact so as not to adversely affect your customers. For diagnostic information, it might not be necessary to gather extensive statistics while the system is running in a stable configuration, but during critical periods collecting more information might help you to detect and rectify any problems more quickly. Therefore any solution should be flexible enough to allow tuning and reconfiguration of the monitoring process as the situation dictates. The policy for gathering auditing information is unlikely to be as flexible, so you should determine an efficient mechanism for collecting this data, and a compact mechanism for transporting and storing it.

Security

There are several security aspects to consider concerning the diagnostic and auditing data that you collect:

- Diagnostic data is sensitive, as it may contain information about the configuration of your system and the operations being performed. If intercepted, an attacker may be able to use this information to infiltrate your system. Therefore you should protect this data as it traverses the network. You should also store this data securely.
- Diagnostic data may also include information about operations being performed by your customers. You should avoid capturing any personally identifiable information about these customers, storing it with the diagnostic data, or making it available to the operators monitoring your system.
- Audit information forms a permanent record of the tasks performed by your system. It must be stored safely and protected resolutely. Depending on the nature of your system and the jurisdiction in which your organization operates, regulatory requirements may dictate that you must not delete this data or modify it in any way.

Additionally, the monitoring, deployment, and maintenance tasks associated with the components that comprise your system are sensitive tasks that must only be performed by specified personnel. You should take appropriate steps to prevent unauthorized staff from being able to access monitoring data, deploy components, or change the configuration of the system.

Azure and Related Technologies

Azure provides a number of useful tools and APIs that you can employ to supervise and manage hybrid applications. Specifically, you can use:

- Azure Diagnostics to capture diagnostic data for monitoring the performance of your system. Azure Diagnostics can operate in conjunction with the Enterprise Library Logging Application Block.
- The Azure Management portal, which enables administrators to provision the resources and web sites required by your applications. It also provides a means for implementing the various security roles required to protect these resources and web sites. For more information, log in to the Azure Management portal at http://windows.
- The Azure Service Management API, which enables you to create your own custom administration tools, as well as perform scripted management tasks from Windows PowerShell.

Additionally, you can configure Remote Desktop access for the roles running your applications and services in the cloud. This feature enables you to access the Windows logs and other diagnostic information on these machines directly, by means of the same procedures and tools that you use to obtain data from computers hosted on-premises.

Microsoft Systems Center Operations Manager also provides a management pack for Windows Azure, again based on Azure Diagnostics. This tool is well documented elsewhere and will not be described further in this chapter.

The following sections provide more information about the Azure Service Management API and Azure Diagnostics, and summarize good practice for utilizing them to support a hybrid application.

Monitoring Services, Logging Activity, and Measuring Performance in a Hybrid Application by Using Windows Azure Diagnostics

An on-premises application typically consists of a series of well-defined elements running on a fixed set of computers, accessing a series of well-known resources. Monitoring such an application requires being able to transparently trap and record the various requests that flow between the components, and noting any significant events that occur. In this environment, you have total control over the software deployed and configuration of each computer. You can install tools to capture data locally on each machine, and you combine this data together to give you an overall view of how well the system is functioning.

In a hybrid application, the situation is much more complicated; you have a dynamic environment where the number of compute nodes (implementing web and worker roles) running instances of your services and components might vary over time, and the work being performed is distributed across these nodes. You have much less control of the configuration of these nodes, as they are managed and maintained by the datacenters in which they are hosted, and you cannot easily install your own monitoring software to assess the performance and well-being of the elements located at each node. So how can you determine how your system is performing? This is where Azure Diagnostics proves useful.

Azure Diagnostics provides an infrastructure running on each node that enables you to gather performance and other diagnostic data about the components running on these nodes. It is highly configurable; you specify the information that you are interested in, whether it is data from event logs, trace logs, performance counters, IIS logs, crash dumps, or other arbitrary log files. Azure Diagnostics is designed specifically to operate in the cloud. As a result, it is highly scalable while attempting to minimize the performance impact that it has on the roles that are configured to use it.

The diagnostic data itself is held locally in each compute node being monitored. This data is lost if the compute node is reset. Also, the Azure Diagnostic Monitor applies a 4GB quota to the data that it logs; if this quota is exceeded, information is deleted on an age basis. You can modify this quota, but you cannot exceed the storage capacity specified for the web or worker role. In many cases you will likely wish to retain this data, and you can arrange to transfer the diagnostic information to Azure storage, either periodically or on-demand. The topics "How to Schedule a Transfer" at http://msdn. and "How to Perform an On-Demand Transfer" at http://msdn. provide information on how to perform these tasks.

Transferring diagnostic data to Azure storage on-demand may take some time, depending on the volume of information being copied and the location of the Azure storage. To ensure best performance, use a storage account hosted in the same datacenter as the compute node running the web or worker role being monitored. You should also perform the transfer asynchronously so that it minimizes the impact on the response time of the code.

For detailed information about how to implement Azure Diagnostics and configure your applications to control the type of information that Azure Diagnostics records, see "Collecting Logging Data by Using Windows Azure Diagnostics" at http://msdn.
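To make this description concrete, the following sketch shows how a role might configure the data to collect and schedule periodic transfers to Azure storage in its OnStart method. This is an illustrative fragment only, based on the diagnostics API in the Windows Azure SDK; the counter chosen and the transfer intervals are assumptions for this example, and the connection string name follows the SDK's Diagnostics plug-in convention.

```csharp
// Sketch only: configures Azure Diagnostics when a role starts.
// The sampled counter and transfer periods shown are illustrative.
using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        DiagnosticMonitorConfiguration config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Sample the overall CPU utilization every 30 seconds.
        config.PerformanceCounters.DataSources.Add(
            new PerformanceCounterConfiguration
            {
                CounterSpecifier = @"\Processor(_Total)\% Processor Time",
                SampleRate = TimeSpan.FromSeconds(30)
            });

        // Copy the locally buffered data to Azure storage every 5 minutes,
        // so that it survives a reset of the compute node.
        config.PerformanceCounters.ScheduledTransferPeriod =
            TimeSpan.FromMinutes(5);
        config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

        DiagnosticMonitor.Start(
            "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString",
            config);

        return base.OnStart();
    }
}
```

The scheduled transfer keeps the local 4GB buffer from becoming the only copy of the data, at the cost of a small amount of background I/O on the node.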

Azure storage is independent of any specific compute node, and the information that it contains will not be lost if any compute node is restarted.

Guidelines for Using Azure Diagnostics

From a technical perspective, Azure Diagnostics is implemented as a component within the Windows Azure SDK that supports the standard diagnostic APIs. This component is called the Azure Diagnostics Monitor, and it runs in the cloud alongside each web role or worker role that you wish to gather information about. You can configure the Azure Diagnostics Monitor to determine the data that it should collect by using the Windows Azure Diagnostics configuration file, diagnostics.wadcfg. For more information, see "Using the Windows Azure Diagnostics Configuration File" at http://msdn.

Additionally, an application can record custom trace information by using a trace log. Azure Diagnostics provides the DiagnosticMonitorTraceListener class to perform this task, and you can configure this form of tracing by adding the appropriate <system.diagnostics> section to the application configuration file of your web or worker role. See "How to Configure the TraceListener in a Windows Azure Application" at http://msdn. for more information.

You must create a storage account to hold the diagnostic data transferred from each node, and you must configure the Azure Diagnostics Monitor with the address of this storage account and the appropriate access key; see "How to Specify a Storage Account for Transfers" at http://msdn. for more information. Event-based logs are transferred to Azure table storage and file-based logs are copied to BLOB storage. The appropriate tables and BLOBs are created by the Azure Diagnostics Monitor; for example, data gathered from performance counters is copied to a table named WADPerformanceCountersTable, information from the Windows Event Logs is transferred to a table named WADWindowsEventLogsTable, while crash dumps are transferred to a BLOB storage container under the path wad-crash-dumps and IIS 7.0 logs are copied to another BLOB storage container.

If you are building a distributed, hybrid solution, using Azure Diagnostics enables you to gather the data from multiple role instances located on distributed compute nodes, and combine this data together to give you an overall view of your system. By analyzing this information, you can pinpoint areas of concern that may impact the operations that your system implements, and also evaluate the costs associated with performing these operations. The following list suggests opportunities for incorporating Azure Diagnostics into your solutions:

- You need to provide a centralized view of your system to help ensure that you meet the SLA requirements of your customers and to maintain an accurate record of resource use for billing purposes. Your organization currently uses System Center Operations Manager to monitor and maintain services running on-premises.

If you have deployed Systems Center Operations Manager on-premises, you can also install the Monitoring Management Pack for Windows Azure. This pack operates in conjunction with Azure Diagnostics on each compute node, enabling you to record and observe the diagnostics for your applications and services hosted in the cloud, and integrate it with the data for the other elements of your solution that are running on-premises.

This tool is also invaluable for assessing how the services that comprise your system are using resources, helping you to determine the costs that you should be recharging to your customers. Using Systems Center Operations Manager, you can configure alerts that are tripped when various measurements exceed specified thresholds. You can associate tasks with these alerts, and automate procedures for taking any necessary corrective action. For example, you can arrange for additional role instances to be started if the response time for handling client requests is too long, or you can send an alert to an operator who can investigate the issue. For more information about System Center Operations Manager and the Monitoring Management Pack for Windows Azure, visit http://pinpoint.

- You need to provide a centralized view of your system to help monitor your application and maintain an audit record of selected operations. Your organization does not use Systems Center Operations Manager.

You can periodically transfer the diagnostic data to Azure storage and examine it by using a utility such as Azure Storage Explorer (from Neudesic). Additionally, if you have Visual Studio and the Windows Azure SDK, you can connect to Azure storage and view the contents of tables and BLOBs by using Server Explorer. For more information, see "How to View Diagnostic Data Stored in Windows Azure Storage" at http://msdn. Alternatively, there are an increasing number of third-party applications that can operate with the raw data available through Azure Diagnostics to present the information in a variety of easy-to-digest ways. These tools enable you to ascertain the throughput, performance, and responsiveness of your system and the resources that it consumes.

However, these tools simply provide very generalized access to Azure storage. If you need to analyze the data in detail, it may be necessary to build a custom dashboard application that connects to the tables and BLOBs in the storage account, aggregates the information gathered for all the nodes, and generates reports that show how throughput varies over time, enabling you to identify any trends that may require you to allocate additional resources. You can also download the diagnostic data from Azure storage to your servers located on-premises if you require a permanent historical record of this information, such as an audit log of selected operations, or if you wish to analyze the data offline.

Figure 1 depicts the overall structure of a solution that gathers diagnostics data from multiple nodes and analyzes it on-premises. The diagnostics data is reformatted and copied into tables in a SQL Server database, and SQL Server Reporting Services outputs a series of reports that provide a graphical summary showing the performance of the system.

Figure 1
Gathering diagnostic data from multiple nodes

An alternative approach is to use a tool such as Azure Diagnostics Manager (from Cerebrata), which provides functionality for analyzing Azure diagnostics, a dashboard presenting a live view of the data, and features that enable you to transfer the data to Azure storage or download it to your on-premises servers. You can download Cerebrata Azure Diagnostics Manager at http://www.cerebrata.com/Products/AzureDiagnosticsManager/Default.aspx. You can download Neudesic Azure Storage Explorer from the CodePlex community site at http://msdn.

- You need to instrument your web applications and services to identify any potential bottlenecks and capture general information about the health of your solution.

Applications running in the cloud can use the Azure Diagnostics APIs to incorporate custom logic that generates diagnostic information, enabling you to instrument your code and monitor the health of your applications by using the performance counters applicable to web and worker roles. You can also define your own custom diagnostics and performance counters. The topic "Tracing the Flow of Your Windows Azure Application" at http://msdn. provides more information.

It is also useful to trace and record any exceptions that occur when a web application runs, so that you can analyze the circumstances under which these exceptions arise and, if necessary, make adjustments to the way in which the application functions. You can make use of programmatic features such as the Microsoft Enterprise Library Exception Handling Application Block to capture and handle exceptions, and you can record information about these exceptions to Azure Diagnostics, providing a detailed record of the exceptions raised in your application across all nodes, and also generating alerts if any of these exceptions require the intervention of an operator. This data can then be examined by using a tool such as System Center Operations Manager with the Monitoring and Management Pack for Windows Azure. For more information about using the Exception Handling Application Block, see "The Exception Handling Application Block" at http://msdn.microsoft.com/en-us/library/ff664698(v=PandP.50).aspx.

To minimize the overhead associated with logging data, only trace logs, infrastructure logs, and IIS logs are captured by default; if you need to examine data from performance counters, Windows event logs, crash dumps, IIS failed request logs, or other arbitrary logs and files, you must enable these items explicitly. For more information, see "How to: Initialize the Windows Azure Diagnostic Monitor and Configure Data Sources" at http://msdn. Azure Diagnostics can also operate in conjunction with the Enterprise Library Logging Application Block; for more information about incorporating the Logging Application Block into an Azure solution, see "Using Logging Application Block in Azure" at http://blogs.

- Most of the time your web and worker roles function as expected, but occasionally they run slowly and become unresponsive. At these times you need to gather additional detailed diagnostic data to help you to determine the cause of the problem. You need to be able to modify the configuration of the Azure Diagnostics Monitor on any compute node without stopping and restarting the node.

You can configure Azure Diagnostics for a web or worker role remotely by using the Azure SDK, and an application can dynamically modify its own Azure Diagnostics configuration by using the Azure SDK from within your applications and services. You can follow this approach to implement a custom application running on-premises that connects to a node running a web or worker role, specifies the diagnostics to collect, and transfers the diagnostic data periodically to Azure storage. For more information, see "How to Remotely Change the Diagnostic Monitor Configuration" at http://msdn.

The Azure Diagnostics Monitor periodically polls its configuration information, and any changes come into effect after the polling cycle that observes them. The default polling interval is 1 minute. You can modify this interval, but you should not make it too short, as you may impact the performance of the Diagnostics Monitor.

Guidelines for Securing Azure Diagnostic Data

Windows Azure Diagnostics requires that the roles being monitored run with full trust; the enableNativeCodeExecution attribute in the service definition file, ServiceDefinition.csdef, for each role must be set to true. This is actually the default value.
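As an illustrative sketch of the kind of on-premises application that can remotely reconfigure the Diagnostics Monitor (using the DeploymentDiagnosticManager class from the Windows Azure SDK; the deployment ID, role name, and connection string shown here are placeholders), such a tool might tighten the transfer schedule for every instance of a role as follows:

```csharp
// Sketch only: remotely adjusts the Azure Diagnostics configuration for
// all instances of a role. The deployment ID, role name, and storage
// connection string are placeholders.
using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Diagnostics.Management;

class RemoteDiagnostics
{
    static void Main()
    {
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
            "<diagnostics storage connection string>");
        var deploymentManager = new DeploymentDiagnosticManager(
            storageAccount, "<deployment-id>");

        foreach (var instanceManager in deploymentManager
            .GetRoleInstanceDiagnosticManagersForRole("<role name>"))
        {
            // Read the current configuration, shorten the transfer period,
            // and write it back. The change takes effect on each node after
            // the next configuration polling cycle observes it.
            var config = instanceManager.GetCurrentConfiguration();
            config.PerformanceCounters.ScheduledTransferPeriod =
                TimeSpan.FromMinutes(1);
            instanceManager.SetCurrentConfiguration(config);
        }
    }
}
```

Because a tool such as this can change what every node records, it is exactly the sort of application that the following guidelines suggest restricting to trusted personnel.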

The diagnostic information recorded for your system is a sensitive resource and can yield critical information about the structure and security of your system. This information may be useful to an attacker attempting to penetrate your system. Therefore, you should carefully guard the storage accounts that you use to record diagnostic data and ensure that only authorized applications have access to the storage access keys. You should also consider protecting all communications between the Azure storage service and your on-premises applications by using HTTPS. If you have built on-premises applications or scripts that can dynamically reconfigure the diagnostics configuration for any role, ensure that only trusted personnel with the appropriate authorization can run these applications.

Deploying, Updating, and Restoring Functionality by Using the Azure Service Management API and PowerShell

The Azure Management portal provides the primary interface for managing Windows Azure subscriptions. You can use this portal to upload applications, and to manage hosted services and storage accounts. The Windows Azure SDK provides tools and utilities to enable a developer to package web and worker roles, and to deploy these packages to Azure.

However, the Management portal is an interactive tool that requires a detailed understanding on the part of the user concerning the structure of your solution, where the various elements are deployed, and how to configure the security requirements of these elements. It also requires that the user has knowledge of the Live ID and password associated with the Azure subscription for your organization, and any user who has this information has free rein over your entire solution. If these credentials are disclosed to an attacker or another user with malicious intent, they can easily disrupt your services and damage your business operations.

Guidelines for Using the Azure Service Management API and PowerShell

While the Azure Management portal is a powerful application that enables an administrator to manage and configure your Azure services, you can also manage your Windows Azure applications programmatically by using the Azure Service Management API. You can utilize this API to build custom management applications that deploy, update, and manage your web applications and services. Many of these tools and utilities also employ the Azure Service Management API, and some are invoked by several of the MSBuild tasks and wizards implemented by the Azure templates in Visual Studio. You can download the code for a sample application that provides a client-side command line utility for managing Azure applications and services at http://code.

You can also exercise these APIs through the Azure PowerShell cmdlets; this approach enables you to quickly build scripts that administrators can run to perform common tasks for managing your applications. You can download the Azure PowerShell Cmdlets at http://wappowershell.codeplex.com.

The following scenarios provide suggestions for mitigating these problems:

- You need to provide controlled access to an operator to enable them to quickly perform everyday tasks such as configuring a role, provisioning services, or starting and stopping role instances. The operator should not require a detailed understanding of the structure of the application, and should not be able to perform tasks other than those explicitly mandated.

The operator should not require a detailed understanding of the structure of the application, and should not be able to perform tasks other than those explicitly mandated.

This is a classic case for using scripts incorporating the Azure PowerShell cmdlets. You can provide a series of scripts that perform the various tasks required, and you can parameterize them to enable the operator to provide any additional details, such as the filename of a package containing a web role to be deployed, or the address of a role instance to be taken offline. In this scenario, the operator does not need to be provided with the credentials for the Azure subscription. However, the security policy enforced by the Azure Service Management API requires that the account that the operator is using to run the scripts is configured with the appropriate management certificate, as described in the section "Guidelines for Securing Management Access to Azure Subscriptions" later in this chapter. The fewer operators that know the credentials necessary for accessing the Azure subscription, the less likely it is that these credentials will be disclosed to an unauthorized third party.

Scripting also provides for consistency and repeatability, reducing the chances of human error on the part of the operator, especially when the same series of tasks must be performed across a set of roles and resources hosted in different datacenters. A script that creates or updates roles should deploy these roles to the staging environment in one or more datacenters for testing prior to making them available to customers. Switching from the staging to production environment can also be scripted, but should only be performed once testing is complete. The disadvantage of this approach is that the scripts must be thoroughly tested, verified, and maintained.

- You need to provide controlled access to an operator to enable them to quickly perform a potentially complex series of tasks for configuring, deploying, or managing your system.

This scenario is an extension of the previous case, except that the operations are more complex and potentially more error-prone. The operator should not require a detailed understanding of the structure of the application, and should not be able to perform tasks other than those explicitly mandated. Additionally, scripts are not ideal for handling complex logic, such as performing error handling and graceful recovery. In this case, it may be preferable to use the Azure Service Management API directly from a custom application running on-premises.

This application can incorporate a high degree of error detection, handling, and retry logic (if appropriate). This approach also enables you to control the sequence of tasks if the operator needs to perform a complex deployment, involving not just uploading web and worker roles, but also provisioning and managing SQL Azure databases, for example. You can also make the application more interactive, enabling the operator to specify the details of items such as the address of a role to be taken offline, or the filename of a package to deploy, through a graphical user interface with full error checking. Adopting a wizard-oriented approach is easier to understand and less error-prone than expecting the operator to provide a lengthy string of parameters on the command line, as is common with the scripted approach.
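The pattern described above — parameterized tasks, retry logic for transient failures, and promotion to production only after the staged deployment succeeds — can be sketched in outline. The following Python sketch is illustrative only: the function names, the in-memory log, and the retry policy are invented stand-ins, not part of the Trey Research sample or the Service Management API.

```python
import time

def with_retries(action, attempts=3, delay=0.01):
    """Run an action, retrying transient failures a fixed number of times."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except RuntimeError:
            if attempt == attempts:
                raise
            time.sleep(delay)

def deploy_package(package_file, log):
    # Placeholder for a call to the Service Management API.
    log.append(f"deployed {package_file} to staging")

def swap_staging_to_production(log):
    # Placeholder for the staging-to-production swap operation.
    log.append("swapped staging to production")

def run_deployment(package_file):
    """Sequence the tasks an operator would otherwise perform by hand."""
    log = []
    with_retries(lambda: deploy_package(package_file, log))
    # Only promote to production once the staged deployment succeeds.
    with_retries(lambda: swap_staging_to_production(log))
    return log
```

Sequencing the steps in code, rather than asking the operator to run them individually, is what reduces the scope for human error in a multi-step deployment.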

A custom application also enables you to partition the tasks that can be performed by different operators or roles; the application can authenticate the user, and only enable the features and operations relevant to the identity of the user or the role that they belong to. A custom application should audit all operations performed by each user. This audit trail provides a full maintenance and management history of the system. Of course, the disadvantage of this approach is that, being interactive, such an application cannot easily be used to perform automated routine tasks scheduled to occur at off-peak hours. In this case, the scripted approach or a solution based on a console-mode command-line application may be more appropriate. However, you should avoid attempting to make the application too complex; keep the features exposed simple to use, and implement meaningful and intelligent default values for items that users must select.

- You are using System Center Operations Manager to monitor the health of your system.

System Center Operations Manager can trip an alert when a significant event occurs or a performance measure exceeds a specified threshold. You can respond to this alert in a variety of ways, such as notifying an operator, or invoking a script. If a failure is detected in one or more elements of your system, you need to recover from this failure quickly. You can exploit this feature to detect the failure of a component in your system and run a PowerShell script that attempts to restart it.

Guidelines for Securing Management Access to Azure Subscriptions

The Azure Service Management API ensures that only authorized applications can perform management operations, by enforcing mutual authentication using management certificates over SSL. When an on-premises application submits a management request, it must provide the key to a management certificate installed on the computer running the application as part of the request. A management certificate must have a key length of at least 2048 bits and must be installed in the Personal certificate store of the account running the application. This certificate must also include the private key, and it must be stored in a secure location. You should export the certificate from the on-premises computer as a .cer file without the private key and upload this file to the Management Certificates store by using the Azure Management portal. The same certificate must also be available in the Management Certificates store in the Azure subscription that manages the web applications and services. For more information about creating management certificates, see the topic "How to: Manage Management Certificates in Windows Azure" on MSDN.

The Azure Service Management API is actually a wrapper around a REST interface; all service management requests are actually transmitted as HTTP REST requests. Therefore, you are not restricted to using Windows applications for building custom management applications; you can use any programming language or environment that is capable of sending HTTP REST requests. For more information, see "Windows Azure Service Management REST API Reference" on MSDN.
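To make the REST nature of the API concrete, the sketch below assembles the URL and the mandatory x-ms-version header for a "list hosted services" request. This is a minimal illustration, not a working client: the api_version value shown is an example, and the TLS client-certificate handshake that presents the management certificate is omitted entirely.

```python
def build_list_services_request(subscription_id,
                                api_version="2011-10-01"):
    """Build the URL and headers for a Service Management REST call.

    The real request must also present the management certificate
    as the TLS client certificate; that part is omitted here.
    """
    url = ("https://management.core.windows.net/"
           f"{subscription_id}/services/hostedservices")
    # Every Service Management request carries an x-ms-version
    # header identifying the API version it targets.
    headers = {"x-ms-version": api_version}
    return url, headers
```

Because the request is just HTTPS plus a header and a client certificate, it can be issued from any platform with an HTTP stack, which is the point the text makes about not being restricted to Windows applications.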

Remember that the Windows Azure SDK includes tools that enable a developer to package web and worker roles, and to deploy these packages to Azure. These tools also require you to specify a management certificate. However, you should be wary of letting developers upload new versions of code to your production site; this is a task that must be performed in a controlled manner and only after the code has been thoroughly tested. For this reason, you should either refrain from provisioning your developers with the management certificates necessary for accessing your Azure subscription, or you should retain a separate Azure subscription (for development purposes) with its own set of management certificates if your developers need to test their own code in the cloud.

How Trey Research Monitors and Manages the Orders Application

The Trey Research example application that is available in conjunction with this guide demonstrates many of the scenarios and implementations described in this chapter. If you have not already done so, now would be a good time to download the example so that you can examine the code it contains as you read through this section of the guide. You can download the example code by following the links at http://wag.codeplex.com. The final sections of this chapter describe the approach that Trey Research took for managing and controlling the entire system, both at initial deployment and while the application is running.

Trey Research collects diagnostic data in table storage at each datacenter, and considered the following two options for monitoring this data and using it to manage the system:

- Using System Center Operations Manager with the Azure Management Pack to connect directly to each datacenter, examine the diagnostic data, generate the appropriate usage reports, and alert an operator if an instance failed or some other significant event occurred.
- Periodically transferring the diagnostic data to a secure on-premises location, and then reformatting this data for use by their own custom reporting and analytical tools.

Although System Center Operations Manager provides many powerful features, the existing investment that Trey Research had already made in developing and procuring custom analytical tools led to the second option being more appealing. Additionally, it meant that Trey Research could more easily retain a complete audit log of all significant operations and events locally, which might be a requirement as ever-stricter compliance regulations become legally binding. However, this solution does not preclude Trey Research from deploying System Center Operations Manager in the future.

Logging Data to Monitor the Application

Trey Research traces the execution of each role instance, and also records the details of any exceptions raised by the role instances, using a combination of a custom TraceHelper class and the standard Windows Azure Diagnostics mechanism. The data is initially stored in a table named WADLogsTable. The following sections describe how Trey Research implements diagnostic logging, and how the trace information is downloaded from the cloud to their on-premises servers.

Selecting the Data and Events to Record

Trey Research decided to record a series of different types of events using trace messages and Windows Azure Diagnostics. Trey Research captures information at all stages of the operation of the application and sends these to the Windows Azure Diagnostics mechanism through a custom class named TraceHelper. Under normal operation, Trey Research only collects trace messages that have a severity of Warning or higher. However, the mechanism Trey Research implemented allows administrators to change the behavior to provide more detail when debugging the application or monitoring specific activity. The following table shows the logging configuration Trey Research uses.

| Event or trace type | Logging mechanism used | Type of event that initiates logging |
| Application-defined Error-level events | Collected by the custom TraceHelper class | Posting a message to a Topic or Queue fails after all retries. A job process task in a Worker Role fails. Failure to access Windows Azure Cache. |
| Application-defined Warning-level events | Collected by the custom TraceHelper class | Posting a message to a Topic or Queue fails but will be retried. Updating a database fails but will be retried. |
| Application-defined Information-level events | Collected by the custom TraceHelper class | Application startup. Starting a job process task in a Worker Role. Various events related to an order being placed, or customer data being added. |
| Application-defined Verbose-level events | Collected by the custom TraceHelper class | Opening a view in the Orders website. Detailed information on a Web Role's interaction with Windows Azure Cache. |
| Windows Event Logs | Not collected by Trey Research | Windows internal operating system or software events. |
| Windows Performance Counters | Not collected by Trey Research; available for future expansion | Performance Counters implemented by the Windows operating system and installed applications. None defined. |

Notice that Trey Research does not collect Windows Event Log events or Windows Performance Counter data. To collect Windows Event Log and Performance Counter data you must configure Azure Diagnostics to transfer the required data to Table Storage: Windows Event Log entries are transferred to a table named WADWindowsEventLogsTable, and Performance Counter data is transferred to a table named WADPerformanceCountersTable. If Trey Research wanted to capture this data in the on-premises management application, its developers would need to write additional code to download the data in these tables.
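The severity gate the table describes — record Warning and above in normal operation, and everything when an administrator lowers the filter — reduces to a simple comparison against a configurable threshold. The following Python sketch models that behavior; the class and method names are invented for illustration, and the real implementation is the C# TraceHelper class shown later in this chapter.

```python
from enum import IntEnum

class Severity(IntEnum):
    VERBOSE = 0
    INFORMATION = 1
    WARNING = 2
    ERROR = 3

class TraceHelperSketch:
    """Drop messages below the configured severity threshold."""

    def __init__(self, level=Severity.WARNING):
        self.level = level
        self.messages = []

    def configure(self, level):
        # Analogue of TraceHelper.Configure: change the filter at run time.
        self.level = level

    def trace(self, severity, message):
        # Only messages at or above the current threshold are recorded.
        if severity >= self.level:
            self.messages.append((severity.name, message))

helper = TraceHelperSketch()
helper.trace(Severity.INFORMATION, "order placed")        # filtered out
helper.trace(Severity.ERROR, "database update failed")    # recorded
helper.configure(Severity.VERBOSE)                        # debugging mode
helper.trace(Severity.INFORMATION, "order placed")        # now recorded
```

Filtering at the source like this, rather than in the scheduled transfer, is what lets an administrator raise the level of detail without redeploying the application.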

Configuring the Diagnostics Mechanism

The worker role in the Orders.Workers project and the web role in the Orders.Website project are both configured to use Azure Diagnostics. The configuration file for both applications contains the following settings:

XML
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  ...
  <system.diagnostics>
    <sources>
      <source name="TraceSource">
        <listeners>
          <add type="Microsoft.WindowsAzure.Diagnostics
                     .DiagnosticMonitorTraceListener, ..."
               name="AzureDiagnostics">
            <filter type="" />
          </add>
        </listeners>
      </source>
    </sources>
  </system.diagnostics>
</configuration>

This configuration defines a diagnostic source listener named TraceSource that sends trace messages to the Azure DiagnosticMonitorTraceListener class. There is no filter level defined in the configuration because this will be set in the code that initializes the TraceSource listener. The worker role and web role classes both initiate the diagnostics mechanism by calling a method named StartDiagnostics from the role's OnStart method.

C#
private static void StartDiagnostics()
{
  // Get default initial configuration.
  var config = DiagnosticMonitor.GetDefaultInitialConfiguration();

  // Update the initial configuration.
  config.Logs.ScheduledTransferPeriod = TimeSpan.FromSeconds(60);
  config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Undefined;

  // Following could be used to enable transfer of events
  // from the Windows Application Event Log to the
  // WADWindowsEventLogsTable.
  // config.WindowsEventLog.DataSources.Add("Application!*");

  // Start the monitor with the modified configuration.
  DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
}

The StartDiagnostics method does the following:

- It reads the current configuration of the DiagnosticsMonitor instance.
- It updates the DiagnosticsMonitor configuration so that it will transfer all traces and event messages to the WADLogsTable. Trey Research sets the scheduled transfer log level filter to LogLevel.Undefined so that all trace messages will be transferred.
- It updates the DiagnosticsMonitor configuration so that it will transfer the selected events and messages every minute. This is the default period, but the value could be read from configuration so that it is easier to change after the application has been deployed.
- It starts the DiagnosticsMonitor instance with the updated configuration.

The parameter named DiagnosticsConnectionString is configured in the service configuration file for the solution (ServiceConfiguration.cscfg), and specifies the connection string for the storage account containing the target WADLogsTable table.

You use the LogLevel enumeration to limit the severity of event and trace messages that are transferred to the WADLogsTable. Instead of limiting the severity here, the custom TraceHelper class has a severity switch setting that ensures it only writes events with a severity greater than or equal to the severity specified in the configuration setting in the ServiceConfiguration.cscfg file. During testing and debugging, you can change this setting to collect and transfer all trace messages to the WADLogsTable. You can also change the scheduled transfer period interval to a shorter value so that messages appear in the WADLogsTable more quickly.

If required, the application could update the DiagnosticsMonitor configuration by adding additional data sources to it so that it transfers other types of information, such as Windows Event Log messages or Windows Performance Counters, to the specific tables that store this data. Note that this would require additional code to download the data from these tables (such as WADWindowsEventLogsTable and WADPerformanceCountersTable) to the on-premises management application.

Implementing Trace Message Logging and Specifying the Level of Detail

Trey Research collects trace messages generated by a custom class named TraceHelper (located in the Helpers folder of the Orders.Shared project). The TraceHelper class instantiates a TraceSource instance and exposes methods that make it easier to write trace messages with different severity levels.

C#
public class TraceHelper
{
  private static readonly TraceSource Trace;

  static TraceHelper()
  {
    Trace = new TraceSource("TraceSource",
                            SourceLevels.Information);
  }

  [EnvironmentPermissionAttribute(
    SecurityAction.LinkDemand, Unrestricted = true)]
  public static void Configure(SourceLevels sourceLevels)
  {
    Trace.Switch.Level = sourceLevels;
  }

  public static void TraceVerbose(string format,
                                  params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Verbose, 0, format, args);
  }

  public static void TraceInformation(string format,
                                      params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Information, 0, format, args);
  }

  public static void TraceWarning(string format,
                                  params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Warning, 0, format, args);
  }

  public static void TraceError(string format,
                                params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Error, 0, format, args);
  }
}

Data recorded in this way will be directed to the Azure diagnostics monitor trace listener (due to the configuration shown earlier), and subsequently into the WADLogsTable. By default the TraceHelper will capture messages with the severity filter level of Information. However, this setting can be changed by calling the Configure method it exposes, and supplying a value for the severity level of messages to trace. The Worker Roles and Web Roles both configure this setting in the OnStart method by reading it from the service configuration file.

C#
public override bool OnStart()
{
  ...
  ConfigureTraceListener(
    RoleEnvironment.GetConfigurationSettingValue(
      "TraceEventTypeFilter"));
  ...
}

C#
private static void ConfigureTraceListener(
  string traceEventTypeFilter)
{
  SourceLevels sourceLevels;
  if (Enum.TryParse(traceEventTypeFilter, true,
                    out sourceLevels))
  {
    TraceHelper.Configure(sourceLevels);
  }
}

The roles also set up handlers for the RoleEnvironmentChanging and RoleEnvironmentChanged events that reconfigure the TraceHelper instance for the role when the configuration changes. This enables administrators to change the severity filter level to obtain additional information for debugging and monitoring while the application is running.

Writing Trace Messages

The web and worker roles use the TraceHelper class to record information about events, errors, and other significant occurrences. For example, exceptions are captured using code such as that shown in the following example, taken from the ReceiveNextMessage method of the ServiceBusReceiverHandler class in the Orders.Shared project. Note that this code calls the TraceError method of the TraceHelper class to write a trace message with severity "Error".

C#
private void ReceiveNextMessage(
  CancellationToken cancellationToken)
{
  ...
  if (taskResult.Exception != null)
  {
    TraceHelper.TraceError(taskResult.Exception.Message);
    throw taskResult.Exception;
  }
  ...
  this.ReceiveNextMessage(cancellationToken);
}

The TraceHelper class is also used in the web role. Code in the CustomAttributes folder of the Orders.Website project defines a custom attribute called LogActionAttribute that calls the TraceInformation method of the TraceHelper class to write a trace message with severity "Information".

C#
public class LogActionAttribute : ActionFilterAttribute
{
  public override void OnActionExecuting(
    ActionExecutingContext filterContext)
  {
    TraceHelper.TraceInformation(
      "Executing Action '{0}', from controller '{1}'",
      filterContext.ActionDescriptor.ActionName,
      filterContext.ActionDescriptor
        .ControllerDescriptor.ControllerName);
  }

  public override void OnActionExecuted(
    ActionExecutedContext filterContext)
  {
    TraceHelper.TraceInformation(
      "Action '{0}', from controller '{1}' has been executed",
      filterContext.ActionDescriptor.ActionName,
      filterContext.ActionDescriptor
        .ControllerDescriptor.ControllerName);
  }
}

The controller classes in the Orders.Website project are tagged with this attribute. The following code shows the StoreController class, which retrieves products for display.

C#
[LogAction]
public class StoreController : Controller
{
  ...
  public ActionResult Index()
  {
    var products = this.productStore.FindAll();
    return View(products);
  }

  public ActionResult Details(int id)
  {
    var p = this.productStore.FindOne(id);
    return View(p);
  }
}

This feature enables the application to generate a complete record of all tagged actions performed on behalf of every user, simply by changing the TraceEventTypeFilter setting in the ServiceConfiguration.cscfg file to Information.

Transferring Diagnostics Data from the Cloud

Trey Research uses a custom mechanism for collating and analyzing diagnostics information. It requires that all applications store event and trace messages in an on-premises database named DiagnosticsLog that the monitoring and analysis mechanism queries at preset intervals. Trey Research could use a third-party tool to download the data from the WADLogsTable, or write scripts that use a library such as the Windows Azure PowerShell Cmdlets (see http://wappowershell.codeplex.com). However, Windows Azure provides classes that make it easy to interact with Azure storage through the management API using the .NET Framework, and this is the approach that Trey Research chose.

The on-premises monitoring and management application (implemented in the HeadOffice project of the example) contains a page that administrators use to download and examine the diagnostics data collected in the Orders application. For simplicity in the example application, the data is only downloaded when you open the Diagnostics page of the HeadOffice application. In a real application the diagnostics data would be downloaded and stored in the on-premises database automatically, at preset intervals, by a Windows Service or other background application.

The code that interacts with Azure storage and updates the on-premises DiagnosticsLog database table is in the DiagnosticsController class, located in the Controllers folder of the HeadOffice project. The DiagnosticsController class uses the Enterprise Library Transient Fault Handling Block (which you have seen used in previous chapters of this guide) to retry any failed connection to Azure storage and the on-premises database. The constructor of the DiagnosticsController class reads the retry policy from the application configuration file.

C#
this.storageRetryPolicy
  = RetryPolicyFactory.GetDefaultAzureStorageRetryPolicy();

When an administrator opens the Diagnostics page of the HeadOffice application, the TransferLogs action is executed. This action reads from configuration a list of the datacenters from which it will download data, and then reads the corresponding account details from configuration for each datacenter. As the code iterates over the list of datacenters it creates a suitable CloudStorageAccount instance using the credentials collected earlier, and then calls a method named TransferLogs to download the data from each datacenter.

C#
[HttpPost]
public ActionResult TransferLogs(
  FormCollection formCollection)
{
  var deleteEntries
    = formCollection.GetValue("deleteEntries") != null;
  var dataCenters = WebConfigurationManager
    .AppSettings["dataCenters"].Split(',');
  var dataCenters2 = dataCenters
    .Select(dc => dc.Trim())
    .Where(dc => !string.IsNullOrEmpty(dc.Trim()));

  // Get account details for accessing each datacenter.
  var accountNames = dataCenters2.Select(
    dc => string.Format(CultureInfo.InvariantCulture,
      "diagnosticsStorageAccountName.{0}", dc));
  var accountKeys = dataCenters2.Select(
    dc => string.Format(CultureInfo.InvariantCulture,
      "diagnosticsStorageAccountKey.{0}", dc));

  for (var i = 0; i < dataCenters2.Count(); i++)
  {
    // Create credentials for this datacenter.
    var cred = new StorageCredentialsAccountAndKey(
      WebConfigurationManager.AppSettings[
        accountNames.ElementAt(i)],
      WebConfigurationManager.AppSettings[
        accountKeys.ElementAt(i)]);
    var storageAccount = new CloudStorageAccount(cred, true);

    // Download the data from this datacenter.
    this.TransferLogs(dataCenters2.ElementAt(i),
                      storageAccount, deleteEntries);
  }
  ...
}

The TransferLogs method uses the CreateCloudTableClient method to access Azure Table storage. The code accesses the table service context and generates a query over the WADLogsTable in Azure storage. For each entry returned from the query (each row in the table) it creates a new DiagnosticsLog instance and saves this in the DiagnosticsLog database by using the DiagnosticsLogStore repository class. Notice how it can also delete the entries in the WADLogsTable in Azure storage at the same time, to reduce storage requirements in the cloud.

C#
private void TransferLogs(string dataCenter,
  CloudStorageAccount storageAccount,
  bool deleteWADLogsTableEntries)
{
  var tableStorage = storageAccount.CreateCloudTableClient();
  var context = tableStorage.GetDataServiceContext();
  if (!deleteWADLogsTableEntries)
  {
    context.MergeOption = MergeOption.NoTracking;
  }

  IQueryable<WadLog> query
    = this.storageRetryPolicy.ExecuteAction(
        () => context.CreateQuery<WadLog>("WADLogsTable"));

  foreach (var logEntry in query)
  {
    var diagLog = new DiagnosticsLog
    {
      Id = Guid.NewGuid(),
      PartitionKey = logEntry.PartitionKey,
      RowKey = logEntry.RowKey,
      DeploymentId = logEntry.DeploymentId,
      Role = logEntry.Role,
      RoleInstance = logEntry.RoleInstance,
      Message = logEntry.Message,
      TimeStamp = logEntry.Timestamp,
      DataCenter = dataCenter
    };
    this.store.Save(diagLog);

    if (deleteWADLogsTableEntries)
    {
      context.DeleteObject(logEntry);
      this.storageRetryPolicy.ExecuteAction(
        () => context.SaveChanges());
    }
  }
}
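The same copy-then-optionally-delete loop can be modeled outside Azure. In the Python sketch below, a plain list stands in for the WADLogsTable and a local SQLite database stands in for the on-premises DiagnosticsLog store; the schema and names are simplified stand-ins, and the retry policy and table-service context of the real code are omitted.

```python
import sqlite3

def transfer_logs(table_rows, db, data_center, delete_entries=False):
    """Copy WADLogsTable-style rows into a local DiagnosticsLog table.

    table_rows is a mutable list standing in for the cloud table; the
    real code queries Azure Table storage and deletes entries through
    the table service context instead.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS DiagnosticsLog "
        "(PartitionKey TEXT, RowKey TEXT, Message TEXT, DataCenter TEXT)")
    for row in list(table_rows):
        db.execute(
            "INSERT INTO DiagnosticsLog VALUES (?, ?, ?, ?)",
            (row["PartitionKey"], row["RowKey"],
             row["Message"], data_center))
        if delete_entries:
            # Remove the source row to reduce storage held in the cloud.
            table_rows.remove(row)
    db.commit()

db = sqlite3.connect(":memory:")
cloud_rows = [{"PartitionKey": "p1", "RowKey": "r1",
               "Message": "Error in worker"}]
transfer_logs(cloud_rows, db, "US North", delete_entries=True)
```

Tagging each local row with the datacenter it came from, as the real code does with the DataCenter property, is what allows one on-premises database to aggregate logs from every deployment.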

Configuring the Initial Deployment and Managing the Application

Trey Research wants to be able to configure the namespaces within its Windows Azure account for all of the services used by the Orders application, and requires that the configuration of features such as ACS and Service Bus be automated and reliably repeatable. This configuration is necessarily complex, and includes configuring services in multiple datacenters where the application will be deployed. In particular, the Service Bus configuration Trey Research uses depends on ACS to authenticate users that will post messages to, or subscribe to, Queues and Topics.

To achieve this, Trey Research created a setup program that uses the Windows Azure Management API through built-in objects and a separate library. The setup program can be executed to set many of the configuration options in Windows Azure instead of using the Azure web portal. Trey Research also uses a PowerShell script that is executed before deployment to set the appropriate values for namespaces, user names, passwords, and keys in the source files. These items are different for each datacenter in which the application is deployed.

Using the Management API

Trey Research uses a library of functions that was originally developed by the Windows Azure team to help automate configuration of Windows Azure namespaces through the REST-based Management API. The library code is included in the ACS.ServiceManagementWrapper project of the sample code, and you can reuse this in your own applications. It is quite a complex project, but you can examine the source code and modify it if you wish. The workings of the Service Management Wrapper library are not described here; however, it exposes a great deal of functionality that makes most tasks of configuring Windows Azure much easier than writing your own bespoke code.

The setup program Trey Research created instantiates a ServiceManagementWrapper instance and then calls several separate functions within the setup program to configure ACS and Service Bus for the Orders application.

C#
internal static void Main(string[] args)
{
  try
  {
    var acs = new ServiceManagementWrapper(
      acsServiceNamespace, acsUsername, acsPassword);
    Console.WriteLine("Setting up ACS namespace:"
                      + acsServiceNamespace);

    // ACS namespace setup for the Orders Website.
    CleanupIdenityProviders(acs);
    CleanupRelyingParties(acs);
    CreateIdentityProviders(acs);
    CreateRelyingPartysWithRules(acs);

    // Create Service Bus Topic, subscriptions, and Queue.
    SetupServiceBusTopicAndQueue();
  }
  catch (Exception ex)
  {
    ... display exception information ...
  }
  Console.ReadKey();
}

The values used in this code come from the App.config file; the namespace names, user names, account IDs, passwords, and keys used by the setup program are stored in the App.config file of the setup program project named TreyResearch.Setup.

Configuring Azure by Using the Service Management Wrapper Library

To show how easy the Service Management Wrapper library is to use, the following code from the CleanupRelyingParties method of the setup program removes all existing relying parties, other than the one named "AccessControlManagement", from the current ACS namespace.

C#
var rps = acsWrapper.RetrieveRelyingParties();
foreach (var rp in rps)
{
  if (rp.Name != "AccessControlManagement")
  {
    acsWrapper.RemoveRelyingParty(rp.Name);
  }
}

Notice that the code first cleans up the current ACS namespace by removing any existing settings so that the new settings replace the old ones. If not, ACS may attempt to add duplicate settings or features such as identity providers or rule sets, which could cause an error. The setup program creates service identities in the same way, by first removing any identity with the same name and then adding a new one with the specified name and password.

C#
acswrapper.RemoveServiceIdentity(manInVanDisplayName);
acswrapper.AddServiceIdentity(manInVanDisplayName,
                              manInVanPassword);

serviceClient.WindowsLiveId. C# var identityProviderName = SocialIdentityProviders. // pass nameidentifier as name acsWrapper. To do this it uses the classes in the Microsoft. The code then adds the rule group to the OrdersStatisticsService relying party. the following code creates a pass-through rule for the Windows Live ID identity provider that maps the NameIdentifier claim provided by Windows Live ID to a new Name claim. You can see that is sets ACS as the claim issuer. Configuring Azure by Using the Built-in Management Objects It is also possible to use the built-in objects that are part of the Windows Azure SDK to configure your Windows Azure namespaces. The first rule allows the authenticated identity externaldataanalyzer (a small and simple demonstration program that can display order statistics) to send requests. .DeleteRuleGroupByNameIfExists( "Rule group for OrdersStatisticsService").Batch). ClaimTypes.ManagementService serviceClient = ManagementServiceHelper. defaultKey). The following code shows how the setup program creates a rule group for the Orders Statistics Service.Name.ServiceBus.GetIssuerByName( "LOCAL AUTHORITY").Name = "Rule group for OrdersStatisticsService". serviceClient. // Equivalent to selecting "Access Control Service" as // the input claim issuer in the web portal.RuleGroup().The setup program also uses the Service Management Wrapper library to create rules that map claims from identity providers to the claims required by the Orders application.ServiceBus.SaveChanges(SaveChangesOptions. Microsoft.RuleGroup.AccessControlManagement.AddToRuleGroups(ruleGroup). For example. For example. serviceClient.AccessControlExtensions . the setup program must configure the access rules for the Service Bus endpoints in ACS.Name). ClaimTypes.AccessControlExtensions.CreateManagementServiceClient( settings). and saves all the changes. ruleGroup.AccessControlManagement. var ruleGroup = new Microsoft. 
The second rule allows the authenticated identity headoffice (the onpremises management and monitoring application) to listen for requests.AccessControlManagement namespace.DisplayName.AddPassThroughRuleToRuleGroup( defaultRuleGroup.AccessControlExtensions . identityProviderName. and adds two rules to the rule group. var issuer = serviceClient. C# AccessControlSettings settings = new AccessControlSettings(serviceBusNamespace.ServiceBus.NameIdentifier.

There is a great deal of code in the TreyResearch.Setup project. You may find it useful to examine this project when you create your own setup programs, and reuse some of the generic routines it contains.

Using PowerShell Cmdlets

Managing Windows Azure applications and services typically involves two types of tasks. There are some tasks that you perform only occasionally, such as configuring namespaces and features such as ACS and Service Bus; you have seen in the previous sections of this chapter how you can create setup programs to perform these kinds of tasks. The second type of task is often a less complex operation, but one that must be executed much more often. Typical examples are deploying and updating services, manipulating storage services, downloading logging data, setting firewall rules, adding and removing certificates, and interacting with SQL Azure.

For many of these tasks you can use PowerShell Cmdlets directly from the command line, or (even better) within scripts to provide a repeatable, reliable, and automated process. The Windows Azure Service Management API provides controlled access to the Azure platform. The Windows Azure Cmdlets library that you can download from the Codeplex website contains almost one hundred Cmdlets that can accomplish almost any Windows Azure management and configuration task, enabling you to build scripts and applications that an operator can use to deploy and manage the elements that comprise your application, as well as accomplishing any other management-related tasks that are required when the application is running in the production environment. This approach eliminates the need for an operator to have the same level of expertise with Azure as the solution architect, and can also reduce the scope for errors by automating and sequencing many of the tasks involved in a complex deployment. For more information about the Cmdlets, see "Windows Azure PowerShell Cmdlets" at http://wappowershell.codeplex.com.

Following initial testing and validation of the Orders application, Trey Research intends to use the Windows Azure Cmdlets to deploy and update the application.

Summary

This chapter has described the issues surrounding configuring, monitoring, and managing a hybrid application. The important point to realize is that the complexity of the environment and its distributed nature mean that it is inevitable that performance issues and failures will occur in such a system. The key to maintaining a good level of service is detecting these issues and failures, and responding quickly in a controlled and repeatable way. Windows Azure Diagnostics provides the basic tools to enable you to detect and determine the possible causes of errors and performance problems, but it is important that you understand how to relate the diagnostic information that is generated to the structure of your application. Analyzing this information is a task for a domain expert who not only understands the architecture and business operations of your application, but also has a thorough familiarity with the way in which this architecture maps to the services provided by the Azure platform.

More Information

"Collecting Logging Data by Using Windows Azure Diagnostics" at http://msdn.microsoft.com/en-us/library/windowsazure/gg433048.aspx
"Monitoring Windows Azure Applications"
"Take Control of Logging and Tracing in Windows Azure"
"Windows Azure Service Management REST API Reference"
"Windows Azure PowerShell Cmdlets" at http://wappowershell.codeplex.com

. "Introduction" of this guide.For links and information about the technologies discussed in this chapter see Chapter 1.
