Security and Utility Computing: The SAVVIS Way

By: Dr. Bill Hancock, CISM, CISSP, ISSAP, ISSMP, Vice President – Global Security Solutions, Chief Security Officer, SAVVIS, Inc.

V1.9 10-24-05 WMH Copyright © 2005 by SAVVIS, Inc. All Rights Reserved


Purpose of Document

This document describes security aspects of the SAVVIS implementation of the utility computing platform (UCP) as designed, deployed and managed by SAVVIS, Inc.

Executive Summary

Utility Computing Platform (UCP) facilities designed and provisioned by SAVVIS generally have more security components, controls and management than the traditional do-it-yourself rack-and-stack server configurations used in most IT and Internet server deployments. This is largely due to additional security components (such as a secure production management network) and management methods (such as automatic server “hardening” of the base configuration) that are part and parcel of the design, deployment and daily management of the UCP environment.

What is automatically provisioned and deployed, security-wise, in the UCP environment by default is NOT the default configuration of rack-and-stack server environments. Further, adding the equivalent security components to a traditional rack-and-stack server configuration would substantially increase its deployment time and overall management expense. The security components that are built-in and automatic as part of UCP architecture and management would easily double the opex and capex costs of an equivalent traditional rack-and-stack server configuration. Not only is a UCP computing environment more secure, it achieves that security at a much lower cost than traditional managed rack-and-stack secure server configurations.

An important and often overlooked aspect of UCP computing is the inclusion of “soft” services that are not typically provisioned in rack-and-stack configurations and which are typically expensive to provision when added to an installation. SAVVIS includes these services as part of UCP to ensure a constant, consistent and professionally managed security experience.

What is the Utility Computing Platform? A Review

Utility Computing Platform (UCP) is a term applied to a cohesive set of server-centric IT components that provide computing, storage, networking and security services functionally equivalent to those of traditional server boxes deployed in data center racks as 1U-to-4U rack-mounted units.


In a traditional rack-and-stack environment, a server typically contains a physical box with a processing motherboard, CPU, memory, I/O interfaces, one or more internally mounted or accessorized disk drives, network interface, cables, power supply and all manner of parts required to mount the configuration in a rack in a data center. Network security is traditionally provisioned by external firewall and intrusion detection system (IDS) boxes. Virtual private networking (VPN) services are provisioned by yet again additional boxes. Network connectivity is provided by network services from the data center provider and may include router, switch or other means of network connectivity. In a typical installation, the owner installs and maintains the operating system and applications in addition to provisioning of personnel and expertise for daily server management, security management and routine maintenance in a server environment. Servers are used for multiple and varied applications from email to file storage, web sites, financial applications, databases. Out-of-the-box server operating environments are configured to run with most applications and are not specifically configured to deal with the security aspects of a particular application type or different types of access control. As there are a myriad of security controls associated with almost any operating environment and most are difficult to understand and configure, most servers are deployed with minimal, if any, security due diligence in operating system configuration and are often running in commonly accessible “mostly open” configurations which are highly subject to a variety of cyber attacks. There are significant drawbacks, operationally, to a rack-and-stack server configuration. Most notably, as the configuration becomes more and more loaded, additional RAM, disk space and processing power are required. This may entail upgrades to the physical box, motherboard, RAM and mounted disks on the server rack(s). 
Additional network bandwidth may be required to support network load requirements, and this may involve multiple network interface cards and complex network configurations. As network connectivity is added, firewall and IDS configurations are also expanded and changed to accommodate the new network connectivity requirements. Direct and indirect capacity management activities, and the subsequent loss of environment configuration management, are frequently cited in root-cause analysis involving security breakdowns. Over time, an initially compact rack-and-stack configuration can rapidly grow to a multiple-rack, many-stack environment that is not only difficult to manage and maintain, but becomes increasingly fragile as more and more parts are added to the configuration, any of which can break and take down the entire configuration.

As the server configuration is expanded, it is eventually split up into multiple servers, which adds management complexity. The additional server configurations also require substantially more time in configuration and deployment. The average time to deploy a simple web server, for instance, is an industry-standard three weeks, minimum, to get the boxes, deploy and install the racks and stacks, provision network connectivity, configure external components and get the system up and running; only then can you start to worry about applications and data. Complexity of physical configuration adds a level of fragility to the configuration and complexity of management coordination between the components in the configuration.

Expansion of a traditional server environment is an expensive, time-consuming task that provides little return on investment. All time and effort spent managing the environment are recurring overhead costs that grow as the server configuration grows. Eventually, the capex and opex expense of operating a server “farm” becomes very serious, as does its management. Expansion costs become a major factor in accommodating additional use of the servers, leading to yet newer server configurations, migrations, cut-overs and all manner of management issues and failures.

Utility Computing Platforms (UCPs) were designed to address issues of server deployment, installation, management and expansion. The goal is to simplify configuration and allow for very rapid expansion at minimum capex and opex cost while maintaining a very high degree of reliability, maintainability and resiliency. UCPs also allow for extremely rapid provisioning of new servers with far fewer of the configuration headaches traditionally faced by IT managers in rack-and-stack environments.

A base UCP consists of four major components: compute, storage, network interfaces and security controls. Each component is built on a multiple “blade” architecture that allows it to be used and shared by more than one server, which allows the operator to share the resource (and associated components and infrastructure) and reduce overall costs. In the most basic of configurations, a multiple-blade CPU environment, the compute server component, is provisioned with a storage array server. At SAVVIS, the CPU compute server of choice is a rack-mountable multiple-blade processing engine.
Customers who require server processing power can allocate one or more blades to the server to provision whatever processing horsepower may be required for the server configuration. Customer configurations can start with a single blade for a single server and can grow to the use of many blades for a single application as computational demands grow. A differentiating UCP attribute is that as additional horsepower is required as the server grows, additional blade processing power is added via software commands. In a traditional rack-and-stack server environment, increasing CPU power would involve either replacement of the server’s motherboard to a multiprocessor motherboard (lots of configuration and downtime issues) or additional server boxes being deployed into a complex clustered environment, which also imposes downtime and complex post-configuration management. With a server based on virtual computing “blades,” additional horsepower is allocated by adding multiple blades into the configuration and re-booting the configuration. Downtime is minimal and a serious upgrade in processing horsepower can be accomplished in a few minutes.
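To make the idea of adding blades "via software commands" concrete, the following is a minimal sketch of a shared blade pool. It is an illustration only: the BladePool class, its method names and the server name "web-01" are hypothetical and do not represent any SAVVIS or vendor interface.

```python
from dataclasses import dataclass, field


@dataclass
class BladePool:
    """Hypothetical shared pool of compute blades (not a real SAVVIS interface)."""
    total_blades: int
    allocations: dict = field(default_factory=dict)  # server name -> blade count

    def free_blades(self) -> int:
        """Blades in the pool not yet assigned to any server."""
        return self.total_blades - sum(self.allocations.values())

    def allocate(self, server: str, blades: int) -> int:
        """Add blades to a server and return the server's new blade count."""
        if blades <= 0:
            raise ValueError("blades must be positive")
        if blades > self.free_blades():
            raise RuntimeError("not enough free blades in the pool")
        self.allocations[server] = self.allocations.get(server, 0) + blades
        return self.allocations[server]


pool = BladePool(total_blades=16)
pool.allocate("web-01", 1)              # initial single-blade server
new_count = pool.allocate("web-01", 3)  # growth handled by a software request
```

The point of the sketch is that growth is a bookkeeping change against a shared pool rather than a hardware swap; a real platform would also handle the re-boot and placement details that are omitted here.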


The next UCP component is storage. Most technologists and home computer users have been through the “disk is full” experience and in upgrading the disk environment for more storage. Upgrades are often operationally intensive as disks typically have to be backed up, the old disks removed and new ones installed, formatted and data restored – and that’s if the disk is compatible with the operating system as formatted. In many cases, larger drives require a system re-install in essence requiring a complete system rebuild. In high activity server environments, multiple disks are often provisioned in external boxes called disk arrays and are configured to work with a server where they can be quickly upgraded and additional disk drive units added over time, helping to simplify the overall expansion process. UCPs typically provide a very large storage array facility that usually starts with at least two terabytes of on-line storage capacity. Modern storage arrays can contain petabytes of online storage, which can be allocated to one or more computational servers that share the storage array(s). As with the computational server expansion model, mass storage array allocation can be performed with a software command, adding on whatever additional storage is required for a server in minutes. This is especially useful in a server environment where the amount of actual storage required is initially unknown, but where the storage requirements grow rapidly or unexpectedly. Again, instead of the painful “let’s swap the disk” experience, storage is added to the server configuration via software commands and the server gains the additional needed storage capacity. A server with no network access is a worthless system in today’s environments. Working with the compute component of the server hierarchy, network switches and routers are added to the mix, which provide network connectivity to the server for remote users and applications to access the server. 
Unlike traditional network interface cards, however, UCP networking is also accomplished via software commands: the various network components are instructed on the network interface types to provide (100 Mb/s Ethernet, Gigabit Ethernet or other types of network connectivity), traffic routing and filtering rules, DNS facility provisioning, IP address provisioning and all the other network minutiae required to get the server configuration “talking” to a networked environment and accessible to remote users and applications. Via virtual networking, optimal routing, line speeds and network capacity can be allocated and deployed to the server without the inconvenience of traditional network provisioning efforts, installation issues, cable configurations and the myriad issues involved in network provisioning and expansion of server network access over time.

The final component of UCP provisioning is network security controls. In traditional server environments, network security controls are provisioned by standalone firewall, intrusion detection system (IDS) and virtual private network (VPN) boxes that are installed “in front” of the server on its network connection. Installation of these components in a traditional rack-and-stack environment can take weeks, as rack space must be allocated, power provisioned, network connectivity provisioned, boxes installed, OS and security application software installed and then the security “rule base” and other security components installed and provisioned to protect the server. If the server grows in capacity, so must the security components. If the server “splits” into multiple servers, the security components must be multiplied and deployed for the new server configurations, which can be complex and ugly to manage. Additionally, management of the security components of a server involves logging, event analysis, rule base adjustments, continual upgrades and a host of operational functions, all of which are multiplied as multiple instantiations of the security components are installed over time. It’s a complex, time-consuming mess to deal with.

In UCP environments deployed at SAVVIS, virtual security components, similar in nature to the previous virtual platforms, are also included in UCP configuration(s). Through the use of a multi-service security virtual services system, and some clever traffic routing techniques, firewalls, IDS and VPN connectivity can be virtually provisioned via software commands, just like compute, storage and networking facilities. As a result, configuration of firewall facilities takes far less time. VPNs can be provided in minutes and IDS facilities similarly provisioned. As additional network bandwidth and computational power are added to the server, the security components can also allocate additional CPU and network horsepower to their configurations to deal with the additional traffic loads that the server incurs as it expands over time.

As can be seen, utility computing platform components can go a long way toward solving the issues of traditional rack-and-stack computing, storage, networking and security needs as the environment is provisioned and expands over time.
During the traditional server life cycle, UCP facilities can do all the required work of the server environment, often at significantly lower cost than traditional rack-and-stack configurations. The most compelling reason for UCP provisioning is speed of deployment and speed of expansion with minimal downtime. Even if all costs were equal, UCP components allow very rapid provisioning and expansion of server environments, on the order of minutes instead of days, weeks and months. Additionally, UCP facilities incur less downtime during expansion than traditional configurations, quite apart from the built-in recovery and resiliency facilities that UCP offers by way of its normal modes of operation (see more on this later in this paper).

Security: A Server’s Issues

Security, from an end-to-end perspective, is divided into three separate and distinct provisioning and management planes in an operational environment:

1. Infrastructure. These are the networking components that are used to move data between all interconnected entities. Security components typically involve firewalls, IDS, intrusion prevention systems (IPS), DDoS mitigation facilities, DHCP security, DNS security, router and switch Access Control Lists and component security, patching of network components and other related efforts. If infrastructure security is not addressed to keep the infrastructure healthy, problems in system operation occur. Similarly, if there are security vulnerabilities in this infrastructure, it does not matter how secure the rest of the components are, as nothing is going to reach them. Infrastructure security is a combination of IT provisioning, network vendor services and the co-operation of both to achieve a balanced, secure networking environment.

2. Server Security. The most lucrative attack point in today’s computing environments is at the server level. Servers are the central points of control and information creation, access to which can be bought, sold and manipulated for personal or criminal gain. Servers provide mass repositories for information used to power companies, governments and mixed environments. Servers are where users connect to run programs, access information, share data and store critical information. Server security involves many IT processes and technologies including operating system set-up, host intrusion detection, host intrusion prevention, authentication, authorization, application security, storage security, privacy issues, data encryption, hierarchical user and management security controls, network access security, worm attack, virus attack, download/upload security issues and a myriad of other items including esoterica like digital watermarking. Server security is traditionally the problem of the IT department, which owns the server and the associated functions the server provides. In most IT departments, server security configurations are cursory at best and non-existent at worst (which is the norm), and even with the best intentions they usually fall into disrepair over time as other management expediencies take over in the life cycle of the server. Most server security efforts are afterthoughts or are the result of a server security breach that affects the server’s operation or data.

3. Desktop/mobile. While many of the security issues found on servers also apply to desktop and mobile technologies, there are some unique issues, such as client-to-server security, VPN and data sharing issues, that require client-side security tools which are not the same as the server-side tools or are administered by the server. Further, most desktop operating systems are not as full-featured, security-wise, as server incarnations, and yet many mobile technologies (such as laptops and PDAs) can contain the most sensitive of corporate information when users travel and connect to completely unsecured networks such as hotel Internet access and wireless networks in airports.

In the case of security of UCP, the concentration is obviously on infrastructure and server security issues.

Security Comparison: Do-it-Yourself Rack-and-Stack Servers and UCP

As described above, there are many logistical, cost, and management advantages to the UCP when compared with DIY rack-and-stack alternatives. Focusing specifically on security concerns leads to the following areas of discussion:

1. Bootstrap security controls
2. Operating system security “hardening”
3. System set-up for secure operations
4. System administration controls and issues
5. User access and control issues
6. Storage security
7. Network access and control issues
8. Application security and associated controls
9. Logging and event analysis
10. Incident response

1. Bootstrap Security Controls

One aspect of server security is the use of BIOS security controls that are available on some server hardware configurations. Typically, this includes a boot-time password that is known only to the system administrator (sysadmin); the server will not boot unless the password is entered. IT departments will use this technique in situations where the system console is remotely accessible and some form of base console access must be provisioned to ensure that no unauthorized party can cause the system to boot or re-boot as the result of an illegal access to system console resources.

Do-it-Yourself Rack-and-Stack
• Boot password controls are not always available on all hardware vendor platforms.
• Boot passwords are usually “weak” in structure.
• Even when available, boot passwords are often not implemented.
• If a sysadmin leaves and does not leave the password accessible to the company, the motherboard must be re-jumpered to allow console level access to the server (override).
• Remote console access methods need to be provisioned over and above password facilities, as server consoles do not include remote access by default.

Utility Computing Platform
• Boot passwords are not required, as access to all UCP consoles is via a highly secure back-end network environment that is only accessible to authenticated and authorized personnel.
• Remote access is the default methodology and is provisioned by SAVVIS in a secure manner.
• Remote access controls are on a user-by-user basis; therefore, if a sysadmin leaves the company, his/her account is deleted and a new, valid account is created and assigned to the new sysadmin.

2. Operating System Security “Hardening”

All operating systems “out of the box” have a myriad of security capabilities that are typically not enabled. Further, many system components that are not needed for proper management, application execution, user access, etc., are left enabled in the default installation of the OS to allow a greater range of applications to run unimpeded and to reduce customer calls to the OS vendor’s helpdesk when trying to get applications working at installation time. The result is that most out-of-the-box OS installations have a wide range of attack exposures that, if successfully penetrated, could allow unauthorized persons to take control of the server.

Security personnel may, optionally, “harden” the OS environment by spending a great deal of time closing down unwanted options, closing access methods that are not used, disabling applications which are not needed, deleting components which are detrimental to operations and security, ensuring user access is tightly controlled and many other related functions. Hardening of an OS is a tedious and time-consuming job and is NOT a default action by any server implementer. In fact, most of the time the OS is not hardened at all, and most servers remain vulnerable to the myriad of methods hackers and others use to access servers through known “doors” and vulnerabilities associated with components that are not typically used by sysadmins, but which can allow an attacker to gain access to and control of the OS environment.

Part of hardening an OS is a continual re-evaluation of security controls, methods and techniques to ensure that the OS is properly configured, security-wise, for the operational role the server is required to fill. It is also crucial that the OS is patched with the latest error corrections from the vendor to ensure that discovered vulnerabilities are properly closed down. Patching of servers, especially after they are operational, can be a painful and tedious chore, as many applications can break due to OS patches, and yet the patches are critical to ensuring the good health of the server in an operational state. One important consideration in patching of server OSs is often the size of the organization dealing with an OS vendor: the larger the organization, the sooner it is told of vulnerabilities and allowed access to patches. It is also commonplace for vendors to place beta copies and preliminary patch releases of their products into the hands of these larger customers to allow them additional testing time or to help them deal with critical vulnerabilities more quickly than the normal patch distribution cycle might allow between the time a vulnerability is announced and the time patches for it are released.

In all cases, however, hardening an OS requires security expertise. In most situations, credentialed security professionals do the hardening procedures due to their knowledge of security requirements and their usually in-depth knowledge of the OS functions, controls and computational environment. Hardening is not something most IT departments deal with, as an improperly hardened OS will cause applications to fail and can render the server useless. An OS properly hardened by qualified security professionals is one of the major ways in which a company can dramatically and substantially increase its security profile for a server in an operational environment. Hardening is not a one-time thing; it is an all-the-time thing and must be done by credentialed, qualified personnel to gain maximum effectiveness and to ensure the server will not be improperly configured in a way that damages its operations.
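One small, repeatable piece of the hardening process described above is checking which services are enabled against what the server's role actually needs. The following is a minimal sketch of that check, assuming hypothetical role baselines and service names; it is not the SAVVIS hardening procedure, and real hardening covers far more than service lists.

```python
# Hypothetical role baselines: services a hardened server of each role may run.
ROLE_BASELINES = {
    "web": {"sshd", "httpd"},
    "database": {"sshd", "postgres"},
}


def audit_services(role: str, enabled_services: set) -> dict:
    """Compare the services enabled on a server against its role baseline."""
    baseline = ROLE_BASELINES[role]
    return {
        "disable": sorted(enabled_services - baseline),  # unneeded attack exposure
        "missing": sorted(baseline - enabled_services),  # required but not running
    }


# An out-of-the-box install with extra services left enabled (illustrative data).
report = audit_services("web", {"sshd", "httpd", "telnetd", "ftpd"})
```

Because hardening is continual, a check like this would be re-run after every patch or upgrade rather than once at installation.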

Do-it-Yourself Rack-and-Stack
• No default OS hardening is done to protect the server.
• Any hardening that is done at installation is usually a one-shot operation at direct cost to the customer and is not automatically updated as the OS is patched and upgraded over time.
• Patching of the OS is not automatically done as part of the installation or operation of the server.
• Most customers are not “pre-warned” by OS vendors of patches prior to their public distribution.

Utility Computing Platform
• All UCP server OSs are automatically hardened by qualified, credentialed security professionals to protect the server from attacks and infiltration attempts.
• Patches for server vulnerabilities are continually updated and installed by qualified SAVVIS system management personnel.
• UCP servers are automatically hardened, continuously, over time.
• SAVVIS often receives patch and update information from OS vendors prior to public disclosure of vulnerabilities, due to SAVVIS’ size and number of deployed platforms.
• SAVVIS maintains “golden images” of hardened OSs from which server configurations are built for customers, thereby further decreasing the risk of human error.


3. System set-up for secure operations

One inviolate rule of security is that if you want someone to stay off of the server, keep him or her away from the server console. The obvious physical ramifications aside, most servers these days require a virtual or remote connection to the console to provide for administration and ultimate control of the box. How the server environment is set up to allow remote administrative access is a major issue in security of the server.

In most situations, remote console access is done via the Secure Shell (SSH) protocol, which provides an encrypted session between the remote workstation and the virtual console of the server. In some cases, where there are several consoles that might require direct, hands-on access from remote locations, companies implement Terminal Server technologies or others such as Citrix technologies, which allow Keyboard, Video and Mouse (KVM) emulation over a network to allow direct access to the server console as if the sysadmin were seated in front of the box being accessed. Both situations can be implemented via encrypted network connections, but still depend on a user ID and password to access the console of the system. If the encryption breaks down or some other open access is devised, a simple user ID and password are all that is required to compromise the system.

When accessing a system remotely, an audit trail of commands is an essential part of good security to ensure that any situation that might be considered a console access issue can be verified. In most do-it-yourself rack-and-stack console access models, terminal server or SSH options are not set up to log commands.

Physical access to the console is also critical to proper security set-up of the server environment. Most physical access to web servers provisioned by companies is limited to data center security that may be provisioned by the company’s facilities. This may include building card access and potentially data center card access systems, but rarely anything more sophisticated than that.

Do-it-Yourself Rack-and-Stack
• Access is via SSH or terminal services to console facilities.
• Does not typically log all commands given to the server via console access.
• Physical access is via data center card authorization only.
• User accounts are often duplicated per server or, worse, unevenly established according to each application environment, thereby complicating security processes as employees leave the company.
• Custom, highly secure management network facilities are typically not provisioned to access system consoles; most access is via common network access, which is available to anyone who has access to the IP address of the server.

Utility Computing Platform
• Access is via a sophisticated closed back-end networking environment that requires multiple levels of authentication to gain access.
• Logs all commands provided via console access.
• Physical access is via multiple layers of physical access controls that are dissimilar in nature: physical card access to the building, followed by biometric authentication of a person, man-trap facilities, closed circuit camera scans and locked cages which house the server hardware and console access (which requires a physical key provided only to authorized customer personnel).

4. System administration controls and issues

In utility computing, “console” takes on a new definition. In rack-and-stack server environments, the console is the “main terminal” where the system is bootstrapped and where all main events and alerts are sent by the OS to the system administrator. Consoles on servers are usually accessed via a keyboard, monitor and mouse directly connected to the server hardware. In many cases, remote administration is provided via virtual console access over the network via tools such as secure shell (SSH) or console management tools and utilities that provide emulation of a direct-connect Keyboard, Video and Mouse (KVM) environment.

Because UCP environments are made up of an array of virtual servers, there are consoles for all of these systems that are used to configure the virtual server that a customer has requested. For example, the blade computing server that is used to provision the computational engines for a UCP has its own console for control and configuration of the blades for a server environment. The storage array management system, network configuration facilities and virtual security components all have their own consoles, which are used by SAVVIS technical personnel to provision the virtual server that will be created and used by a customer. These “consoles” are called Element Management Systems (EMSs) and are typically provided by vendors with carrier-class solutions like the components of the UCP.

Each utility computing component runs its own operating system, most of which are LINUX or LINUX-like environments on top of which the EMS is built. All have network interfaces for system administrators and all have hardware bypass capabilities in case some dire emergency takes place and direct-connect hardware is the only way to get access to the console of the particular component. And all of the EMSs impose a hierarchical security environment to ensure that the system administrators at SAVVIS who control and configure the UCP components cannot just run amok and do whatever they want to do.

Hierarchical security in a console context means that certain types of commands can only be executed by individuals with the right security credentials, which are provisioned by the owner of the console. In all consoles, there is usually One Supreme Being (as far as the system is concerned) and from that Master Commander, all other lesser system administration commands are authorized. So, yes, there is the remote possibility that a system administrator who is Master Commander of the console could lose his/her mind and trash a component of the UCP environment. But this is no different than rack-and-stack, where the same capability exists. The difference with UCP is that if the Master Commander were to lose his/her mind and decide to trash a virtual server via its console, multiple customers would be affected, not just the server being trashed.

To keep that from happening, SAVVIS has provisioned something called the production operations network (PON), which has additional front-end security controls that restrict access and command sequence execution even for authorized Master Commanders who have access to UCP control consoles. SAVVIS engineers log in to front-end systems via two-factor authentication systems (something you have, something you know). The authentication systems then impose a rule set on the system administration technician, which restricts what they can and cannot do within a UCP console environment. The authentication systems have two network connections: one to the outside world and one to a highly restricted internal network (no external or Internet access is allowed) which has multiple layers of security, with access to the UCP box consoles at the core. While a sysadmin is logged in and working on the UCP consoles, all commands, events, text from programs and other data are logged to files on the front-end console authentication systems.
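The front-end behavior described above, a per-technician rule set plus logging of everything attempted, can be sketched as follows. This is a minimal illustration under stated assumptions: the CommandGateway class, the role names and the command verbs are all hypothetical, and the actual PON authentication systems are not documented here.

```python
from datetime import datetime, timezone

# Hypothetical rule set: command verbs each role may pass through the front end.
ROLE_RULES = {
    "operator": {"show", "status"},
    "engineer": {"show", "status", "allocate", "reboot"},
}


class CommandGateway:
    """Filters console commands per role and logs every attempt."""

    def __init__(self, rules):
        self.rules = rules
        self.audit_log = []  # every attempt is recorded, allowed or refused

    def submit(self, user: str, role: str, command_line: str) -> bool:
        """Return True if the command's verb is permitted for the role."""
        parts = command_line.split()
        verb = parts[0] if parts else ""
        allowed = verb in self.rules.get(role, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "command": command_line,
            "allowed": allowed,
        })
        return allowed

    def refused(self):
        """Entries a security reviewer would examine first during log review."""
        return [entry for entry in self.audit_log if not entry["allowed"]]


gateway = CommandGateway(ROLE_RULES)
ok = gateway.submit("alice", "engineer", "allocate blade 3")
denied = gateway.submit("bob", "operator", "reboot chassis-1")
```

Logging refused attempts alongside permitted ones is what makes later review useful: a cluster of refusals for the same verb may indicate misuse, or simply a legitimate command that the rule set does not yet know about.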
The authentication system logs are periodically reviewed by internal security teams to confirm that sysadmins behave consistently and to catch events or problems that the monitoring systems may have missed (no automated system is flawless). For instance, a new version of console software from a vendor of one of the UCP components may introduce a new command type that is not in the command filters on the front-end security authentication consoles. A review of the logs would show that a sysadmin tried to use the command and that the front-end security authentication system refused it. This means that the authentication system is functioning as it should, by not allowing unauthorized commands, but probably needs adjusting so that the sysadmin can use the new vendor-provided command to better administer the UCP component.

Within a UCP EMS, there are additional security controls such as firewalls, intrusion detection capabilities, hierarchical security controls in the consoles themselves (only certain command types are allowed on a per-user basis once a sysadmin has logged in through the PON authentication servers and internal network security components and gained access to the UCP server consoles) and the layered security of the internal network itself (limited command set access for sysadmins, limited physical access to servers, two-factor authentication requirements, etc.). In short, access to UCP consoles is highly restricted and heavily monitored, and command sets are hierarchically protected via security controls at the PON authentication servers as well as on the consoles of the UCP components themselves. Multiple authentication styles are imposed on a sysadmin who needs to access UCP consoles, and connections are encrypted via SSH version 2 or better, so the ability to sniff a password off the network is largely negated. Even if a password were stolen, without the other authentication factors and connection authentication styles that make up a specific sysadmin’s profile, the chance of gaining access to UCP consoles is essentially nil.

But what about access to the server OS’s sysadmin capabilities? Once again, there are differences when compared to the classic rack-and-stack computing model. In typical rack-and-stack environments, there may or may not be a firewall in front of the server. In most cases, even if a firewall is provisioned, its rule base is not specifically adjusted for secure access by a sysadmin. In the case of UCP, a virtual, stateful firewall is provisioned as part of the configuration, where access control list (ACL) and firewall rules can be configured to allow sysadmin SSH access only from specific network locations and in specific windows of time, restricting who can attempt a connection to the OS system console from a remote location.
This is set up with the SAVVIS configuration team at the time of virtual server definition and gives the customer tighter control over OS console access than is typical for servers set up for network sysadmin console access.

To help control unauthorized console access via vulnerability exploits, worm attacks and virus execution, SAVVIS automatically maintains OS patches and fixes from vendors as part of provisioning UCP services (customers may request exceptions via their service level agreement, if needed). This means that holes and vulnerabilities typically exploited by attackers are quickly and transparently patched, and monitored by SAVVIS system administrators and security control teams, to ensure that the server environment stays healthy. 85% of all vulnerabilities on servers are due to a lack of patch implementation: most hacks and worm attacks succeed because sysadmins do not keep up with the latest patches for their OSs and applications. Because UCP incorporates aggressive patch management on all consoles and virtual servers, customers need not worry about patching OSs and supported applications in UCP environments; it is part of the service, transparent and automatic. This helps protect against the takeover of a system’s OS console by an attacker who uses a vulnerability to push an exploit onto the server and seize the console subsystem for malicious purposes.
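The source- and time-restricted SSH access described above amounts to a default-deny check with two conditions per rule. The following Python sketch illustrates the logic only; the `SshAccessRule` class, the address block (taken from the reserved documentation range) and the time window are assumptions, not the rule syntax of the UCP virtual firewall.

```python
from dataclasses import dataclass
from datetime import datetime, time
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class SshAccessRule:
    source: str   # CIDR block allowed to connect (hypothetical value)
    start: time   # start of the permitted daily window
    end: time     # end of the permitted daily window

    def permits(self, src_ip: str, when: datetime) -> bool:
        in_network = ip_address(src_ip) in ip_network(self.source)
        in_window = self.start <= when.time() <= self.end
        return in_network and in_window


def ssh_allowed(rules, src_ip: str, when: datetime) -> bool:
    """Default deny: allow only if at least one rule permits the attempt."""
    return any(rule.permits(src_ip, when) for rule in rules)
```

For example, with a single rule for `192.0.2.0/24` between 08:00 and 18:00, a connection from `192.0.2.10` at 09:30 passes, while the same address at 22:00, or any other address at any time, is refused.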


Do-it-Yourself Rack-and-Stack
• Usually implements “flat” security model where sysadmin has total and exclusive rights to all commands, sequences and facilities.
• Allows console command injection from any network location.
• Usually implements passwords as the default user identification method.
• May or may not impose an encrypted connection for console access.
• Does not use a private network for console access to restrict access to console resources.

Utility Computing Platform
• Implements a highly structured hierarchical security set of facilities that imposes per-command filters and restrictions on sysadmins to control access to critical resources.
• Restricts command execution to only specified locations on console network connections.
• Implements two-factor authentication for all system administration work account access.
• Always requires an encrypted connection for console access.
• Uses a private, defense-in-depth console access network that is highly secure and not connected to any public or external network resources.

6. Storage Security

An area that always seems to generate security questions is storage security in UCP. UCP uses a shared physical storage array, which is one of the major reasons costs for the system are so low. However, physical co-location of storage does not mean that storage is insecure or viewable by multiple customer components.

A storage array in UCP is not like Network Attached Storage (NAS) systems, which use classic computer networking to connect disk drive arrays to a blade processing server system. In UCP, all disk storage arrays are attached to the computing blade servers via Fiber Channel facilities. Fiber Channel is a highly structured disk connectivity fiber array that provides structured access to devices connected to the fiber facilities within the storage array. The combined network and storage devices are called a Storage Area Network (SAN). SANs support very high throughput and are also very secure. Because Fiber Channel architecture does not use classic network protocols, like IP, it cannot be accessed the way the Internet or a company intranet can. Fiber Channel devices must be physically co-located within the reach of the fiber array, and devices on the Fiber Channel facilities are assigned very specific identifiers, called Logical Unit Numbers (LUNs), which the Fiber Channel logic control facilities use to move data between LUN components. This all means that common annoyance attack profiles, like man-in-the-middle attacks, can’t happen on Fiber Channel arrays, nor can they be executed from classic networking facilities such as the Internet or other IP-centric network types. Because the storage connectivity does not use IP networking in any way, DDoS, DoS and other types of attacks are ineffective against Fiber Channel arrays. Fiber Channel arrays are built to be within feet of the disks and the Fiber Channel console, not spread over wide geographic areas like classic fiber networking facilities.

Another area of interest when considering security attributes is the Fiber Channel zone. Zones are an administrative way to restrict which Fiber Channel console can own which LUNs in a Fiber Channel array. UCP storage consoles only allow one master zone, which is owned by a single console. Therefore, Fiber Channel arrays provisioned by UCP are owned by a single master and cannot be shared between consoles. While this restricts flexibility to an extent, it also shrinks the overall security access domain and greatly reduces command entry and control points, strengthening security significantly.

A common concern is the opportunity to “view” shared data on disk spindles. Again, this is a low concern in UCP. LUNs are not shared and are allocated to specific computing blades in the computational component. The computational blade server, due to its architecture, design and coding, requires that it “own” the Fiber Channel zone to which it has access. This means that, ultimately, the computing server console for the computational blades, not the storage console server, determines which blades have access to which LUNs in the storage array. This restricts access to only those LUNs allocated at configuration time to the specific set of blades configured for a specific server environment.

In some very high speed computing environments, specific disk spindles must be allocated to specific server blades due to disk indexing needs (such as with very large databases) or in cases of extreme security paranoia by a company planning on using UCP for its storage needs.
The UCP platform can be configured so that specific disk spindles are allocated to specific LUNs, which are allocated to specific UCP computing blade resources. This is akin to bolting the disk spindles to a specific computing blade server; it doesn’t get any more secure than this.

In all cases, the shared storage system used in UCP environments is highly segmented and provides security isolation via disk partitioning, Fiber Channel LUN allocation and the optional allocation of specific spindles to specific blades in a UCP environment.

By contrast, in rack-and-stack environments disk access is allocated to a specific motherboard’s disk channels. Expansion of disk storage can be quite painful, especially when the disk bus facilities run out of unit IDs and external storage must be installed.

From a security perspective, rack-and-stack storage security is equivalent in many ways to UCP storage security. On the negative side, rack-and-stack storage flexibility falls horribly short when storage expansion is needed, quickly, on a server environment.

Do-it-Yourself Rack-and-Stack
• Reasonably secure if stored in a controlled, locked environment.
• Disks are allocated to a specific motherboard disk channel facility that restricts sharing via physical means.

Utility Computing Platform
• Provisioned in a SAVVIS data center with a high degree of physical security.
• Can be configured such that disk spindles are specifically allocated to specific computing blade server components, which is similar in nature to rack-and-stack physical disk channel security.
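The storage isolation rule in this section (a LUN belongs to exactly one server environment, and only blades configured for that environment may reach it) can be modeled as two lookup tables. The Python sketch below is an illustration of that rule only; the `LunMap` class and the blade and environment names are hypothetical and do not represent the UCP blade console or any storage vendor's interface.

```python
class LunAllocationError(Exception):
    """Raised when a LUN is requested by a second environment."""


class LunMap:
    def __init__(self):
        self._blade_env = {}  # blade name -> server environment
        self._lun_env = {}    # LUN number -> server environment

    def assign_blade(self, blade: str, environment: str) -> None:
        self._blade_env[blade] = environment

    def allocate(self, lun: int, environment: str) -> None:
        owner = self._lun_env.get(lun)
        if owner is not None and owner != environment:
            # LUNs are never shared between environments.
            raise LunAllocationError(f"LUN {lun} already belongs to {owner}")
        self._lun_env[lun] = environment

    def can_access(self, blade: str, lun: int) -> bool:
        env = self._blade_env.get(blade)
        return env is not None and self._lun_env.get(lun) == env
```

In this model an unallocated LUN is reachable by no blade at all, and an attempt to allocate an already owned LUN to a different environment fails instead of silently sharing it.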

7. Network Access and Control Issues

With the advent of client-server computing and the migration away from the mainframe style of computing, networking capabilities are essential. Client-server computing could not exist without them. As networking capabilities have developed over the past several years, so have network-based threats against servers.

The most common method of implementing network security controls is a technique called defense-in-depth. This basically means that a server facility is protected by multiple layers of technologies that detect and filter attacks arriving over network resources. The security puzzle gets a bit complex in the decision matrix required to identify the best layers to implement, and in what fashion, to create the best security controls for the assets that need to be protected.

In most rack-and-stack configurations, a firewall and potentially an intrusion detection system (IDS) are configured as defensive technologies to protect a server. The IDS is used to detect varying attack types, and the firewall rule base is set up and adjusted to defeat the attack types experienced and to deal with the most common attack patterns of Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS) attacks. These techniques are useful, but they will not stop all attack types and they raise questions of their own. How is the security of the firewall rule base managed? How are the IDS rules and pattern databases updated? How do you know when to update the firewall with new filters to defeat various attacks that may be working their way around the network?


Accordingly, best practice for network system security requires more than a firewall and IDS. To be truly effective, the firewall and IDS need to benefit from something called network situational awareness. In other words, someone, somewhere, needs to have a bigger-picture view of what is going on in the network space and take proactive action to keep attacks from hitting the server on the network. Network situational awareness can only happen if the network security components benefit from a global view of what else might be happening on a network. Having a bird’s-eye view of security events on a network is critical to properly protecting a networked resource and ensuring that its security is handled proactively, so that new network attacks do not penetrate network defenses as they make the rounds throughout the network.

Network security access controls are provisioned as:

1. Static network controls. These are rules and filters that deal with items that a customer knows need to be disabled or blocked at all times. For instance, you might have a firewall disable application ports that will never be used by the server so that it never has to deal with any traffic over them. Another static control might be a firewall rule for a cooperating server that always resides at a specific IP address: the firewall can be statically programmed to accept traffic only to and from that address. Yet other static controls might involve the use of access control lists (ACLs) on network routers and switches to keep routing tables healthy by only accepting updates from specific route paths or disabling route paths that should never be used.

2. Dynamic network controls. This is a much more technically challenging category of protection to implement, because systems using network connectivity may change their basic properties and the security technologies have to be agile enough to keep up with the changes to properly protect the server network assets. For instance, if an application were to use Remote Procedure Calls (RPCs), a firewall that is “stateful” and RPC-aware would be smart enough to detect that an application has negotiated an RPC session on a dynamically requested port, would open the port for that application session and would automatically shut the port down when the application no longer uses it. Many firewalls on the market are not intelligent enough to deal with this type of dynamic configuration. As a result, network security administrators may be required to leave thousands of ports open if RPCs are to be used. Dynamic controls can be very useful in keeping these vulnerabilities closed to potential attack vectors as application mix and network access styles change over time.

In both situations, however, there is a need to continually monitor and enhance the security technologies in systems and infrastructure to ensure that the network access controls are always up to date and properly protecting servers and other assets. In most customer implementations of network security, the provisioning of a firewall and IDS, without someone analyzing security events on a 24x7 basis, is insufficient to provide real network defense. Further, most rack-and-stack configurations do not have a global view of many firewalls and many IDS activities, and so cannot see other events happening on the network or take action to defeat attacks before they appear on the server’s network doorstep.

One major benefit of UCP is that network security services, like firewalls and IDS services, are provisioned as managed services by SAVVIS. This means that there is a 24x7 team of security experts who are constantly and consistently monitoring all network events generated from the UCP firewall and IDS and making critical adjustments as network events unfold. Better yet, SAVVIS has well over 10,000 firewall instances that are monitored globally on one of the largest IP networks in the world, the bulk of which is a network called AS3561, which is the original Internet backbone (currently owned and operated by SAVVIS). By monitoring so many firewalls and IDS sensors, SAVVIS security teams can detect bad situations elsewhere on the network and automatically adjust the network security controls of a UCP installation before the bad situation appears at the server’s doorstep. This bird’s-eye view of the network can be critical when dealing with rapidly moving worm and virus infections, where minutes can mean the difference between successful protection and exploitation.

Another major benefit of a fully managed network security control environment is the maintenance of security technologies. The first lines of corporate defense in cyberspace are the security components, such as firewalls, IDS, etc. These products require maintenance like any other technology and usually need their maintenance much more carefully monitored than servers and desktops.
Since the time from the announcement of a vulnerability to the deployment of an exploit is now in the hours range, the need to keep security technologies up to date and properly functioning 24x7 is a critical component to effective network security control.
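The difference between the static and dynamic controls described in this section can be shown with a toy model: a fixed set of always-allowed ports, plus ports that are opened for the lifetime of a negotiated session and closed again when the session ends. The Python sketch below is illustrative only; the `StatefulFirewall` class, its method names and the port numbers are assumptions and do not reflect how any particular firewall product tracks RPC sessions.

```python
class StatefulFirewall:
    def __init__(self, static_allowed_ports):
        # Static control: ports that are always permitted; all else is denied.
        self.static_allowed = frozenset(static_allowed_ports)
        # Dynamic control: port -> id of the session that negotiated it.
        self.dynamic_open = {}

    def open_for_session(self, session_id: str, port: int) -> None:
        """Open a dynamically negotiated port for one application session."""
        self.dynamic_open[port] = session_id

    def close_session(self, session_id: str) -> None:
        """Close every port that was opened on behalf of the session."""
        self.dynamic_open = {
            port: owner
            for port, owner in self.dynamic_open.items()
            if owner != session_id
        }

    def permits(self, port: int) -> bool:
        return port in self.static_allowed or port in self.dynamic_open
```

With this approach a high-numbered port is reachable only while its session exists, instead of a whole range being left open permanently.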

Do-it-Yourself Rack-and-Stack
• May not provision network security components such as firewalls, IDS, etc.
• Does not necessarily provide 24x7 support for security components.
• Does not provide a “global view” of security events on other security devices on a global basis to take advantage of proactive efforts to defeat attacks.
• Does not necessarily provide for 24x7 security expertise on networking and security resources to monitor, in real-time, attacks for action/reaction.

Utility Computing Platform
• Provisions firewall, IDS and other technologies as needed by customer’s requirements.
• All managed security components have 24x7 support by credentialed security technicians.
• A “global view” of thousands of security devices and events is provided via managed security services to provide highly proactive security service to UCP systems.
• 24x7 real-time monitoring of security events provides for effective action/reaction to security issues on network resources affecting the UCP components.
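One simple way to picture the “global view” is as a correlation across sensors: a source address that triggers alerts on several independent firewalls or IDS sensors is a candidate for blocking everywhere, including at sites it has not reached yet. The Python sketch below assumes a hypothetical alert feed of `(sensor_id, source_ip)` pairs and an arbitrary threshold; it is not a description of SAVVIS tooling.

```python
from collections import defaultdict


def sources_to_block(events, min_sensors=3):
    """Return source IPs reported by at least `min_sensors` distinct sensors.

    `events` is an iterable of (sensor_id, source_ip) alert tuples.
    """
    sensors_by_source = defaultdict(set)
    for sensor_id, source_ip in events:
        sensors_by_source[source_ip].add(sensor_id)
    return {
        ip for ip, sensors in sensors_by_source.items()
        if len(sensors) >= min_sensors
    }
```

Counting distinct sensors rather than raw alerts keeps one noisy device from triggering a network-wide block on its own.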

8. Application security and associated controls

One of the most overlooked and easily breached aspects of client-server computing is the application level. Many applications are designed without proper consideration of security issues related to entitlement, security and access control, and therefore create vulnerabilities in the way they store and retrieve data that may be sensitive and critical to business operations.

Application security is best when design teams consider security from an application architectural view and “build in” security. By making software secure at the time of writing, much better control over application security is achieved. As a byproduct, secure coding practices lead to much greater software reliability. Structured security methodologies such as the Systems Security Engineering Capability Maturity Model (SSE-CMM) can be used by programmers to improve engineering and development processes and so create more secure and reliable code.

Because much software is inherently insecure, a great many products have been produced to help secure insecure programs. One of the more popular techniques is to place an application “sandbox” system in front of an application. The sandbox knows all about the application and can enforce a given set of rules and access rights for the programs running on a server. Sandbox products function as a proxy firewall for the specific applications for which they have been programmed. Since sandbox products are very application specific, they are of limited use in most server environments.

A more common approach to application security is a multi-layered security effort:

1. Ensure the operating system is “hardened” against common attacks and that unnecessary processes and other components are removed so they cannot be used to exploit the system.
2. Ensure firewall facilities have been adjusted so that the rules protect the application and server environment and block all traffic on unused ports.
3. Be aggressive in patching applications, operating systems and security components to ensure that the latest vulnerabilities are quickly isolated and patched to prevent their use in attempted system exploits.
4. Establish proper security controls between front-end and back-end systems to ensure that a front-end server cannot be used to disrupt, infiltrate or attack a back-end system.
5. Authenticate users on a user-by-user basis using two-factor authentication and, preferably, strong authentication through the use of soft or hard token authentication (or biometrics).
6. Ensure that application security controls are hierarchical enough to restrict the most privileged data access modes to only those who require this access and to deny all access to unauthorized entities.
7. Provide access to applications via encrypted methods such as SSL and SSH to protect the privacy of accessed information as well as provide positive authentication of source locations accessing the application(s).
8. Ensure data on the server is protected, or preferably encrypted, in such a manner that it may not be copied or accessed by unauthorized entities.
9. Ensure applications are “blueprinted” via hashing or other algorithms to ensure that proper module versions are installed and that only authorized applications are executing on the server (and that they have not been hacked or patched in an illegal or unwanted manner).
10. Subject custom-written applications to security peer review, in which security engineering experts use code analysis tools to search for vulnerabilities and poor security coding practices and to ensure that software is written in a manner not easily exploited.

UCP provides superior application security via the basic set-up of the server environment, 24x7 management of the server, continuous vulnerability patching of the OS and related functions and support of specific popular applications, such as Oracle, which can be complex to harden, patch and manage.
In addition to hardened security configuration, UCP also provides a rapid, very strong methodology to provision front-end and back-end servers in an applications environment, as well as servers that can be used as application sandbox facilities. Instead of taking weeks to provision and set up all the various front-end, back-end and sandbox servers for an application environment, it can be done very quickly in a virtual manner. This can be an important aspect of application support when an application dies for mysterious reasons and duplicate systems must be provisioned very rapidly, or when test systems need to be provisioned before production.
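The “blueprinting” step in item 9 of the list above can be illustrated with a hash manifest: record a digest for each authorized module, then compare later snapshots against it. The sketch below uses SHA-256 from Python's standard `hashlib` module as one possible algorithm; the function names and the in-memory representation of modules (a name mapped to its bytes) are assumptions made to keep the example self-contained.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a module's contents as hex."""
    return hashlib.sha256(data).hexdigest()


def build_blueprint(modules: dict) -> dict:
    """Map each authorized module name to the digest of its contents."""
    return {name: fingerprint(data) for name, data in modules.items()}


def verify(blueprint: dict, modules: dict) -> list:
    """Return sorted names that are altered, missing, or not authorized."""
    problems = set()
    for name, expected in blueprint.items():
        data = modules.get(name)
        if data is None or fingerprint(data) != expected:
            problems.add(name)
    # Anything present but absent from the blueprint is unauthorized.
    problems.update(name for name in modules if name not in blueprint)
    return sorted(problems)
```

In practice the module bytes would be read from disk and the blueprint itself would need to be stored where an intruder cannot rewrite it; both concerns are outside this sketch.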


Do-it-Yourself Rack-and-Stack
• Does not pre-prepare the server environment for applications security by “hardening” the OS.
• Difficult to configure front-end, back-end or test application systems.
• Does not include 24x7 system support and alarm management.

Utility Computing Platform
• Automatically includes OS server hardening as part of the installation.
• Provides for rapid provisioning of test and front-end or back-end systems.
• Provides 24x7 system and application alarm/alert support.

9. Logging and event analysis

One of the more critical aspects of server security management is the logging of system, application and network events and the analysis of these logs for potential problems and issues. Different components of a server environment write event messages to these logs, and studying them may help prevent integrity and availability problems in the future. One of the basic problems of event management is that there are so many events that many sysadmins become overwhelmed and desensitized to the contents of these log files, and hence fail to use them properly or at all. Because these logs often provide the first hint that a breach or vulnerability exists, this situation dramatically affects overall application security.

For UCP facilities, SAVVIS provides not only 24x7 support via experienced system administrators, network experts and security experts, but also SAVVISStation facilities that monitor and sift through events from event generation facilities and identify those that require immediate attention. For rack-and-stack configurations, such service always comes at additional cost: alarm/alert management tools must be purchased and provisioned, and people must be provisioned to monitor events, correlate them and sift through them to find the important ones.

For network security products, such as firewalls and IDS, event monitoring and management is critical to early detection of infiltration attempts and malware such as worm attacks. Because security products generate many events, it is not very long after installation that sysadmins and netadmins start to ignore events from the security devices, as it takes too long to go through the logs and analyze events for interesting items. SAVVISStation does all of this automatically and provides a reliable, 24x7 monitoring and event correlation facility that doesn’t sleep, doesn’t get bored and doesn’t become distracted.
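Automated sifting of this kind usually comes down to two rules: escalate anything critical immediately, and escalate lower-severity events that keep repeating. The Python sketch below shows that idea on a hypothetical event format of `(host, kind, severity)` tuples with an arbitrary repeat threshold; it is an illustration of the technique, not of how SAVVISStation is implemented.

```python
from collections import Counter


def triage(events, repeat_threshold=3):
    """Return sorted (host, kind) pairs that need attention.

    An event is flagged if its severity is "critical", or if the same
    kind of event has occurred on the same host at least
    `repeat_threshold` times.
    """
    events = list(events)
    counts = Counter((host, kind) for host, kind, _severity in events)
    flagged = set()
    for host, kind, severity in events:
        if severity == "critical" or counts[(host, kind)] >= repeat_threshold:
            flagged.add((host, kind))
    return sorted(flagged)
```

A single failed login on one host is ignored here, while three on the same host, or one critical disk alert, are surfaced for a human to review.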


Do-it-Yourself Rack-and-Stack
• Does not include automated event management.
• Does not include 24x7 expert support for events and log analysis.
• Does not include event correlation between system, security and network events and what the various alarms/alerts might mean to the server environment.

Utility Computing Platform
• Includes automatic, automated event management via SAVVISStation facilities.
• Includes 24x7 expert support for systems, network and security components.
• Includes event analysis and correlation services between event types to properly analyze events and take corrective action before an event becomes a critical problem.

10. Incident Response

Statistics show that a great many servers are compromised on an annual basis. This means that even low-target servers will probably be breached by a “bot” or hacker in a 23-month period. E-commerce sites are at even higher risk due to their dependence on networking resources and the use of those resources as a weapon by organized criminals (who use them to DDoS attack a server environment for extortion money). This means that most servers will be attacked and most companies must plan for dealing with the aftermath of an attack. That is where incident response services become very important.

Rack-and-stack server configurations are typically managed by IT shops at enterprises where security resources are scarce. Even if competent help is nearby or on staff, when an incident occurs companies are still not typically equipped with the right skills to stop an attack, clean up, protect against additional attacks, perform forensics and work with law enforcement to track down and catch those who attacked the server assets. Incident response requires constant training on the latest attack methods and current skills in forensic techniques to find out where an infiltrator has parked Trojan horse code or other forms of malware.

SAVVIS security services (S3) provides professional and managed security services 24x7, including very skilled incident response services. SAVVIS deals with hundreds of attacks per month and has a long-term, deeply skilled security force of experts in forensics, evidence preservation and many other critical incident response skills that are essential when an attack occurs and minutes count. Many are former law enforcement specialists and intelligence agency experts who possess the skills needed to properly deal with all aspects of a cyber attack.

Do-it-Yourself Rack-and-Stack
• If a security team is on-staff, typically not skilled in real-time computer forensics, investigation, evidence gathering and restoration methods and techniques.

Utility Computing Platform
• Provides a highly skilled, 24x7, deeply experienced incident response team on a global basis that has proven, long-term experience in the various aspects of incident response following cyber attacks.

Conclusions

As stated at the outset of this paper, effective system security results from a combination of process maturity and platform facilities. UCP facilities are superior in control and monitoring when compared to rack-and-stack computing solutions. While it is obviously possible to provision the same services and technologies for rack-and-stack computing, it is highly unlikely that all requisite services will be provisioned reliably and affordably. SAVVIS’ security process maturity then leverages and supplements these facilities as application environments are implemented, monitored and managed. SAVVIS has spent several years working on a full-service utility computing solution that covers not only a one-for-one hardware solution, but also includes many of the “soft skills” required to ensure proper coverage of security for the platform and components.


About SAVVIS

SAVVIS, Inc. (NASDAQ: SVVS) is a global IT utility services provider that focuses exclusively on IT solutions for businesses. With an IT services platform that extends to 47 countries, SAVVIS has over 5,000 enterprise customers and leads the industry in delivering secure, reliable, and scalable hosting, network, and application services. These solutions enable customers to focus on their core business while SAVVIS ensures the quality of their IT systems and operations. SAVVIS’ strategic approach combines virtualization technology, a global network and 24 data centers, and automated management and provisioning systems. For more information about SAVVIS, visit www.savvis.net.

The information presented herein is, to the best of our knowledge, true and accurate. No warranty or guarantee expressed or implied is made regarding the capacity, performance or suitability of any product or service. This document is protected by copyright and may not be reproduced, published, communicated, modified, commercialized or altered in any form nor by any means without the prior written authorization of SAVVIS.
