Prepared for:
IBM Corporation
By:
Sine Nomine Associates
43596 Blacksmith Square
Ashburn, VA 20147
May 1, 2007
Avoiding the $25 Million Server
_________________________________________________________________________
All rights reserved. Trademarks and brand names mentioned in this document are used only for
identification purposes. Sine Nomine Associates acknowledges the ownership of all product
trademarks by their respective manufacturers.
This report was developed by Sine Nomine Associates with IBM cooperation and funding. This
report utilizes publicly available information provided by various companies and sources,
including IBM. The opinions are those of the report's author, and do not necessarily represent
IBM's position on these issues.
Document History
Revision/Reissue   Date   Nature of Change
Table of Contents
1 INTRODUCTION
1.1 Constraints on Data Center Growth
4 CONCLUSIONS
4.1 Power and Cooling
BIBLIOGRAPHY
List of Tables
List of Figures
Figure 3: TCO per logical server for one, three, and five years
1 Introduction
In this paper Sine Nomine Associates has analyzed the problem of functional density in a modern
data center, considering traditional rack-mount discrete servers, new-generation blade servers,
and virtual servers on both Intel and System z mainframe platforms.
This paper presents the results of deploying each of these designs in a hypothetical urban-area
data center, owned by a single large enterprise or shared by multiple medium-sized enterprises (as
would be the case in a hosting provider). This scenario was selected to illuminate an emerging
constraint on Fortune 500 IT: the fundamental operational limits of existing high-value data
center space.
Modern data centers have benefited from, and have been impeded by, recent technological
changes in computer architecture. Faster processors, denser mass storage, and inexpensive
memory chips allow more and more work to be delivered from a single system. On the other
hand, as applications grow more complex, more resources are demanded from each application.
Also, modern semiconductors[1] consume power in direct proportion to their clock speed and to
their gate count. As both circuit complexity and clock speed increase, the result is a greater-than-
linear increase in power consumption per server.
Additionally, rack density has increased dramatically in recent years. Servers that formerly would
have occupied two or even three rack units (2U or 3U) now fit into a single 1.75 inch rack unit
(1U). Blade servers increase the density beyond 1 server per unit by orienting the circuit boards
vertically and cooling the system with carefully designed, highly efficient airflow components. A
typical blade server can pack 14 blades into 9 rack units, while another packs 16 blades into 10
rack units. The density increases, but the power per server is largely dependent on processor and
memory, not board configuration, and therefore the required power per rack increases
significantly in today's environment.
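For illustration, the density and power arithmetic above can be sketched as follows. The rack size and the per-server wattage are assumptions chosen for the example, not vendor specifications:

```python
# Rack-density arithmetic: blades raise servers per rack unit, but power
# per rack scales with server count, not board layout.

RACK_UNITS = 42          # assumed standard full-height rack
WATTS_PER_SERVER = 350   # assumed draw per compute node (illustrative)

def density(servers: int, units: int) -> float:
    """Servers per rack unit for a given enclosure."""
    return servers / units

def watts_per_rack(servers_per_unit: float) -> float:
    """Power per rack follows the server count, not the packaging."""
    return servers_per_unit * RACK_UNITS * WATTS_PER_SERVER

discrete = density(1, 1)    # 1U discrete server: 1.00 servers per unit
blade_14 = density(14, 9)   # 14 blades in 9U: about 1.56
blade_16 = density(16, 10)  # 16 blades in 10U: 1.60
```

With these assumed figures, a rack of 16-per-10U blades draws 60% more power than a rack of 1U discrete servers, which is the mechanism behind the rising power-per-rack requirement described above.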
Although some data centers are located in rural areas where real estate is relatively inexpensive,
many are found in densely populated urban areas where real estate is extremely expensive or
unavailable at any practical price. Nevertheless, urban data centers remain vital for organizations,
because they already are in service and therefore incur only recurring costs rather than initial
acquisition costs, and because they are close in space and time to other operational centers of the
company, its customers and partners. In both urban and rural settings, there is usually a limit to
how much electrical power can be delivered to a given facility. In the rural area, it might (or
might not) be possible to "throw money" at the utility company to obtain a larger power feed, but
in urban areas there is often simply no more capacity to be had at any price.
This is important because data centers are ultimately constrained not just by space, but also by
power. This leads to the problem of the "$25 Million Server."
[1] This statement refers to metal-oxide semiconductor (MOS) chips, rather than bipolar junction chips. MOS
(and CMOS) chips are used for almost all microprocessor and memory circuits, with bipolar chips now
being used only for ultra-high-speed functions such as multi-gigabit-per-second network transceivers.
Data processing needs of companies typically grow over time, as the company grows in size and
as the demands of the marketplace lead to increasingly complex and numerous systems. Given a
data center of a certain size, with a limited amount of power and cooling available, the density of
servers steadily increases. Eventually, it simply becomes impossible to deploy one more server in
that data center. For a well-managed data center, under disciplined administration, this point
occurs when the nominal power capacity (less a prudent safety factor) is reached. For companies
that fall short of perfection, this point occurs when the already-overbooked power and cooling
systems finally fail, leading to overheated servers, tripped breakers or shutdowns, or brownouts.
Whether the breakpoint is a prudently chosen safety stop, or an empirically determined failure
point, it is inescapable, and the company's only choice is to build a new data center. Thus, in
effect, that one additional server resulted in an expense of tens of millions of dollars, and has
become known in data center design parlance as "the $25 million server." The purpose of this
paper is to identify ways in which careful selection of server platform can push the $25 million
server as far into the future as possible.
This paper examines multiple configurations for each of three different platform types: discrete
servers, blade servers, and mainframes, comparing their strengths and weaknesses.
As microcomputers[2] began to appear in large-scale data centers, the natural tendency was to
reshape a traditional PC mechanically so that it would fit into a standard 19 inch equipment rack.
Early PC servers occupied 4 standard 1.75 inch rack units (vertically) each, or perhaps even more,
but advancing technology has reduced typical servers to 2U or 1U sizes. Currently, a discrete
microcomputer server packs two or more processors into a 1U chassis.
To push for even higher density, server manufacturers took a cue from telecommunications
equipment makers and began offering blade servers. These use vertically oriented plug-in boards,
with one server each, in an enclosure that is typically 7U to 10U in height. Sixteen compute nodes
in a 10U chassis, for example, results in a density of 1.60 servers per rack unit, a 60%
improvement over standard 1U discrete servers. Blades also offer the advantage of internal
network switches and (frequently) storage area network (SAN) switches, so fewer external switch
ports are needed for a given number of servers. Typically, a blade deployment needs only two
network and two SAN ports per blade chassis, not per blade, whereas discrete servers need two of
each port per server.[3]
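The port-count difference can be sketched as follows. The chassis size and switch-port assumptions are illustrative, matching the N+N redundancy described in the text:

```python
# External switch ports: discrete servers need redundant ports per server,
# while blades need them only per chassis.
import math

NET_PORTS_EACH = 2   # redundant Ethernet (N+N)
SAN_PORTS_EACH = 2   # redundant SAN (N+N)

def discrete_ports(servers: int) -> int:
    """Discrete servers: two network and two SAN ports per server."""
    return servers * (NET_PORTS_EACH + SAN_PORTS_EACH)

def blade_ports(servers: int, blades_per_chassis: int = 16) -> int:
    """Blades: two network and two SAN ports per chassis, not per blade."""
    chassis = math.ceil(servers / blades_per_chassis)
    return chassis * (NET_PORTS_EACH + SAN_PORTS_EACH)
```

Sixteen discrete servers would need 64 external ports under these assumptions, while a single 16-blade chassis needs only 4.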
Both discrete and blade servers can be operated in either physical or virtual mode. Physical mode
dedicates a compute node to each logical server, and provides maximum isolation at the expense
of a great deal of underutilized resources, since a typical modern CPU is much faster than all but
the most intensively compute-bound applications require. Virtualized microcomputer servers
offer strong isolation in software, with a hardware assist, and allow resource sharing and leveling
between logical servers on a single physical platform. Virtualization also offers advantages in
manageability and in business continuity planning, because virtualization tools such as VMware
allow management of the virtual system images in a robust way.
[2] In this paper, the term "microcomputer" refers to any computer in which the central processing unit is a
microprocessor. All x86-based servers are considered microcomputers in this context.
[3] This assumes N+N redundancy, the most common configuration.
Mainframes such as the IBM System z take consolidation with virtualization to its logical limit,
concentrating dozens, hundreds, or even thousands of servers into a single physical machine.
Although it is theoretically possible to run a single system image on a mainframe, practical
experience shows that this is the optimal solution in only a few very specialized situations.
Mainframes offer full virtualization at the hardware level, and they are optimized to have load
leveling and resource sharing managed by the hypervisor, not the guest operating system or
application.
Whereas a typical microcomputer has an extremely fast CPU and often a relatively modest I/O
capability, the mainframe has processors that are only modestly faster than a microcomputer, but
I/O capabilities that far outstrip those of an ordinary microcomputer. The fabrication budget and
larger physical size budget of a mainframe allow ultra-wide internal data paths with sophisticated
"channel" controllers that offload almost all of the device management and data transfer work
from the main CPU to a dedicated I/O processor. Mainframe CPUs spend much less of their time
servicing low-level device interrupts. Even so, the mainframe is not suited for extremely
compute-intensive tasks such as ray-trace rendering of 3D graphics.
Each of the three platforms has its benefits, and its drawbacks, and this paper looks at all three in
a hypothetical enterprise data center with an aggregate workload that closely simulates workloads
Sine Nomine Associates has observed in actual customer sites.
System administrators often separate servers for reasons other than a need for additional hardware
resources. Organizational boundaries may require (or encourage) segregation of servers along
departmental or project lines, even though a single server could have been shared from a purely
technical standpoint. Servers may be dedicated to major customers or business partners due to
contractual constraints, or because these external organizations need administrative-level access
to certain applications. Some applications require separate servers because they are incompatible
with other software that might logically coexist on the same server if not for that fact. Finally, for
applications with low average utilization and high peak utilization for brief intervals, it is
necessary to provision dedicated servers to meet the peak demand, rather than the average
demand that dominates the server's workload.
The result of any of these situations is a data center full of dedicated servers that are
underutilized, albeit for very good business reasons, and this leads to low productivity of the
energy used by the systems in aggregate.
Virtualization allows underutilized physical servers to be consolidated in a way that maintains the
isolation of their operating systems, application runtime environment, security context, network
broadcast domain, and administrative authority. Load leveling in the virtual data center enables
planners to share the peak-handling overhead among applications whose instantaneous workloads
are unlikely to peak at the same time. Having fewer servers in the data center proportionally
reduces the power consumption of the electronics and of the associated cooling machinery,
resulting in substantial savings of recurring costs.
The basic question this model answers is, "Given that the data center cannot expand in space or
power, how can the enterprise expand to meet future needs, and what is the best platform to allow
maximum growth potential at the least cost?" The intent of the model is to determine how best to
avoid the "$25 million server" problem.
The model will show how to avoid this scenario through server consolidation onto virtualized
hardware, including both microcomputers and System z mainframes.
The model simulates an urban-area data center of a given size, with the presumption of no room
available onsite for further expansion and little or no additional electrical power available at that
site. This may sound artificially constrained, but Sine Nomine Associates has worked with many
customers in the New York City, Los Angeles, Chicago, and Washington, DC areas who face this
very situation.
A single large enterprise with a company-owned data center is the focus of this study, though
most of the conclusions are applicable to a colocation facility shared by several medium-sized
enterprises. In all scenarios considered here, the model assumes that the company has broadly
deployed storage area network (SAN) technologies with efficient aggregation and management
capabilities. The space, cost, and power required by the SAN arrays, SAN management, and data
backup systems are not included in this model, on the principle that the quantity of enterprise
data storage is not significantly affected by the choice of processing platform. This is the one
assumption that may not directly translate between a single large enterprise and multiple medium-
sized enterprises, although one could presume multiple smaller SANs as well.
All of the servers in this model run Debian Linux, so that the cost of the operating system is
equalized (at zero, in fact) between all platforms.[4] Hypervisors (VMware for the Intel platforms,
and z/VM for the mainframe) are included in the pricing calculations of this model, however.
[4] Since Windows can only run on Intel servers, there is no reason to compare Windows across Intel and
mainframe platforms. If an enterprise runs a mixed Windows and Linux environment, this model can still
be useful -- just assume that the Windows servers exist in a separate section of the data center whose square
footage is not included in these calculations, and that the Windows environment is not subject to migration
from one platform to another.
• Constants that represent the hardware characteristics of the various platforms and
configurations being compared.
• Constants that represent the capabilities and limitations of the hypothetical data center,
such as its physical size, available power per unit area, and other parameters.
• Constants for occupancy factors defining the efficiency with which the "white tile" space
in the data center can be utilized, amount of cooling energy required per watt of
computing energy, and similar factors.
• Costing figures that capture the national average cost per kilowatt-hour of electricity, the
costs of hardware and software for the various platforms and configurations, and the cost
of data center construction per unit area and unit of available power.
• Constants that represent typical loading factors for various components of a typical server
under various common application loads, and the relative frequency with which each of
these server types is likely to occur in an enterprise data center.
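The input constants enumerated above might be organized along the following lines. This is only a sketch of the model's structure; every field name and value here is hypothetical, not taken from the actual spreadsheet:

```python
# Hypothetical organization of the model's input constants, grouped by
# the categories listed above (facility, occupancy, costing, platform).
from dataclasses import dataclass

@dataclass
class DataCenterGlobals:
    white_tile_sqft: float          # arbitrarily chosen facility size
    watts_per_sqft: float           # available power per unit area
    it_yield: float                 # fraction of floor usable by racks
    cooling_watts_per_watt: float   # cooling energy per watt of compute
    usd_per_kwh: float              # average electricity cost

@dataclass
class Platform:
    servers_per_unit: float         # density, e.g. 1.0 discrete, 1.43 blade
    watts_per_server: float         # nameplate or measured draw
    price_per_server: float         # acquisition cost

# Example instantiation with placeholder values.
globals_ = DataCenterGlobals(
    white_tile_sqft=10_000, watts_per_sqft=100,
    it_yield=0.6, cooling_watts_per_watt=0.7, usd_per_kwh=0.09)
```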
Wherever possible, the input constants have been obtained from specific manufacturer
specification datasheets, well-respected published reports, or internal findings developed by Sine
Nomine Associates in capacity planning and performance measurement projects. In a few cases,
such as the total size of the hypothetical data center, an arbitrary constant must of necessity be
selected, and these were chosen based on Sine Nomine Associates' extensive experience with
multiple large- and medium-sized customers ranging from Fortune 500 financial companies to
mid-sized insurance and manufacturing enterprises.
The model was deliberately broken down into sections that are decoupled, as much as possible,
from one another. This makes debugging easier, makes the model easier to understand, and keeps
it as configuration-neutral as possible. The sections of the model include the following:
2.3.1 Globals
The Globals page provides constants, and wherever possible the numbers were generated from
objectively verifiable external sources. Constants on this page include standard units per rack, the
arbitrarily chosen size of the hypothetical data center, and electricity cost data (from the United
States Department of Energy).
There are occupancy constants that reflect the desired number of empty units (for future
expansion) per rack, and the "IT yield" ratio which defines how much of the data center's "white
tile" area can actually be occupied by racks. This figure was obtained from an Uptime Institute
report based on statistics gathered from numerous actual data centers. The IT yield compensates
for service aisles, building support columns, electrical and network wiring chases, cooling ducts,
and in-room configuration workspace.
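The occupancy arithmetic works roughly as follows. The rack footprint, rack height, and reserved-slot figures below are assumptions for illustration, not the Uptime Institute values used in the model:

```python
# Occupancy sketch: IT yield discounts white-tile area for aisles,
# columns, wiring chases, ducts, and workspace; reserved empty units
# further discount each rack's capacity.
import math

RACK_FOOTPRINT_SQFT = 10   # assumed floor area per rack, with clearance
UNITS_PER_RACK = 42        # assumed standard rack height
EMPTY_UNITS_PER_RACK = 2   # assumed reserve for future expansion

def racks_that_fit(white_tile_sqft: float, it_yield: float) -> int:
    """Racks that fit in the usable fraction of the white-tile area."""
    return math.floor(white_tile_sqft * it_yield / RACK_FOOTPRINT_SQFT)

def usable_units(racks: int) -> int:
    """Rack units actually available for equipment."""
    return racks * (UNITS_PER_RACK - EMPTY_UNITS_PER_RACK)
```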
The Globals page also contains parameters for the non-platform infrastructure components, such
as network switches, storage area network (SAN) switches, and keyboard-video-mouse (KVM)
switches. These items are presumed to be selected independently of the processing platform. The
choice of platform will cause the number of needed Ethernet, SAN, and KVM ports to vary, but
in this model it does not change the type of device selected for each of these roles. The
infrastructure parameters also include a multiplier that determines the level of redundancy of each
infrastructure component (2X for Ethernet and SAN, 1X for KVM).
The Globals page is specifically constrained so that its data, whether calculated or constant,
depends on no field from any other page. The intent is that the global constants are independent
of platform and workload. The pricing data that appears on this page is per-unit pricing for
specific components, with no extended totals based on quantity anywhere on this page.
2.3.2 Platforms
The second page contains the hardware and costing details for each platform, without regard to
the workload. A few fields are calculated based on the global constants, but no fields on this page
are in any way dependent on the workload or scenario pages.
The Intel systems were all configured and priced to use Intel Xeon Woodcrest 5140 processors
running at 2.33 GHz clock speed. Other configurations of the same server brands can use faster or
slower processors, but this particular configuration was selected because it was available across
all of the servers being compared, and is a very common implementation for corporate data
centers. The basic configuration of each Intel system type is considered a typical build-out for a
standalone Internet or intranet server, whereas the expanded system contains an additional
processor and much more RAM, so that it is more suitable for a virtualized platform.
The blade servers considered in this model consolidate up to 10 processor blades into a 7U
chassis, for a density of about 1.43 servers per rack unit. The non-blade Intel servers, of course,
offer a density of exactly 1.00 servers per rack unit.
The pricing for the Intel systems was obtained from the manufacturer's online catalog and from
quotes obtained from their presales team.
In the case of the IBM System z mainframes, three configurations are considered, but in all cases
the assumption is that z/VM will be used as a hypervisor to virtualize the platform. It is almost
universally agreed that the System z is not an economically viable platform for running a single
Linux instance, nor is it viable for most real-world applications running one server per logical
partition (LPAR).[5] Thus, this model does not consider those configurations.
[5] Logical Partitions (LPARs) are a way to divide a single physical system into a relatively small number of
isolated sub-units that can be viewed by the operating system as if they were standalone systems. LPARs in
the mainframe environment are typically used for gross system partitioning (e.g., test vs. development vs.
production) rather than fine-grained partitioning at the individual application level.
IBM does not publish list prices for mainframes, preferring to allow the customer to receive a
client-specific quotation from the IBM sales representatives. The prices reported here are a
composite of actual prices that have been paid by known customers, and published IBM prices for
components that do have a list price.
The quantities of network switch ports and SAN ports are calculated differently for the System z
than for the Intel systems. The Intel systems use a calculated value based on the number
of ports per infrastructure component, the redundancy multiplier, and the chassis size of the
components. This calculation, plus the mandated number of open slots per rack, iterates with the
size of the server enclosure to determine how many rack units are available for actual server
enclosures per rack. This, in turn, is used to calculate the number of servers that can be installed
in a rack.
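The iteration described above can be sketched as follows: switch gear consumes rack units, which reduces the space left for server enclosures, which in turn changes how many switch ports (and thus switches) are needed. All constants are illustrative assumptions, not the model's actual parameters:

```python
# Iterative rack fill: find the largest server count whose servers plus
# required switches plus mandated open slots still fit in the rack.
import math

UNITS_PER_RACK = 42
OPEN_UNITS = 2          # assumed mandated empty slots per rack
PORTS_PER_SWITCH = 48   # assumed rack-local switch capacity
SWITCH_UNITS = 1        # assumed rack units per switch
REDUNDANCY = 2          # N+N for Ethernet and SAN

def servers_per_rack(server_units: int, ports_per_server: int = 1) -> int:
    """Servers that fit in one rack once switch gear is accounted for."""
    servers = (UNITS_PER_RACK - OPEN_UNITS) // server_units
    while servers > 0:
        ports = servers * ports_per_server * REDUNDANCY
        # One set of switches for Ethernet and one for SAN.
        switches = math.ceil(ports / PORTS_PER_SWITCH) * 2
        used = OPEN_UNITS + switches * SWITCH_UNITS + servers * server_units
        if used <= UNITS_PER_RACK:
            return servers
        servers -= 1
    return 0
```

With these assumed figures, a rack of 1U servers tops out well short of 40, because four rack-local switches and two open slots claim six units.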
For the System z mainframe, most of those calculations are not applicable. The System z has all
the SAN and Ethernet ports it needs, and a single locally attached console replaces KVM. There
is no need for a KVM-like switch function because there is only one physical machine in the
"rack". Also, rack-level calculations do not apply because the mainframe occupies one full rack,
period.
Note that neither the Intel systems nor the mainframes include upstream SAN and Ethernet
switching or routing. This has the net effect of giving a slight model bias to the Intel platform, but
the error is negligible because of the use of rack-local switches to fan-in the connections.
In the Platforms page, the processing capacity of each system is converted to mainframe MIPS[6]
using the above figure, and these MIPS numbers are carried forward to the scenario calculations.
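The shape of that conversion is sketched below. The conversion factor itself is not reproduced here; the figure used is purely hypothetical, standing in for the model's actual MIPS-per-unit-of-capacity constant:

```python
# Sketch of converting a platform's processing capacity to
# mainframe-relative MIPS. The factor is a placeholder, not the
# model's real conversion figure.

MIPS_PER_GHZ = 150.0  # hypothetical conversion factor, for illustration

def capacity_mips(processors: int, ghz: float) -> float:
    """Relative capacity of a system expressed in mainframe MIPS."""
    return processors * ghz * MIPS_PER_GHZ
```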
2.3.3 Workloads
The effectiveness of processing platforms depends heavily on the nature of the workload, and any
responsible expert will advise clients to carefully consider their specific workload when choosing
a mix of platforms.
For the sake of this model, rather than choosing a single arbitrary workload "per server", the
Workloads page catalogs a total of eight unique workloads and their typical resource
requirements, based on published benchmarks and monitoring data collected by Sine Nomine
Associates on various projects. This allows the model to flexibly simulate detailed scenarios for
comparison, allowing much more precise analysis of relative configuration strengths. The
workload types include:
• Static web server -- This server type assumes a traditional publishing-type web site, with
static and lightly dynamic content. Lightly dynamic content would consist of server-side
includes (SSI), blogs, or content management systems written in Perl or PHP.
• Domain name server (DNS) -- The DNS server is the quintessential lightly loaded
machine. DNS queries typically return only a small amount of data, and the data is read-
only except for occasional changes. This is almost perfectly optimized for cache
performance within the DNS server and within the operating system.
• Ecommerce web server -- In a multi-tier application, this server would represent the
customer-facing outer tier that processes HTTP queries from client browsers and
provides the presentation-layer logic for the application. This may include portal
components being merged into the overall page, AJAX processing of dynamic-content
form fields, and adaptation to browser differences. This layer specifically excludes the
core business logic of the application as well as database services.
The relative frequency of this profile is higher than that of the business logic server (see
below) because in many intermediate-complexity situations this tier may function without
a separate business logic server, connecting directly to the database server.
[6] Originally, the acronym MIPS stood for Millions of Instructions per Second, representing the number of
binary-level operations that a processor could do per second. This, however, becomes a rather amorphous
concept on modern high-complexity processors because, among other things, instructions take vastly
different amounts of time to complete. A full thread-context switch obviously takes much longer than a
simple arithmetic operation, for example. In the mainframe context, then, MIPS has become a shorthand
terminology that represents a relative, rather than absolute, processor speed. In essence, MIPS is a way to
compare one mainframe model to another when predicting capacity of a proposed upgrade.
• File server -- A typical departmental file server performs very little processing, but a
great deal of network and disk I/O. The load imposed upon the file server is largely
independent of the protocols in use, so this workload profile could represent a server
offering NFS, CIFS (SMB), AFS, or some combination of these.
• Business logic server -- This represents the second tier of a three-tier client/server
application, residing between the ecommerce web server and the database server. The
business logic server in this profile is assumed to run a Java application in an enterprise-
grade container such as IBM WebSphere, JBoss, or BEA WebLogic.
• Management and/or monitoring server -- Servers that collect and analyze log
messages from other servers, and which are used as a host for monitoring and system
management applications, perform a great deal of network I/O and a moderately high
amount of disk I/O. Typically, log aggregation uses efficient buffering algorithms to
reduce the amount of time spent on disk writes, but it cannot control the characteristics
of the inbound network traffic as closely.
• Email server -- The model’s final common server combination is the email server. In a
typical enterprise, a small number of email relays receive mail from the Internet, and a
small number transmit outbound mail to the Internet. Within the company, however,
there are usually a larger number of mailbox servers that hold user inboxes and archived
messages. The mailbox servers perform almost no processing, though the relay servers
spend much of their time scanning for spam, malware, and other undesired content. In all
cases, however, the I/O load of an email server is large. The profile shown in this model
is an aggregate assuming a few relay servers and a larger number of mailbox servers.
The individual workloads are aggregated into a weighted average workload, based on the relative
frequency of these servers in enterprises where Sine Nomine Associates has done other projects.
For example, static web servers and file servers are quite common, often occurring at a rate of
one per department, whereas domain name servers (DNS) are relatively rare because one server
can reasonably support thousands of users.
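The aggregation step amounts to a frequency-weighted average. The profile values and frequencies below are hypothetical placeholders, not the model's inputs; note that only the ratios among the frequencies matter, not their absolute counts:

```python
# Weighted-average aggregation of per-type workload factors by
# relative frequency of occurrence.

def weighted_load(profiles, frequency):
    """Frequency-weighted average load; only frequency ratios matter."""
    total = sum(frequency.values())
    return sum(profiles[k] * frequency[k] for k in profiles) / total

# Hypothetical per-type CPU load fractions and occurrence counts.
cpu_load = {"static_web": 0.10, "dns": 0.02, "file": 0.05}
freq =     {"static_web": 34,   "dns": 2,    "file": 20}
```

Scaling every frequency by the same factor leaves the weighted result unchanged, which is why the model's notional server count drops out of the aggregation.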
The load factor for each workload profile is segregated according to specific resources, including
the main processor, main memory (RAM or working storage), disk subsystem, and network
interface. The processor workload is further divided into application, kernel, and cryptographic
loads, with the kernel load representing the time that the operating system spends managing
physical devices such as disk drives and network interfaces.
The workloads are normalized to a discrete Intel server with one processor running at 2.33 GHz,
locally-controlled SATA disk storage, no cryptographic coprocessing, and ordinary Ethernet
adapters.
Finally, each workload profile includes a ratio estimating how much of the network traffic will be
server-to-server rather than server-to-client. This is important in the virtualized environment,
because much of the server-to-server traffic will be routed on virtual LANs that have extremely
low overhead. These ratios are based on Sine Nomine Associates' direct experience working with
Fortune 500 data center implementations around the world.
2.3.4 Scenarios
The final page of the model brings together a matrix of platforms and workload to produce the
model's scenario results. This page is the logical opposite of the Globals page, in that it consumes
data from all the other pages but does not feed back to any of them.
The scenario page initially calculates a "nominal" result for each configuration, based on its
theoretical density and capacity. The theoretical density is adjusted for infrastructure components
(network, SAN, and KVM switches) but not for power limitations of the data center facility.
This calculation provides a number of servers that could fit into the data center based on their
own architectural limits. In the case of the virtualized environments, this is also limited by the
basic factors of the aggregate per-server workload.
Real-world data centers do not have unlimited power, and it is quite common, especially in urban
areas, for data center density to be constrained not by rack capacity, but rather by power and
cooling limitations.
This section of the page takes this into account. From the Globals page, the page obtains a figure
for watts per square foot, and then derates the average rack density based on the total amount of
power available to the entire facility.
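The derating step can be sketched as follows: the facility's total power budget caps the server count independently of how many servers would physically fit in the racks, and the binding limit is whichever is smaller. The figures are illustrative assumptions:

```python
# Power derate: facility power, including cooling overhead, may limit
# the deployable server count below the architectural rack capacity.

def derated_servers(white_tile_sqft: float, watts_per_sqft: float,
                    watts_per_server: float, cooling_overhead: float,
                    rack_capacity: int) -> int:
    """Server count after applying the facility power constraint."""
    budget = white_tile_sqft * watts_per_sqft          # total watts available
    per_server = watts_per_server * (1 + cooling_overhead)
    power_limit = int(budget // per_server)
    return min(rack_capacity, power_limit)
```

With 10,000 sq ft at an assumed 100 W/sq ft, 350 W servers, and 0.7 W of cooling per compute watt, power caps the floor at 1,680 servers even if the racks could hold 5,000, illustrating the urban-data-center constraint.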
The platform adjustments section encapsulates the qualitative differences among platforms,
including:
• On both Intel and mainframe platforms in this scenario, some of the usual low-level disk
processing is offloaded because of the presence of the storage area network. The
offloading factor is higher in the mainframe because of an additional level of
coprocessing at the "channel" level, but both platforms receive a favorable adjustment
here, varying in degree only.
• On the mainframe, built-in cryptographic coprocessors offload the vast majority of SSL
encryption workload. The platform adjustments remove this from the overall CPU load
on the mainframe systems but not on the Intel systems, which lack dedicated
cryptographic coprocessors.
• Mainframes use dedicated coprocessors for network and disk I/O (in fact, there is no such
thing as a "dumb" I/O adapter in a mainframe). The platform adjustments account for this
offloading in the network and disk workload factors.
• The microcomputers rely on frequent interrupts to the main processor to handle I/O
devices, whereas the mainframe uses a channel-oriented I/O interface that accepts high-
level commands from the guest operating system. As a result, mainframes spend much
less of their time executing kernel-level code under Linux, and more of their time is
available for application-level processing.
The workload calculations are performed to determine, for the nominal cases, which of the load,
power, and architectural constraints will limit each configuration's total capacity per unit. The
result of this calculation is a limit of logical servers per chassis constrained by CPU, RAM, disk,
cryptography, network, or architectural limits. The final result is the worst-case (minimum) of
these values for each platform.
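The worst-case rule amounts to taking the minimum over the per-resource limits. The limit values below are hypothetical, for illustration only:

```python
# Worst-case constraint: each resource yields its own per-chassis limit
# on logical servers; the binding constraint is the minimum.

def servers_per_chassis(limits):
    """Return the binding resource and its (whole-server) limit."""
    resource = min(limits, key=limits.get)
    return resource, int(limits[resource])

# Hypothetical per-resource limits for one configuration.
limits = {"cpu": 42.7, "ram": 61.0, "disk": 120.0,
          "network": 88.5, "architecture": 60.0}
```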
The adjusted workload calculations work just like their unadjusted counterparts, except that the
loading constraints take into account the platform adjustments explained previously. These are
the results, in the opinion of Sine Nomine Associates, that most accurately represent the real-
world capacity of the data center, accounting for all the factors that are considered in this model.
It is the adjusted workload calculations which feed into the TCO calculations that follow.
The TCO calculations, in contrast to the complexity of the other parts of this model, are very
straightforward, tallying fixed costs of acquisition for the first year (only), and then adding costs
of software support/maintenance and electricity for each year the systems remain in use. TCO is
computed for one, three, and five year intervals. No cost-of-capital (net present value) or
projected growth rates are factored into these calculations, because these are numbers determined
largely by corporate policy and accounting rather than by engineering factors.
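A minimal sketch of that tally, with purely illustrative cost figures (the real model draws its acquisition and recurring costs from the platform tables in the appendix):

```python
# Sketch of the TCO tally described above: acquisition is charged once (in
# the first year) and recurring costs (software support/maintenance plus
# electricity) accrue for each year of use. No net-present-value or growth
# adjustment is applied. The cost figures here are purely illustrative.

def tco(acquisition: float, annual_recurring: float, years: int) -> float:
    return acquisition + annual_recurring * years

# Example: a $1,000,000 acquisition with $70,000/year recurring costs
for years in (1, 3, 5):
    print(years, tco(1_000_000, 70_000, years))
```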
3 Results in Detail
The calculations in the Sine Nomine Associates model yielded numbers which came as no
surprise to the SNA engineering team, who have worked extensively with both microcomputer
and mainframe platforms in customer data centers. As expected, processor resources constrain the
mainframe in all its configurations. This, combined with the notional limit of 8 mainframes per
data center, results in a total count of logical servers per data center that is much lower than if
microcomputers were used throughout. The lesson to be drawn from this is, quite simply, that one
would not want to replace the entire data center with mainframes for a typical aggregated
workload. Rather, the mainframe would be applied selectively to replace microcomputer servers
in those situations where its strengths are greatest. All of Sine Nomine Associates' customers who
use mainframe Linux are also using microcomputers where appropriate, deploying the proverbial
"right tool for every job."
The aggregated workload depends on the identified characteristics of each server category (web
server, database server, etc.) and on the frequency of occurrence of each type of server in a
typical data center. The following figure shows the frequencies of occurrence used in this model.
[Figure: pie chart of the frequency of occurrence of each server type in the aggregated workload. Recoverable labels include static web server, file server, business logic server, management and monitoring server, and e-mail server; the underlying frequencies appear in the Workload Characteristics table in the appendix.]
The model is structured so that the notional "count" of servers in the aggregation process is not
significant; only the ratio factors into the result. Note that this in itself is not a model output, but
rather an input assumption presented here for reference only.
Weighted application CPU load            5%     20.23 MIPS
Weighted working storage set                    282.95 MB RAM
Weighted disk bandwidth                 66%     258.21 (unitless)
Weighted crypto load                     6%     23.71 (unitless)
Weighted network bandwidth              86%     334.51 (unitless)
Weighted kernel CPU load                 8%     29.64 (unitless)
Weighted server-to-server net ratio     13%
The aggregated workload represents the amount of work assigned to each logical server in the
data center, for the model. The percentages represent a ratio to the nominal full-load capacity of a
typical microcomputer with one Intel Xeon 5140 processor at 2.33 GHz and local disk storage.
The memory is a working storage requirement in megabytes, and the processor utilization is
converted to equivalent mainframe MIPS.7 All of the workload characteristics are adjusted within
the scenarios to account for multiple processors and architectural differences of the hardware.
The disk and network utilizations are considerably higher than processor utilization. Sine Nomine
Associates has found this to be the case in almost every site where we have worked. There are, of
course, exceptions with compute-intensive applications such as finite element modeling, financial
market models, and graphics rendering, but the quantity of such servers in a typical business is
small compared to the count of web, file, email, and database servers.
The maximum server densities are calculated based on workload, physical space, infrastructure
requirements, and power and cooling load limits of the data center in which they reside. In the
scenarios, the architecturally-adjusted workload factors are applied to these densities, resulting in
a set of constrained server densities based on each load factor. The worst case for each platform
thus establishes the final density limit for that platform in this data center, and those densities are
fed into the TCO models.
The table below shows the results of the constraint modeling, with the worst case for each
platform scenario highlighted. Note that the discrete microcomputer servers tend to be
constrained by architectural limitations rather than by other factors; this reflects their
underutilization in the typical environment. The virtualized microcomputers in this model were
constrained by network bandwidth rather than by processor or memory limits.
7 See explanation in footnote 6 on page 9.
Logical servers per chassis, by constraint
(Columns: Dell PowerEdge 1950 | Dell PowerEdge 1950 virtualized | Dell PowerEdge 1955 | Dell PowerEdge 1955 virtualized | IBM System z Business Class S07 Small | IBM System z Business Class S07 Expanded | IBM System z Enterprise Class S18)
Based on CPU load: 5.27 | 10.55 | 52.73 | 105.46 | 58.24 | 101.91 | 262.07
Based on RAM capacity: 14.48 | 115.81 | 14.48 | 115.81 | 115.81 | 231.62 | 694.86
Based on disk bandwidth: 3.01 | 6.01 | 30.05 | 60.11 | 109.99 | 192.48 | 494.95
Based on crypto load: 16.37 | 32.73 | 163.65 | 327.31 | 1197.87 | 2096.27 | 5390.41
Based on network bandwidth, excluding intra-chassis traffic: 1.16 | 2.64 | 13.11 | 26.59 | 216.46 | 378.81 | 974.09
Based on architecture: 1 | 16 | 10 | 160 | 36000 | 36000 | 36000
The above numbers are servers per chassis, not per rack. The table illustrates the constraint
behavior within each platform, and is not intended for direct comparison between platforms.
When examining server density per rack, the chart below shows again that the virtualized
environments have a substantial advantage over the discrete servers. The largest System z density
calculated by the model was 262 logical servers per mainframe. A recent published conference
presentation8 by a company using System z and Linux for server consolidation reports that the
company is running over 300 servers on one mainframe, and Sine Nomine Associates has
customers running systems of this scale and larger.
The non-virtualized servers, of course, achieve a density that exactly equals the number of
physical units that will fit into each rack, allowing for infrastructure components and independent
of the load profile. The density of the virtualized platforms depends on the aggregate load that
limits how many logical servers will fit on each physical server.
The initial cost of acquisition of the System z mainframe is somewhat higher per logical server
than the cost of discrete or blade servers, yielding a total cost of ownership (TCO) that is higher
in the first year but that grows more slowly than that of the other alternatives over time. For
each additional year, discrete and blade servers consume more power and cooling cost, which
drives up their TCO until it eventually passes that of the mainframe. The following chart shows
the TCO for each of the platform scenarios.
8 The presentations were made at SHARE 107, Baltimore, Maryland, August 14-18, 2006, by a company
located in the midwestern United States. The presentations are available from http://www.linuxvm.org/.
The figures cited here are for reference only and do not imply endorsement of this paper by SHARE or by
the company providing the presentations.
Figure 3: TCO per logical server for one, three, and five years
Notice in Figure 3 that the System z TCO for one, three, and five years appears almost invariant,
whereas there is a steady increase for all of the microcomputer systems. Only the fully virtualized
PowerEdge 1955 blade system approaches the low incremental TCO of the mainframe.
One interesting observation from the model is that the TCO per logical server on the System z is
not significantly affected by the size of the mainframe, provided the mainframe is appropriately
sized to the number of virtual servers being deployed. Customers planning System z
consolidation projects should, therefore, consider acquiring only the needed resources initially,
and leveraging the System z's ability to be nondisruptively field-upgraded with additional
capacity to meet growth requirements, ensuring that the available and needed resource levels are
well matched and minimizing long-term TCO.
The System z is highly optimized for virtualized operation, and therefore is not analyzed as a
discrete server. However, the analysis of virtualized versus non-virtualized operation on the
discrete and blade servers actually shows a slight increase in TCO for the virtualized
environments, due to hypervisor software costs that more than offset the power and cooling
savings. On the other hand, virtualization with appropriate management tools can significantly
reduce administrative costs, which is a factor not reflected in this model.
It is instructive to break out the fifth-year incremental TCO per logical server, as shown in the
following chart, to illustrate the annual cost savings of the System z once its acquisition cost has
been absorbed.
Here the System z has a clear edge, because most of the annual TCO is power and cooling
expenses. The System z also benefits from the fact that z/VM support is a fixed annual price
regardless of the number of systems being supported, so its unit cost decreases as the number of
supported servers increases.
Within the three System z scenarios, there is a clear trend showing the lowest incremental TCO
for the largest system with the most logical servers. The System z has a power consumption that
is not strongly dependent on the number of processor engines enabled, and a more powerful
configuration consumes only modestly more power than an entry-level configuration for the same
system.
This situation explains, to a large extent, why the model imposes a "notional limit" of eight
mainframes in the entire data center. In a practical situation, customers will deploy a few larger
mainframes rather than proliferating small- to mid-sized configurations in quantity, in order to
maximize the savings reflected in Figure 4 as well as saving labor costs and physical space.
Power cost per logical server is another area of considerable difference among platforms, and
this difference in power consumption accounts for much of the gap in incremental TCO.
Recall that the energy utilization (and therefore cost) in this model includes not only the server
itself, but also any associated infrastructure components and the air conditioning equipment to
remove all of the heat. Typically, every watt of power consumed by electronic equipment requires
an additional 0.7 watts of air conditioning power, according to Rob Snevely's book, Enterprise
Data Center Design and Methodology.
Figure 5 compares the total power consumption per rack location unit, accounting for air
conditioning energy as well as energy used by the electronics. With the data center's limitation of
125 watts per square foot,9 the data center is nominally able to supply just under 4200 watts per
rack location unit.10 Thus, the data center is not capable of being fully loaded with servers even
after factoring in lost rack space due to IT yield. This requires a further derating of the data
center, which is accounted for in the model as a limit on the data center's maximum total server
count.
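The "just under 4200 watts per rack location unit" figure can be reproduced from the constants in the footnotes, assuming a rack footprint of roughly 6 square feet; the footprint is our assumption for illustration, not a number stated in the model.

```python
# Reproducing the "just under 4200 watts per rack location unit" figure.
# The ~6 sq ft rack footprint is an assumed value; dividing by the 18% IT
# yield (footnote 10) gives the gross floor area each rack location
# effectively consumes, including aisles, ducts, and support space.

WATTS_PER_SQFT = 125         # design limit used in this model (footnote 9)
IT_YIELD = 0.18              # Uptime Institute figure cited in footnote 10
RACK_FOOTPRINT_SQFT = 6.03   # assumed net footprint of one rack

gross_sqft_per_rack = RACK_FOOTPRINT_SQFT / IT_YIELD   # ~33.5 sq ft
watts_per_rack_unit = WATTS_PER_SQFT * gross_sqft_per_rack
print(round(watts_per_rack_unit))  # just under 4200 W
```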
After that derating is applied, the model calculates the maximum total logical servers for the
entire data center; the results are presented in Table 1. Note that the System z values are limited
by the "notional" constraint of eight mainframes per data center, as described in Section 3.3 and
Appendix B. It would actually be possible to deploy several times this number of servers, but
most companies doing so would purchase a larger System z than the ones considered in this
model, rather than replicating more than eight of these units.
9 Published sources often cite 100 W/sq-ft; the figure is raised here to give the “benefit of the doubt,”
assuming a high-quality physical facility.
10 Data center designers derate rack density from the theoretical concept of racks butted side-to-side, to
account for air ducts, power distribution and UPS units, operator consoles, building support columns, etc.
The resulting "IT yield" varies depending on data center design, but a recent article by the Uptime Institute
cites a figure of 18% yield, which is used in this model.
As shown in Figure 6, the annual power cost per logical server varies by orders of magnitude,
with the traditional rack servers in non-virtualized mode faring the worst. For rack servers,
virtualization reduces power per logical server by about 61%, and on blade servers virtualization
reduces power per logical server by about 62%. The blade servers benefit slightly from the
improved performance of virtualized networks within a physical server versus even high-speed
physical networks between servers.
Because the System z mainframe's power consumption is largely independent of its processor
count, the high-end mainframe's power consumption per logical server is almost 75% lower than
that of the entry-level mainframe. Note, however, that all of the System z results are substantially
lower (at least 38%) than the best of the microcomputers.
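These percentages can be cross-checked against the model's own constants. The sketch below combines the per-rack power and logical-server densities from the scenario tables with the 1.7x cooling multiplier and the $67.39/kW/month electricity rate from the Globals page; the combination is our reconstruction of the model's arithmetic, not the actual spreadsheet.

```python
# Cross-checking the power-cost claims from the model's own constants:
# per-rack total power and logical-server densities from the scenario
# tables, a 1.7x multiplier for air-conditioning overhead, and the
# $67.39/kW/month electricity rate from the Globals page.

COOLING_FACTOR = 1.7          # 1 W of IT load plus 0.7 W of cooling
USD_PER_KW_MONTH = 67.39

def annual_cost_per_server(total_watts_per_rack, servers_per_rack):
    kw = total_watts_per_rack / 1000 * COOLING_FACTOR
    return kw * USD_PER_KW_MONTH * 12 / servers_per_rack

pe1950 = annual_cost_per_server(20260, 28)       # discrete rack servers
pe1950_virt = annual_cost_per_server(20260, 73)  # virtualized rack servers
s07_small = annual_cost_per_server(5400, 58)     # entry-level System z
ec_s18 = annual_cost_per_server(6300, 262)       # high-end System z

print(f"rack-server virtualization savings: {1 - pe1950_virt / pe1950:.1%}")
print(f"high-end vs entry-level System z:   {1 - ec_s18 / s07_small:.1%}")
```

The results land close to the figures quoted in the text: roughly 61-62% savings from virtualizing the rack servers, and roughly 74% lower cost per logical server on the high-end mainframe versus the entry-level configuration.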
4 Conclusions
4.1 Power and Cooling
Two conclusions leap from the analysis in this model, with regard to power and cooling:
• Virtualization of underutilized servers, on any platform, will yield power and cooling cost
savings that are substantial.
• Among virtualization solutions, the System z offers greater potential power and cooling
benefits than any other available platform.
It is important to consider these conclusions together, rather than in isolation from one another.
This model shows that mainframe consolidation tends to be constrained by available CPU
resources, a conclusion that mirrors the practical experience of Sine Nomine Associates and its
customers. However, in applications that are well suited to its architectural features, the System z
offers spectacular power savings, reducing power and cooling costs to a negligible portion of total
and recurring costs.
The model shows that the total cost of ownership (TCO) per logical server is very competitive
between the mainframe and the microcomputer configurations. While in the model the
mainframes do not have a TCO advantage, it should be noted that this model intentionally omits
(as out-of-scope) a number of "soft" factors that will affect the actual TCO at a customer site,
specifically:
• Managing virtual servers typically requires significantly less labor than managing an
equivalent number of physical servers. This is true on both microcomputer and
mainframe platforms, but the management tools on z/VM are more mature and plentiful
than the corresponding tools on the microcomputer platform. This is intended as no
disrespect to VMware, but simply reflects the fact that the mainframe environment has
had several decades to perfect virtualized management tools.
• Power distribution units (PDUs), computer room air conditioners (CRACs), and the
network and SAN infrastructure components all require periodic maintenance and
occasional repairs, none of which are free. The mainframe requires significantly less of
each of these resources, and thus one expects that the maintenance and repair costs would
decrease proportionally. Even in an existing facility with pre-existing power and cooling
systems, one could argue that running these systems at a lower-than-maximum load level
would prolong their useful life. It is certain that doing so provides greater capacity
headroom to handle temporary overloads from other sources.
Based on the results of this model, and combined with practical experience at dozens of customer
installations, Sine Nomine Associates offers the following recommendations for customers
seeking to maximize the capability of existing data centers that are constrained by power, cooling,
or space limitations:
1. Identify servers that are unitized for functional partitioning, security, or political
reasons, rather than because of their resource requirements, and consolidate them
onto virtualized platforms. Virtualization is absolutely the best way to recapture
underutilized hardware resources, and to reduce the number of physical servers that must
be purchased, installed, powered, cooled, and maintained. In addition, a properly
managed virtual data center reduces labor costs for deploying new servers and upgrading
existing servers, because of the ability to "clone" prebuilt virtual servers rapidly.
2. Use the System z mainframe as a part of the total virtualization environment. Many
applications are constrained by I/O rather than CPU resources, and many commodity
services such as DNS, directory services, file and print sharing, and email can be
consolidated onto a single mainframe running virtualized Linux with little impact on
clients, even if the clients are using Windows as their operating system. In addition,
business applications written in Java exhibit excellent portability to both Linux and
mainframe Linux platforms, and are excellent candidates for consolidation onto the
System z.
6. Manage and monitor virtualized resources at the hypervisor level rather than
within the guest operating system. An operating system designed to run on discrete
systems will not be aware of the capabilities of an underlying virtualization platform, and
will not allocate, schedule, and prioritize physical resources as well as the hypervisor can.
This argues for horizontal scaling, in which application functionality is divided among
virtual servers of a smaller size with the hypervisor sharing resources between them.
The model demonstrates how power and cooling, even more than physical space, limit the capacity of
data centers built with modern high-performance systems. Virtualization, regardless of the underlying
physical platform, provides significant reduction in power per server, allowing many more
servers to be deployed.
Among virtualized environments, the IBM System z offers by far the best power performance.
When deployed with workloads that are well suited to its architecture -- which includes many of
the most common types of servers found in enterprise data centers -- the System z reduces the
impact of energy costs.
Ultimately, a data center's best configuration will involve selective deployment of multiple
platforms, utilizing microcomputers for the most compute-intensive loads and mainframes for
loads that are network and disk constrained. This synergy of technical strengths can extend the
life and capabilities of legacy data centers, overcoming power, space, and cooling constraints to
support growth of the enterprise.
Global Constants
Physical dimensions
Infrastructure assumptions
Performance comparison assumptions
Occupancy assumptions
Costing assumptions
Electricity per kW per month (30 days): $67.39 US$/kW/month (computed from the US DoE data above)
Platform Parameters
(Columns throughout: Dell PowerEdge 1950 | Dell PowerEdge 1950 virtualized | Dell PowerEdge 1955 | Dell PowerEdge 1955 virtualized | IBM System z BC S07 Small | IBM System z BC S07 Expanded | IBM System z EC S18)
Category: 1U rack server | 1U rack server optimized for virtual servers | Blade server | Blade server optimized for virtual servers | Mainframe optimized for virtual servers | Mainframe optimized for virtual servers | Mainframe optimized for virtual servers
Processor: Intel Xeon 5140 dual-core 2.33 GHz (all Dell configurations); IBM Multi-Chip Module (MCM) “IFL” engine (all System z configurations)
Configured processors per server: 1 | 2 | 1 | 2 | 4 | 7 | 18
Configured memory per server (GB): 4 | 32 | 4 | 32 | 16 | 32 | 96
Size of chassis (rack units): 1 | 1 | 7 | 7 | 42 | 42 | 42
Maximum servers per chassis: 1 | 16 | 10 | 160 | 36000 | 36000 | 36000
Network ports per chassis (not allowing for redundancy): 1 | 1 | 2 | 2 | 1 | 1 | 1
Fibre Channel ports per chassis (not allowing for redundancy): 1 | 1 | 2 | 2 | 1 | 1 | 1
Power consumption at specified config (W): 670 | 670 | 4000 | 4000 | 5400 | 5400 | 6300
Est. hardware cost per chassis at specified config: $6,082 | $20,421 | $53,150 | $196,340 | $486,000 | $899,000 | $2,456,000
Est. hypervisor cost per chassis at specified config: $0 | $3,750 | $0 | $37,500 | $90,000 | $157,500 | $405,000
Computed Parameters
Maximum chassis per rack (theoretical, allowing only minimum possible overhead): 34 | 34 | 4 | 4 | 1 | 1 | 1
Theoretical servers per rack (not allowing for overhead): 34 | 544 | 40 | 640 | 36000 | 36000 | 36000
Fibre Channel switches per rack (redundancy included): 6 | 6 | 2 | 2 | 0 | 0 | 0
Ethernet switches per rack (redundancy included): 2 | 2 | 2 | 2 | 0 | 0 | 0
KVM switches per rack: 1 | 1 | 1 | 1 | 0 | 0 | 0
Total overhead units per rack: 14 | 14 | 10 | 10 | 0 | 0 | 0
Actual net chassis per rack: 28 | 28 | 4 | 4 | 1 | 1 | 1
Max servers per rack at best-case virtual density (where applicable): 28 | 448 | 40 | 640 | 36000 | 36000 | 36000
Est. infrastructure cost per rack for SAN, network, and KVM: $68,256 | $68,256 | $36,756 | $36,756 | $0 | $0 | $0
Estimated equivalent z9 MIPS per physical processor: 388 | 388 | 388 | 388 | 355 | 355 | 355
Estimated equivalent z9 MIPS per physical server: 388 | 776 | 388 | 776 | 1420 | 2485 | 6390
Estimated equivalent z9 MIPS per chassis: 388 | 776 | 3880 | 7760 | 1420 | 2485 | 6390
Estimated equivalent z9 MIPS per rack: 10864 | 21728 | 15520 | 31040 | 1420 | 2485 | 6390
Infrastructure power per rack (W): 1500 | 1500 | 1020 | 1020 | 0 | 0 | 0
Server power per rack (W): 18760 | 18760 | 16000 | 16000 | 5400 | 5400 | 6300
Total power per rack (W): 20260 | 20260 | 17020 | 17020 | 5400 | 5400 | 6300
Demanded power vs. available limit: 484% | 484% | 406% | 406% | 129% | 129% | 150%
Configuration Notes: For each platform, the maximum number of servers represents a reasonable theoretical value, which will be further limited by other calculations in the model. For the Dell configurations, the processor was selected for across-the-board availability, for easier comparison.
Pricing Notes:
• Dell PowerEdge 1950 (both configurations): priced online at dell.com.
• Dell PowerEdge 1955 (both configurations): priced at dell.com and by phone to Dell pre-sales.
• IBM System z (all configurations): IBM does not publish prices on mainframes; the pricing is based on an estimate-driven model (see Globals for constants). Some values were obtained from SNA case studies; details are not publishable due to NDA.
Additional configuration notes:
• All Dell configurations: QLogic 2462 dual-channel FC HBA for SAN access; dual-channel Ethernet included in base config; minimum onboard DASD (36 GB, no RAID) selected.
• Dell PowerEdge 1950 (both configurations): added redundant power supply; Dell remote management interface removed.
• Dell PowerEdge 1955 (both configurations): doubly-redundant (2+2) power supplies standard; Dell remote management interface not used; enclosure price estimated at $5,000 and would be somewhat lower if purchased in quantity; price reflects 10 identical blades, but no quantity discount is reflected.
Workload Characteristics
(Columns: Static web server | Database server | DNS or LDAP server | e-Commerce web server | File server | Business logic server | Management and monitoring server | E-mail server)
Characteristics: Net and DASD I/O | DASD I/O and moderate CPU | Low utilization | Net and DASD I/O, crypto | Net and DASD I/O | CPU intensive, moderate I/O | Network and DASD I/O with moderate CPU | Net and DASD I/O
Frequency of occurrence (unitless): 30 | 4 | 8 | 4 | 40 | 4 | 4 | 20
Application CPU load factor (unitless): 0.05 | 0.25 | 0.03 | 0.1 | 0.02 | 0.3 | 0.05 | 0.03
Working storage set (megabytes): 256 | 1536 | 128 | 512 | 128 | 1024 | 256 | 256
Relative DASD load factor (unitless): 0.5 | 0.9 | 0.05 | 0.2 | 0.9 | 0.4 | 0.6 | 0.8
Relative crypto load factor (unitless): 0.02 | 0.02 | 0 | 0.2 | 0.1 | 0.02 | 0.1 | 0.05
Relative bandwidth factor (unitless): 1 | 0.75 | 0.1 | 0.5 | 1 | 0.6 | 0.75 | 0.85
Kernel CPU load factor for I/O (unitless): 0.075 | 0.0825 | 0.0075 | 0.035 | 0.095 | 0.05 | 0.0675 | 0.0825
Ratio of net I/O that is server-to-server: 0 | 0.9 | 0.1 | 0.5 | 0 | 0.98 | 0.98 | 0.02
The aggregated workload values are reported in Table 1 on page 14.
Scenario Results
(Columns: Dell PowerEdge 1950 | Dell PowerEdge 1950 virtualized | Dell PowerEdge 1955 | Dell PowerEdge 1955 virtualized | IBM System z BC S07 Small | IBM System z BC S07 Expanded | IBM System z EC S18)
Platform Summary
• Actual net chassis per rack: 28 | 28 | 4 | 4 | 1 | 1 | 1
• Max servers per rack at best-case virtual density (where applicable): 28 | 448 | 40 | 640 | 36000 | 36000 | 36000
• Estimated equivalent z9 MIPS per chassis: 388 | 776 | 3880 | 7760 | 1420 | 2485 | 6390
• Total power per rack (W): 20260 | 20260 | 17020 | 17020 | 5400 | 5400 | 6300
• Demanded power vs. available limit: 484% | 484% | 406% | 406% | 129% | 129% | 150%
• Allowable racks in data center based on design limit: 49 | 49 | 58 | 58 | 184 | 184 | 158
• Allowable racks in data center, limited by “notional limit”: 49 | 49 | 58 | 58 | 8 | 8 | 8
Platform Adjustments
• RAM overcommit factor: 1 | 1 | 1 | 1 | 2 | 2 | 2
• Offloaded DASD support factor: 0.5 | 0.5 | 0.5 | 0.5 | 0.95 | 0.95 | 0.95
• Offloaded crypto support factor: 0 | 0 | 0 | 0 | 0.95 | 0.95 | 0.95
• Offloaded net support factor: 0 | 0 | 0 | 0 | 0.85 | 0.85 | 0.85
• CPU I/O handling offload factor: 0 | 0 | 0 | 0 | 0.9 | 0.9 | 0.9
• % of net I/O that is inside chassis: 0% | 12% | 12% | 13% | 13% | 13% | 13%
Workload Calculations
• Logical servers per chassis based on CPU load: 5.27 | 10.55 | 52.73 | 105.46 | 19.30 | 33.77 | 86.85
• Logical servers per chassis based on RAM capacity: 14.48 | 115.81 | 14.48 | 115.81 | 57.90 | 115.81 | 347.43
• Logical servers per chassis based on DASD bandwidth: 1.50 | 3.01 | 15.03 | 30.05 | 5.50 | 9.62 | 24.75
• Logical servers per chassis based on crypto load: 16.37 | 32.73 | 163.65 | 327.31 | 59.89 | 104.81 | 269.52
• Logical servers per chassis based on network bandwidth: 1.16 | 2.32 | 11.60 | 23.20 | 28.30 | 49.52 | 127.35
• Logical servers per rack, accounting for all limitations: 28 | 73 | 40 | 106 | 58 | 101 | 262
• Logical servers in data center, accounting for all limitations: 1372 | 3577 | 2320 | 6148 | 464 | 808 | 2096
Cost Calculations
• Est. hardware cost per chassis at specified config: $6,082 | $20,421 | $53,150 | $196,340 | $486,000 | $899,000 | $2,456,000
• Est. hypervisor cost per chassis at specified config: $0 | $3,750 | $0 | $37,500 | $90,000 | $157,500 | $405,000
• Est. infrastructure cost per rack: $68,256 | $68,256 | $36,756 | $36,756 | $0 | $0 | $0
• Acquisition cost per rack: $238,552 | $745,044 | $249,356 | $972,116 | $576,000 | $1,056,500 | $2,861,000
• Total acquisition cost: $11,689,048 | $36,507,156 | $14,462,648 | $56,382,728 | $4,608,000 | $8,452,000 | $22,888,000
• First-year TCO: $12,952,244 | $37,781,852 | $15,738,783 | $57,670,363 | $4,678,535 | $8,522,535 | $22,968,240
• Three-year TCO: $15,478,635 | $40,331,243 | $18,291,053 | $60,245,633 | $4,819,606 | $8,663,606 | $23,128,720
• Five-year TCO: $18,005,026 | $42,880,634 | $20,843,323 | $62,820,903 | $4,960,677 | $8,804,677 | $23,289,199
Consider first the workload calculation for CPU utilization, which applies to all the server types.
The utilization number for each workload type is multiplied by the frequency of occurrence of
that workload, yielding a number that is in itself not meaningful for comparison but rather is an
intermediate result. These intermediate values are summed and divided by the sum of the relative
occurrence values, which normalizes the CPU utilization back to a unit-weighted average
representing one unit of the aggregated workload. Thus, it does not matter what absolute numbers
are used for the frequency of occurrence; only their proportion to one another is significant. This
process is repeated for each of the system resource calculations (RAM, I/O bandwidth, network
bandwidth, etc.).
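The normalization just described can be reproduced from the Workload Characteristics table. For the CPU resource it yields a weighted load of about 5% of the reference machine, or roughly 20 equivalent MIPS at the 388 MIPS per Xeon 5140 used in the Platforms page, matching the aggregated-workload table. The sketch below is illustrative, not the actual spreadsheet.

```python
# Reproducing the frequency-weighted CPU aggregation described above, using
# the frequencies and CPU load factors from the Workload Characteristics
# table (static web, database, DNS/LDAP, e-commerce web, file, business
# logic, management/monitoring, e-mail).

freq = [30, 4, 8, 4, 40, 4, 4, 20]
cpu_load = [0.05, 0.25, 0.03, 0.10, 0.02, 0.30, 0.05, 0.03]

weighted = sum(f * c for f, c in zip(freq, cpu_load)) / sum(freq)
print(round(weighted, 3))        # ~0.052, the 5% weighted CPU load
print(round(weighted * 388, 1))  # ~20.2 equivalent MIPS per logical server

# Only the proportions matter: scaling every frequency by the same factor
# leaves the result unchanged, as the text notes.
scaled = sum(10 * f * c for f, c in zip(freq, cpu_load)) / sum(10 * f for f in freq)
assert abs(scaled - weighted) < 1e-12
```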
The Platforms page shows the System z Model S07 server with 4 IFL processors and 16
gigabytes of main memory (RAM). These are constants arising from the choice of configuration.
This yields a total of 1420 estimated MIPS for the mainframe as a whole. Since the mainframes
considered here occupy essentially one rack location each, the per-chassis and per-rack numbers
are equivalent for this platform.
Note that IBM does not disclose list prices for mainframes, so the acquisition costs are partially
based on amounts paid by Sine Nomine Associates customers. Some mainframe hardware,
notably the IFL engines, is priced publicly by IBM, and in these cases those public list prices are
used.
The Platforms page also computes a ratio of power demanded per chassis versus nominal power
available per chassis, the latter being computed from the data center's watts-per-square-foot value
from the Globals page. In this model, it happens that all of the platforms exceed the nominal
power per rack location, but the amount varies considerably. The mainframe's ratio, expressed as
a percentage, is 129%, meaning that if mainframe racks were to fill the data center at maximum
physical density, the data center would be overcommitted on power by 29%. The ratio computed
here is revisited on the Scenarios page, where it is applied as a limiting factor to limit the
maximum number of racks (for each platform) that can be installed in the data center.
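The ratios in the Platforms table can be reproduced with an available supply of roughly 4187 W per rack location, a value we infer to be consistent with the "just under 4200 watts" figure; the sketch below is a reconstruction, not the actual spreadsheet.

```python
# Reproducing the demanded-vs-available power ratios from the Platforms
# page. The ~4187.5 W available per rack location is an inferred value,
# consistent with the "just under 4200 W" supply discussed in the text.

AVAILABLE_W_PER_RACK = 125 * 33.5  # W/sq-ft times inferred sq ft per rack location

total_power_per_rack = {
    "PowerEdge 1950 (both)": 20260,
    "PowerEdge 1955 (both)": 17020,
    "System z BC S07":       5400,
    "System z EC S18":       6300,
}

for name, watts in total_power_per_rack.items():
    print(f"{name}: {watts / AVAILABLE_W_PER_RACK:.0%}")
# Matches the table: 484%, 406%, 129%, and 150% respectively.
```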
In the case of the mainframe, Sine Nomine Associates has applied an arbitrarily chosen "notional
limit" of 8 mainframes total per data center, based on our experience with actual customers. Thus,
it is the notional limit that constrains the number of mainframes per data center, not the power
limit. For other platforms where no notional limit is imposed, the power allocation eventually
constrains the number of racks. For the configurations considered here, there are no cases where
physical space constrains the number of servers or racks in the entire data center.
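The interplay of the three constraints (power, floor space, and the notional limit) amounts to taking the minimum. A minimal sketch, with hypothetical figures:

```python
import math

# Illustrative sketch: the installable rack count is the minimum of the
# power-derived limit, the floor-space limit, and any notional limit.
# All numbers below are hypothetical.

def max_racks(total_power_watts, watts_per_rack, floor_locations,
              notional_limit=None):
    by_power = math.floor(total_power_watts / watts_per_rack)
    limit = min(by_power, floor_locations)
    if notional_limit is not None:
        limit = min(limit, notional_limit)
    return limit

# Mainframe case: power would permit 77 racks and the floor 100,
# but the notional limit of 8 governs.
mainframe_racks = max_racks(300_000, 3870, 100, notional_limit=8)
# Other platforms: with no notional limit, power governs.
other_racks = max_racks(300_000, 3870, 100)
```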
Moving on to the Scenarios page, the Platform Summary numbers are simply mirrored from the
Globals and Platforms pages, for convenience. Then a set of constraint-based computations
determines how many logical servers per physical chassis could be deployed, based on each of
the resource types (CPU, RAM, I/O, etc.) defined in the aggregated workload. Each constraint is
calculated separately; the final constraint is whichever resource is the worst case for a
particular platform or configuration. As expected from
customer experiences known to Sine Nomine Associates, the mainframe platforms are typically
constrained by CPU resources rather than I/O, memory, or architectural limits.
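The worst-case selection can be sketched as follows. The capacity and demand figures are hypothetical (the 1420 MIPS total from the Platforms page is reused, but the per-server demand is illustrative only):

```python
import math

# Illustrative sketch of the Scenarios-page constraint calculation:
# for each resource, chassis capacity divided by per-server demand gives a
# candidate server count; the binding constraint is the smallest of these.

def servers_per_chassis(capacity, demand):
    """Return (logical server count, name of the binding resource)."""
    per_resource = {r: capacity[r] / demand[r] for r in demand}
    binding = min(per_resource, key=per_resource.get)
    return math.floor(per_resource[binding]), binding

capacity = {"cpu_mips": 1420, "ram_gb": 16}   # chassis totals (MIPS from text)
demand = {"cpu_mips": 35.0, "ram_gb": 0.3}    # hypothetical per-server demand
count, binding = servers_per_chassis(capacity, demand)
# Here CPU binds (1420/35 ≈ 40.6) before RAM (16/0.3 ≈ 53.3), consistent
# with the CPU-constrained mainframe behavior described in the text.
```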
Rather than using these numbers directly, it is more accurate to apply a set of platform-specific
adjustments to each constraint value, to reflect the qualitative differences between architectures.
For instance, the microcomputers benefit somewhat from offloading of disk management to the
storage area network (SAN), because the main CPU does not have to manage low-level device
interfaces for the drive but rather can treat the SAN as a virtualized resource. The mainframe,
using the same SAN technology, gains this benefit also, but in addition benefits from the fact that
the SAN host bus adapter is connected to an intelligent channel controller, providing an even
higher level of I/O offloading. Linux device drivers in the mainframe environment are built to
leverage this capability.
The disk I/O offloading also is reflected in an adjustment to memory overcommit limits, because
the mainframe offers a feature called VDISK that allows Linux to see "swap space" that operates
at nearly RAM speed, whereas on the microcomputer platform any swapping operation actually
involves disk access. Both Sine Nomine Associates' experience and published sources show that
RAM on the mainframe can be overcommitted at a higher level than on a microcomputer. As it
turns out, however, the CPU constraint on the mainframe still is the limiting factor, even with this
adjustment applied.
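Mechanically, each platform-specific adjustment is a multiplier applied to a raw per-resource constraint before the worst case is taken. The factors and raw counts below are hypothetical, chosen only to show why a higher RAM overcommit factor (reflecting VDISK) and a disk I/O offload factor still leave CPU as the binding constraint:

```python
# Illustrative sketch of the platform-specific adjustment step.
# Raw constraints and factors are hypothetical, not the model's values.

def adjust(raw_constraints, factors):
    """Scale each raw per-resource server count by its platform factor."""
    return {r: raw_constraints[r] * factors.get(r, 1.0) for r in raw_constraints}

mainframe_raw = {"cpu": 40, "ram": 53, "disk_io": 60}
# e.g. VDISK permits higher RAM overcommit; channel controllers offload disk I/O
mainframe_factors = {"ram": 1.5, "disk_io": 1.3}
adjusted = adjust(mainframe_raw, mainframe_factors)
# CPU is unadjusted here and remains the smallest, hence still binding.
```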
Similar adjustments are applied to network I/O based on the intelligence of the subsystem on the
different platforms. This is calculated as a reduction in kernel processing time based on the fact
that the Linux operating system delegates much of what would be low-level device handling to
the mainframe's intelligent channel controllers.
For cryptographic processing, the platform-specific adjustment applies all of the cryptographic
workload to the main CPU on the microcomputers, but removes a fixed percentage of this
workload from the main CPU on the mainframe, because the mainframe offers multiple
cryptographic coprocessors as a standard feature. Some of the crypto workload remains on the
main CPU, however, reflecting the small but finite amount of CPU required to supervise the
cryptographic engines.
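This particular adjustment reduces to subtracting a fixed offload fraction. A minimal sketch, with a hypothetical crypto share and offload percentage:

```python
# Illustrative sketch of the cryptographic-workload adjustment.
# The 8% crypto share and 90% offload fraction are hypothetical.

def crypto_cpu_share(crypto_load, offload_fraction):
    """Portion of the crypto workload remaining on the main CPU."""
    return crypto_load * (1.0 - offload_fraction)

micro = crypto_cpu_share(0.08, 0.0)       # microcomputer: all on the main CPU
mainframe = crypto_cpu_share(0.08, 0.90)  # coprocessors absorb most of it;
                                          # a small supervisory share remains
```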
The net effect of the platform-specific adjustments is to model the fact that, while the
mainframe has a smaller total CPU cycle count than a rack full of microcomputers, the
mainframe's CPU is used more efficiently: proportionally more of it is available to
application-level code rather than being consumed by low-level operating system kernel
functions.
Based on the adjusted constraints for each platform, the constraint calculations are re-executed to
yield the final list of constrained maximum logical server counts. It is no surprise that even here,
the mainframe's CPU is its worst-case constraint and limits the final server count for that
platform. For the various microcomputer platforms, other constraints dominate in some cases.
Note that constraint selection is applied neutrally to all platforms: the binding resource is
not required to be the same across configurations, but is taken as the worst case for each
individual configuration.
Once the actual number of logical servers is known per chassis (and thereby per rack, since the
number of chassis per rack is already known), a final TCO calculation can be performed to
determine the total TCO for the data center, and the total TCO per logical server.
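The final roll-up is straightforward arithmetic. The sketch below uses hypothetical costs (the study's actual TCO inputs are not reproduced here), reusing the notional 8-rack mainframe limit for illustration:

```python
# Illustrative sketch of the final TCO roll-up. All cost figures are
# hypothetical; the 8 racks and 40 servers/rack echo earlier examples.

def tco_summary(racks, servers_per_rack, tco_per_rack, fixed_site_cost):
    """Return (total data-center TCO, TCO per logical server)."""
    total_servers = racks * servers_per_rack
    total_tco = racks * tco_per_rack + fixed_site_cost
    return total_tco, total_tco / total_servers

total, per_server = tco_summary(8, 40, 1_200_000, 400_000)
```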
Bibliography
Listing of a source in this Bibliography does not imply that the author of that document endorses
or supports the conclusions of this study.
Kenneth Brill, "2005-2010 Heat Density Trends in Data Processing, Computer Systems, and
Telecommunications Equipment"
Turner and Seader, "Dollars per kW plus Dollars per Square Foot Are a Better Data Center Cost
Model than Dollars per Square Foot Alone"
Turner, Seader, and Brill, "Tier Classifications Define Site Infrastructure Performance"
Rob Snevely, "Enterprise Data Center Design and Methodology" (Sun Microsystems)
U.S. Department of Energy, Energy Information Administration, national average for commercial
electricity for the 12 months ending October 2006,
http://www.eia.doe.gov/cneaf/electricity/epm/table5_3.html (data retrieved April 4, 2007)