White Paper
Overview
• Introduction
• Confluent’s approach
• Conclusion
• About Confluent
Many vendors make claims about their software reducing total cost of ownership (TCO) or improving ROI. But how are
these claims formulated? What is the process for measuring them? What are the assumptions? How robust are the estimates?
This white paper answers these questions by outlining the business value assessment model we use, our approach,
and a customer example, and by sharing lessons learned along the way. Our intention is to be as open and transparent as
possible in our discovery, the information we use, and the assumptions we make. We strive to gain consensus with our
customers when calculating business value, because a wildly exaggerated and groundless TCO or ROI is pointless for everyone
involved. Worse, it can set dangerously flawed expectations.
The intended audience for this white paper includes all the stakeholders involved in making decisions around implementing
and operating Kafka in an organization. This broadly includes the following three groups:
• Hands-on technology teams – This group is often tasked with creating a business case and includes DevOps, data engineers, architects, and InfoSec.
• Business-tech teams – This group involves the economic buyer, including enterprise architects, VPs of Engineering, BU heads, CTOs, and CIOs.
• Pure business teams – This group typically reviews business cases and ensures appropriate prioritization; it includes LOB heads, product owners, business decision-makers, buyers, procurement, and potentially the broader C-suite.
A focus on TCO, ROI, and overall cost effectiveness is particularly important in the current climate amid COVID-19. The
global public safety measures and economic downturn have simultaneously increased the importance of focusing on the
digital side of the business while increasing pressure on the budgets available to deliver those digital services.
As an example, a TCO analysis might outline the reduced setup and operating costs of a fully managed service. While this
represents business value, many organizations choose a fully managed service for reasons beyond cost take-out alone.
Additional reasons include accelerated time to market, increased developer velocity, increased business agility, and
reduced overall risk – and, perhaps most importantly, freeing key resources from the operational burden of implementing and
managing a solution.
Deploying Kafka presents challenges that aren’t the core problem most companies are trying to solve. It’s not where they
want their best people focused. With a fully managed service, the customer can shift key resources to higher-value tasks. This
refocus results in opportunity optimization that isn’t captured in TCO alone.
In order to capture the full business value of Confluent Cloud, we look at three value buckets:
• Deploy Kafka at scale within one week of starting with Confluent
• Operate more efficiently with lower infra cost, maintenance, and downtime risk
• Deliver higher returns with your project by launching faster and reducing operational burden
2. Our approach
Our approach to measuring business value includes a discovery exercise in which we ask questions around each of the value
buckets above. For example:
Speed to market
• How long would it take to develop the self-managed platform (including hiring / ramping resources)?
• What is the financial benefit of getting to market sooner?
• What are the planned management costs such as systems integrators, training, or other services?
• What are the indirect costs such as downtime, performance degradations, and security risks?
Return on investment
• What are the details around the business use case which the Kafka project underpins?
We often get asked to list discovery questions in advance, but this approach can skirt key value areas. While many details
above are straightforward, some areas are more nebulous, or have an indirect line to value. Calculating value requires
discussion and sometimes debate. Good discovery is a skill. It requires being inquisitive, asking the right questions, drilling into
relevant areas, backing out of others, and gaining overall consensus.
In our approach we aim to be as empirical as possible. In a sense, comparing Confluent Cloud, a fully managed service, with
a self-managed service or other option is similar to a scientific thought experiment, one that is imagined as a sequence of
events, then constructed and explained through narrative form.
1. The current state (e.g., self-managed Kafka) represents the baseline, or “control group.”
2. The target state (Confluent Cloud) represents the “experimental group,” which includes the “independent variable” –
Confluent Cloud. We want to compare the two scenarios (baseline versus target), like-for-like, except for
that single variable.
3. We assess value as the difference between the baseline and target state in terms of the three value buckets listed above
– these are our hypotheses.
4. We also assess soft, or intangible, benefits – i.e., elements of value which can be difficult to quantify. We also model
various scenarios, sensitivities, and risks which might impact the overall business case.
5. Finally, we aim to use proof points to support the value assessment assumptions and estimates. In scientific terms,
we aim to corroborate our findings with further evidence.
Of course, our overall approach cannot be fully scientific. This is an exercise in forecasting, which relies on assumptions and
estimates. There will be many potentially confounding factors outside of our control in real-world scenarios. We’re using
inductive, not deductive logic, which means we need to accept a business case for what it is – a bit of guesswork, until we
can prove the case post-implementation.
Because of this, we insist the customer own the business case, and we establish this ground rule early on in any engagement.
We can help build and execute the model, and provide guidance and advice, but ultimately
all cost categories, estimates, and assumptions should be provided by, or ratified by, the customer.
In the next section, we provide a real-world example of how this plays out.
Deliver higher returns with your project by launching faster and reducing operational burden
Reducing the time it takes to bring something to market. With self-managed Kafka, you need to handle the manual sizing,
provisioning, expansion, and maintenance of a Kafka cluster. With Confluent Cloud, we do all the work for you: clusters
are provisioned instantly, maintenance is seamlessly managed, and you can start streaming data the day you
sign up.
Increasing developer productivity and agility. With a fully managed service, you don’t have to spend 6-9 months hiring
people who know Kafka just to manage infrastructure, or bog down your developers with Kafka maintenance. As a result,
you avoid the time and expense of hiring and ramping up new people, and the slowdowns caused by moving
developers off of critical projects. Of course, the project doesn’t end when the product launches. There will always be
ongoing bug fixes, improvements, and feature enhancements, which typically require developers to invest more time
in developing new tools, new integrations, and so on. Confluent Cloud offers a suite of fully managed tools and
connectors so your best people remain focused on the critical projects and apps that drive competitive differentiation
and revenue – not on maintaining Kafka.
[Timeline: with self-managed Kafka, project kickoff is followed by 6-9 months to hire Kafka resources, 3 months to ramp, and 9-12 months to build the production-grade Kafka platform and develop the application – go to market in 2 years. With Confluent Cloud – go to market in 6 months.]
With Confluent Cloud, the two-year project can potentially be reduced to six months. You don’t need to hire a full team,
and you can get your Kafka clusters up and running in a matter of weeks. You can then focus all your energy on building
applications and delivering value from your project faster.
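The financial benefit of launching sooner can be sketched with simple arithmetic. A minimal example in Python – the function name and all figures are hypothetical illustrations, not numbers from this engagement:

```python
# Sketch: extra benefit captured by launching earlier with a managed service.
# Assumes the project delivers a steady monthly benefit once live; all
# figures are hypothetical placeholders.
def time_to_market_benefit(monthly_benefit, months_self_managed, months_managed):
    """Value of the months gained by launching on the faster timeline."""
    months_saved = months_self_managed - months_managed
    return months_saved * monthly_benefit

# A project worth $50K/month, live in 6 months instead of 24:
extra = time_to_market_benefit(50_000, 24, 6)  # 18 months earlier -> $900,000
```

In practice, the monthly benefit itself is one of the discovery questions above, and is often the most debated input.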
Operate more efficiently with lower infra cost, maintenance, and downtime risk
We first try to estimate the cost breakdown of a customer’s operating expenses. This typically includes three main components:
1. Infrastructure – this can be on premises or cloud provider bills.
2. Operational costs – in the form of full-time equivalent (FTE), including engineers maintaining and building out platform
capabilities for Kafka, e.g., SREs and DevOps.
3. Support and services – this includes third-party spend to cover things like professional services, training, and support plans.
[Chart: annual cost breakdown for the self-managed platform – infrastructure (compute, storage, and network), operational costs, outage costs (not counted), and support costs – with a subtotal for the comparison.]
We then look at cost take-out opportunities and ratify these with the customer.
[Chart: annual cost take-out opportunities with Confluent Cloud, by category, versus the self-managed baseline.]
In this example, we see that Confluent Cloud allows us to completely remove the infrastructure and the support and services
costs of the self-managed solution, and to reduce the operations costs by more than 60%. The overall savings amount to
$1,177,645, or 63%, over one year.
When we model the above savings over three years, we get $3.5M+ (63%) in savings.
[Chart: three-year cost comparison – $5,572,936 for the self-managed solution versus Confluent Cloud, for savings of more than $3.5M (63%).]
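The savings arithmetic itself is straightforward. A sketch in Python – the category figures below are round placeholder numbers, not the customer’s actual costs; only the structure of the calculation mirrors the example:

```python
# Sketch of the cost take-out calculation: sum each scenario's cost
# categories, then compare. All figures are illustrative placeholders.
def tco_savings(baseline_costs, target_costs):
    """Return (absolute savings, savings as a fraction of the baseline)."""
    baseline = sum(baseline_costs.values())
    target = sum(target_costs.values())
    saved = baseline - target
    return saved, saved / baseline

self_managed = {"infrastructure": 600_000, "operations": 400_000, "support": 100_000}
confluent_cloud = {"subscription": 290_000, "operations": 150_000}
saved, pct = tco_savings(self_managed, confluent_cloud)
print(f"${saved:,} ({pct:.0%})")  # $660,000 (60%)
```

A multi-year figure follows by running the same comparison per year and summing, which is how the three-year number above is built up.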
In this section we also look at risk costs, as identified by the following three categories:
1. Downtime – It takes significant investment and expertise to maintain a reliable platform with high availability. Confluent
Cloud comes with a 99.5% SLA for Basic clusters and a 99.95% uptime SLA for Standard and Dedicated clusters built in,
with no disruptions for upgrades and planned maintenance.
2. Performance degradation – If your organization experiences any downtime, it may result in significant performance
degradation, which can lead to prolonged periods of lost productivity. Confluent’s 24/7 expert-led support helps prevent
issues and enables fast recovery.
3. Security infringement – Your data is obviously important. It can be costly and risky to manage it all by yourself. Confluent
Cloud comes with enterprise-grade security and compliance to make sure your data is always safe.
The impact of risks such as downtime, performance degradation, and security infringements should not be underestimated.
Various studies have estimated that an hour of downtime costs an organization $100K on average. And a recent study by IBM
found that the average data breach costs an organization $3 million to $4 million, from detection and resolution costs, fines
and reparations, and reputational harm to the business.
We can model these risk costs; however, many organizations choose not to quantify them directly. In general, we take
direction from the customer in terms of what they’re comfortable modeling.
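Where a customer does want to quantify downtime, one simple sketch converts an availability SLA into a worst-case annual cost. The $100K-per-hour figure is the industry average cited above; the SLA tiers match the 99.5% and 99.95% levels mentioned earlier, and the function is an illustration, not a Confluent formula:

```python
# Sketch: worst-case annual downtime cost implied by an availability SLA,
# assuming availability exactly meets the SLA floor. The $100K/hour figure
# is the industry-average downtime cost cited in the text.
HOURS_PER_YEAR = 24 * 365

def expected_downtime_cost(sla, cost_per_hour=100_000):
    """Annual downtime cost if availability is exactly at the SLA level."""
    downtime_hours = (1 - sla) * HOURS_PER_YEAR
    return downtime_hours * cost_per_hour

basic = expected_downtime_cost(0.995)       # ~43.8 hours/year -> ~$4.38M
dedicated = expected_downtime_cost(0.9995)  # ~4.4 hours/year  -> ~$438K
```

Note the sketch cuts both ways: an SLA is a floor, so actual downtime is usually lower, while a single severe incident can cost far more than the hourly average.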
This is also the case when we review the overall ROI. We work with the customer to model business value or cost
effectiveness for their organization.
In addition, there is a strong argument to continue the business case exercise in the form of ongoing benefits realization, or
to complete a “sanity check” part way through a program. Studies by McKinsey, Everest Group, and IDC suggest three out of
four digital transformation projects fail to deliver on their stated business value. By incrementally assessing value and costs, a
project can pause, or pivot, if it is not hitting business goals.
As a vendor that mostly relies on a subscription model, we encourage ongoing scrutiny of costs and value. Why? It makes
good business sense. Gartner found that nearly two-thirds of tech buyers would purchase more from existing providers when
the value from their investments is clearly demonstrated.
For the TCO, at a minimum, we require information on development cycles and timelines, infrastructure costs (compute, storage,
and network), developer FTEs and DevOps or operational FTEs (number and fully loaded costs), and support and services
details. Additional information may include the positive impact of business agility and the negative impacts (associated
costs) of security infringements, performance degradation, and downtime.
In terms of accuracy, cash flows can be “NPV’d,” meaning discounted to “net present value.” This adjusts dollar amounts
so they can be compared over time (i.e., cash today is worth more than cash tomorrow). NPV adjustments are more
applicable to five-year business cases than to three-year business cases.
Since we mostly run three-year models – and accept that accuracy is often within 10% – we tend to simplify our models and
exclude NPV adjustments, although we sometimes apply them for financial services customers.
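For readers who do want to apply the adjustment, a minimal sketch of the NPV arithmetic – the figures are illustrative, not from a customer engagement:

```python
# Minimal NPV sketch: discount each year's cash flow back to today.
# Year 0 is undiscounted; `rate` is the annual discount rate.
def npv(rate, cashflows):
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# $1M of savings per year for three years, discounted at 10%, is worth
# about $2.49M today rather than the nominal $3M.
three_year = npv(0.10, [0, 1_000_000, 1_000_000, 1_000_000])
```

At a 0% rate, NPV reduces to a simple sum, which is effectively what a non-discounted three-year model assumes.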
We also acknowledge that customer needs vary. We tend to weigh a business case depending on what’s important to a
customer. Some will have higher-level strategic priorities around cost savings over risk costs, for example.
Credibility
We aim to maximize the credibility of the business case in the following three ways:
1. Being conservative, aligning with a theme of under promising and over delivering.
3. Clearly stating and ratifying assumptions and estimates, with the option to easily adjust these at any time in the process.
Initially, customers tend to be shy about sharing actual numbers. We recognize that we first need to earn their trust and
sometimes start by using placeholder numbers (estimates), in an effort to illustrate what the business case might look like. If
the customer likes the insight we provide, they’re often more inclined to share actual figures.
In addition to charts, a story requires a good narrative, which helps to explain the overall value proposition when discussing
and presenting the business case – especially to a nontechnical audience. As illustrated in the image below from
a Forbes article on data storytelling:
• Visuals (charts) help engage people
• Data helps convince people
• Narratives help take people along the journey
[Diagram: data storytelling as the overlap of data, narrative, and visuals – narrative and data explain, visuals and data enlighten, and narrative and visuals engage.]
We know decisions are often based on emotion as well as logic. So, emotion also needs to be factored into the narrative.
An example of indirect value leans on the analogy of paying for insurance. Much like insurance, when you invest in security,
you hope to never have to really use it. We can clearly articulate the costs associated with security, but what is the value?
Is it cost effective? We only really know if something goes wrong. Otherwise, how can we quantify the business value
of a security infringement not happening?
The process also includes completing discovery and agreeing on the assumptions and estimates. The collaboration involves
communicating and sharing information, through which the vendor comes to understand the customer’s challenges and
opportunities, and the customer comes to fully understand the vendor’s value proposition.
Helping our customers complete ROI and TCO models, and demonstrating how we deliver business value, helps them
make more informed decisions and strengthens our long-term partnership. If you’d like to learn more or discuss our approach
to business value and event streaming assessments, please contact bvc@confluent.io or visit http://cnfl.io/kafka-tco.
About Confluent
Confluent, founded by the original creators of Apache Kafka®, pioneered the enterprise-ready event streaming platform.
With Confluent, organizations benefit from the first event streaming platform built for the enterprise with the ease of
use, scalability, security and flexibility required by the most discerning global companies to run their business in real time.
Companies leading their respective industries have realized success with this new platform paradigm to transform their
architectures to streaming from batch processing, spanning on-premises and multi-cloud environments. Confluent is
headquartered in Mountain View and London with offices globally. To learn more, please visit www.confluent.io. Download
Confluent Platform and Confluent Cloud at:
www.confluent.io/download