

From DevOps to DataOps:

Building real-time data pipelines
to power customer recommendations
and personalized content

The Opportunity
Sky is a global entertainment and communications provider
serving more than 22 million households across the United
Kingdom, Italy, and Germany.

At a time when traditional viewing habits have been upended by online content
providers and “over-the-top” access to content of all kinds, Sky wanted to
ensure that its extensive content catalogs, which are localized across European
markets, were as visible and accessible to its users as those of its
flashiest competitors.

Sky had plenty of data on viewer habits flooding in, and the technology team
was eager to supplement their traditional data warehousing and reporting
process by rebuilding data ownership at the platform level. In Sky’s case, that
meant real-time data analysis tools that promised faster feedback loops,
processing, and machine learning with powerful personalization. The big idea:
to make better use of all the available data to improve its customer engagement
across all its content delivery platforms.

In order for data-driven personalization to become a core component of the Sky
technology platform, the data team required strategic buy-in from stakeholders
across the business. The engineering team was ahead of the curve but
well-positioned to alleviate pain points in business analytics and improve
customer retention with real-time personalization. With a plan in place, it was
time to evolve beyond historically slow batch reporting and remove the barriers
between the people who had the data and the people who could use it to
improve customer engagement.

Building real-time data pipelines to power customer recommendations and content

Unique requirements
Sky needed to move forward from its legacy database-driven integration, in
which it extracted reports from data warehouses and relied on third-party
partners to manage data and generate reports. The company’s IT leaders looked
to Mesosphere to help them navigate a digital transformation toward hybrid
cloud, an evolution that would help it ship more apps faster and capture
actionable data insights using real-time data and machine learning/AI. By
modernizing its enterprise architecture, it could save money through automated
operations and higher resource utilization, all while gaining the flexibility
of a hybrid cloud deployment.

Sky also required a cloud provider-agnostic solution so that it could rapidly
expand into new international territories. Depending on the region, Sky may be
required to follow certain data locality laws or want to optimize for a
particular regional datacenter. With DC/OS, the team has complete freedom to
select the best provider for the circumstance without sacrificing precious
engineering resources, as they can use DC/OS to create a datacenter-cloud as a
single computer with workload pooling and density optimization, complete
lifecycle management of platform services, and portability between clouds,
whether those clouds are enabled by Amazon Web Services, Google, or any other
provider.

Ultimately, Sky was attracted to Mesosphere DC/OS for its ability to enable
best-of-breed open source technologies and its commitment to sustainable
innovation. It was important to Sky that they avoid getting locked into one
ecosystem or vendor.

The Solution
Sky’s new state-of-the-art architecture and platform for real-time data
processing and machine learning adopts an open architecture to support multiple
applications. One of the primary reasons Sky turned to DC/OS was to simplify
and automate operations of Apache Spark by lining up compute resources close to
the data sources. Additionally, Mesosphere DC/OS helps accelerate development
with its catalog of 100+ platform services including container orchestration,
data services, and machine learning tools, any of which can be added to the
stack with a simple click.

“Mesosphere DC/OS enabled us to create a data platform that allowed us to take
ownership of our data and transform it into powerful customer features that
keep them coming back to Sky for their entertainment choices.”
Hubert Behaghel
Head of Technology


Sky chose a SMACK stack architecture, as the combination of Spark, Mesos, Akka,
Cassandra, and Kafka is known, because it is fault-tolerant and makes it easy
to spawn analytics across a distributed environment. Sky’s real-time processing
includes Apache Kafka for event streaming, Spark Streaming for event
processing, and Cassandra to run a personalized metrics database. Data is
further analyzed by Python, Scala, and R, and data visualization is provided by
Superset, Zeppelin, and Qlik. A unified set of APIs ultimately yields customer
segmentation, personalized recommendations, and taste profiles that help
improve engagement. This design now includes a beta implementation of a shared
data lake to:

• Support a common global tagging framework;
• Unify the data flow across all propositions;
• Capture and process data in real time with optional batch processing; and
• Use best-of-breed open-source tools.

Once the IT team achieved the necessary buy-in from the business side, it was
able to begin to empower data source owners to be great stewards for the group,
achieve maximal awareness of what Sky data is and how to analyze it, and
enforce a very close feedback loop by means of A/B testing and real-time fast
data processing.

How does Sky’s new data platform impact the customer experience? By creating
personalized experiences, making content visible, and doing so quickly. In the
past, customers who signed up for premium content packages such as sports or
movies could expect to wait up to three days for that new content to appear on
their home pages. With the new DC/OS-based data pipeline in place, new content
appears in less than a minute, reassuring customers that they are indeed
getting what they just paid for. That improved experience is sure to keep
customers coming back and engaging with additional content. What’s coming soon:
a deep learning recommendation engine that continuously learns about habits and
tastes across segmentations and in which A/B testing is a first-class citizen.
Also on its way are patented innovations that fundamentally improve how to
apply machine learning to real-time personalization.

“With Mesosphere DC/OS, we’ve evolved from DevOps to DataOps: building
automated workflows to ship new features and connect data pipelines that move
the needle for our customers and our business. DC/OS allows us the freedom and
flexibility to choose the right infrastructure and the open source technology
we need to get the job done.”
Hubert Behaghel
Head of Technology
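
For readers who want a feel for the event streaming, event processing, and
metrics database flow described in this section, here is a minimal sketch in
plain Python. It is an illustration under stated assumptions, not Sky's
implementation: a list stands in for a Kafka topic, a small batching function
stands in for Spark Streaming, and an in-memory class stands in for a
Cassandra table. The names ViewEvent, micro_batches, MetricsStore, and
run_pipeline, the event fields, and the "share of watch time" definition of a
taste profile are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class ViewEvent:
    """Hypothetical viewing event; the field names are assumptions."""
    customer_id: str
    genre: str
    seconds_watched: int


def micro_batches(events: Iterable[ViewEvent], size: int) -> Iterator[List[ViewEvent]]:
    """Group an event stream into small batches (stand-in for a streaming engine)."""
    batch: List[ViewEvent] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


class MetricsStore:
    """In-memory stand-in for a per-customer metrics table."""

    def __init__(self) -> None:
        self._seconds: Dict[str, Counter] = defaultdict(Counter)

    def apply(self, batch: List[ViewEvent]) -> None:
        """Add each event's watch time to that customer's per-genre totals."""
        for event in batch:
            self._seconds[event.customer_id][event.genre] += event.seconds_watched

    def taste_profile(self, customer_id: str) -> Dict[str, float]:
        """Return each genre's share of the customer's total watch time."""
        totals = self._seconds.get(customer_id)
        if not totals:
            return {}
        grand_total = sum(totals.values())
        if grand_total == 0:
            return {}
        return {genre: secs / grand_total for genre, secs in totals.items()}


def run_pipeline(events: Iterable[ViewEvent], batch_size: int = 2) -> MetricsStore:
    """Feed the event stream through micro-batches into the metrics store."""
    store = MetricsStore()
    for batch in micro_batches(events, batch_size):
        store.apply(batch)
    return store


if __name__ == "__main__":
    stream = [
        ViewEvent("alice", "sports", 300),
        ViewEvent("bob", "movies", 100),
        ViewEvent("alice", "movies", 100),
    ]
    store = run_pipeline(stream)
    print(store.taste_profile("alice"))
```

Because the store is updated after every small batch, a profile can be read at
any point while events are still arriving, which is the property that separates
this style of pipeline from overnight batch reporting.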


DC/OS has breathed new life into Sky’s DevOps culture, offering a self-service
platform that, among other things, gives developers one-click access to more
than 100 cloud-native services, including the ability to explore Kubernetes on
DC/OS, which automates much of the backend configuration to ensure high
availability, built-in monitoring and metrics, and load balancing. Sky can
lower costs by running all its data services and containers on shared
resources, maximizing utilization across its corporate datacenter-as-a-cloud, a
setup that lets it, among other things, run Kubernetes alongside other data
services and traditional apps, all on the same cluster. Sky is also considering
adding multi-cloud capabilities leveraging Google Cloud Platform to create a
data lake and enable an even greater degree of business intelligence.

Given the speed with which consumers’ entertainment habits are changing, Sky
knew that it needed speed: speed in development, speed in data analysis, and
speed in reaching its vast customer base on a personal level. Its new DC/OS
development platform delivers that speed. “What should I watch next?” Now Sky
has the answer.

Ready to see how Mesosphere DC/OS can power data analytics in your
organization?
Contact today!
From weekly touch-base meetings to biweekly roadmap calls, customer success
managers and solution architects work in lockstep with your technology
organization to eliminate the learning curve.

Learn More
Mesosphere is leading the enterprise transformation
toward distributed computing and hybrid cloud portability.
Mesosphere DC/OS is the premier platform for building,
deploying, and elastically scaling modern, containerized
applications and big data without compromise.
DC/OS makes running containers, data services,
and microservices easy, across any combination of
infrastructure — datacenter, cloud, or edge —
without lock-in.
