
The Data Leader’s Guide to Modern Analytics

A step-by-step guide to faster insights, democratized data and data-driven innovation
Table of Contents

Introduction
The hierarchy of data needs
Signs you should start modernizing your analytics
How you know you've succeeded
Step 1: Setting up a modern data stack
Step 2: Grow your data team and establish data governance standards
Step 3: Apply product thinking and promote data literacy
Step 4: Build a robust data architecture (and beyond)
Cautionary notes and pitfalls to avoid

The Data Leader’s Guide to Modern Analytics 2


Introduction

We live in a world saturated with data. The three Vs of big data – variety, volume and velocity
– continue to grow as business operations increasingly take place in and depend on the cloud.
In a highly competitive, globalized and interconnected world, it is increasingly essential for
organizations to be able to turn data into opportunities.

Broadly speaking, organizations can use insights from data to improve the following:

1. Customer experiences
2. Internal processes and operations
3. Products and features

A modern data operation can make valuable insights available to everyone in your organization.
With data-literate leadership, data-driven improvements become keystones of
your strategy and set the high-level direction for your organization. For individual contributors,
broad-based data literacy means an opportunity to personally make impactful decisions.



The hierarchy of data needs
Building a modern data operation is a cumulative process. It requires a strong technological and
organizational foundation to support data use cases of increasing sophistication. Each level
in this success sequence can be described by a desired outcome and the actions required to
achieve it.

Level 1: Data extraction and loading


The most foundational level of the hierarchy is the capability to extract, load and transform
data. This is often referred to as data integration and is enabled by a modern data stack.

The modern data stack consists of the following technologies:


• Data pipeline – The data pipeline is a data integration tool used to connect data sources with
a destination.
• Destination – Typically a data warehouse, the destination serves as a central repository of
record or “single source of truth,” from which analysts can build data models.



• Transformation tool – A transformation tool is used to construct data models from raw
data. Ideally, it interfaces directly with the data warehouse, enabling data models to be
created and stored in the data warehouse environment.
• Business intelligence platform – A business intelligence platform is essential for making data
legible to end users in the form of reports and dashboards.
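How these four components fit together can be sketched in miniature. The following is an illustrative sketch only, using Python's built-in SQLite as a stand-in for a cloud data warehouse; the table, view and function names are invented for the example.

```python
# Illustrative sketch only: Python's built-in SQLite stands in for a cloud
# data warehouse, and all table/function names are invented for the example.
import sqlite3

def extract():
    """Data pipeline, extract step: pull raw records from a source."""
    return [("2024-01-05", "ads", 120.0), ("2024-01-06", "email", 80.0)]

def load(conn, rows):
    """Load step: land raw data in the destination unchanged (the 'EL' of ELT)."""
    conn.execute("CREATE TABLE raw_spend (day TEXT, channel TEXT, usd REAL)")
    conn.executemany("INSERT INTO raw_spend VALUES (?, ?, ?)", rows)

def transform(conn):
    """Transformation tool: build a data model inside the warehouse."""
    conn.execute("""CREATE VIEW spend_by_channel AS
                    SELECT channel, SUM(usd) AS total_usd
                    FROM raw_spend GROUP BY channel""")

conn = sqlite3.connect(":memory:")
load(conn, extract())
transform(conn)
# A BI platform would render the model as a dashboard; here we just query it.
report = conn.execute(
    "SELECT channel, total_usd FROM spend_by_channel ORDER BY channel").fetchall()
print(report)  # [('ads', 120.0), ('email', 80.0)]
```

The key design point mirrors ELT itself: raw data lands in the destination untouched, and the model (`spend_by_channel`) is built inside the warehouse environment.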

Data warehouse vs data lake


Data warehouses are relational databases designed for analytical queries. Analytical queries
typically involve combining and summarizing values from a large number of records. Data
warehouses are not to be confused with transactional or operational databases, which are
optimized to frequently query individual records. For structured (i.e. tabular, organized)
enterprise data, data warehouses are typically the most practical solution for storing and
accessing analytical data.

Data lakes are a more general, usually cheaper data storage solution that stores all data as
files. As a result, data lakes can accommodate both unstructured and structured data but run a
danger of becoming cluttered, poorly documented and “murky,” i.e. data swamps.

For the purposes of this guide, we assume that data warehouses are the destination. Very
complex use cases, such as those involving machine learning applications that are trained using
documents and media files, may combine the functionality of data lakes and data warehouses.



The purpose of technology is to save labor. The technologies included in the modern data stack
can often be trialed quickly at a modest cost. Trying new technology is far simpler and cheaper
than hiring new people and offers you the ability to quickly demonstrate the benefits of a new
solution.

Without the help of modern off-the-shelf solutions, the conventional approach to data
operations is engineering-heavy and involves roles with highly specialized competencies that are
difficult to fill. Off-the-shelf tools allow you to circumvent the cost and complication of building
a data integration platform from scratch.

“Working with terabytes of data, making sense of it and putting it in the correct
context is one of the most important things we can do. This allows employees
from across the organization to create valuable insights into how we work,
engage with partners and deliver services across our customer base. Every
decision we make is driven by data.”
– Lucianne Millan, Senior Manager of Data Engineering, WeWork

Level 2: Data modeling and transformation


Once the technological tools are in place, your analysts will be able to begin massaging the raw
data in your destination into data models that can be turned into visualizations, dashboards and
reports. This requires a deep understanding of what the data actually represents and segues into
the next point, which is the need to establish strong data governance standards.

Data governance is a wide-reaching term that includes ensuring data quality and integrity,
providing a taxonomy to define key business metrics and standardizing metrics and practices
across different business units. It also concerns properly assigning permissions to ensure
consistent and controlled access to data, including for the purposes of regulatory compliance.

An essential part of data governance is to take inventory of all data assets. The best time to do
so is when you’re moving to a new data stack for the first time, as you should be reviewing all



of your data assets anyway. Creating a “single source of truth” means careful documentation
of all data assets and the creation of canonical definitions. This effort can be owned by a data
governor. They should be familiar with the data, detail-oriented, conscientious and capable of
diplomatically handling hard conversations. You will also need a sponsor in upper management
to supply the mandate and support for this effort.

Level 3: Visualization and decision support


The next step from building data models is to turn them into visualizations, dashboards and
reports to support decision-making by people within the organization.

Treat reports, dashboards and visualizations as discrete products in their own right. Analytics
projects benefit from product thinking, especially in the guise of agile methodology. A keen
understanding of users and quick project cycles to produce minimum viable products (MVP) will
enable your analytics team to produce results quickly, pivot as needed and reduce the risk of
wasted work. It is usually a good idea to begin with a relatively low effort, high impact use case.
Sales, marketing and product analytics are obvious candidates.

Starter KPIs
The following are important early metrics:

Revenue
• Annual recurring revenue (ARR)
• Net revenue retention (NRR)
• Unit economics: customer acquisition cost, sales efficiency

Sales and Marketing
• Customer growth and churn rate
• Month-over-month revenue growth
• Marketing qualified lead (MQL) and conversion metrics
• Other funnel metrics

Product
• Daily, weekly and monthly active users
• Customer journey
• Feature usage
• Net promoter score
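Two of the starter KPIs above can be computed with simple formulas. The sketch below uses common conventions for churn rate and net revenue retention; definitions vary by business, so treat these as a starting point rather than a standard.

```python
# Illustrative formulas for two starter KPIs; conventions vary by business,
# so treat these definitions as a starting point rather than a standard.
def churn_rate(customers_at_start, customers_lost):
    """Customer churn over a period, as a percentage of the starting base."""
    return 100.0 * customers_lost / customers_at_start

def net_revenue_retention(starting_mrr, expansion, contraction, churned_mrr):
    """NRR: recurring revenue retained from existing customers, as a percentage."""
    return 100.0 * (starting_mrr + expansion - contraction - churned_mrr) / starting_mrr

print(churn_rate(200, 10))                               # 5.0
print(net_revenue_retention(100000, 12000, 3000, 5000))  # 104.0
```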



Another important need at this juncture is to promote data literacy in your organization. There
are three stages of data literacy:

1. Interpret visualizations, dashboards and reports in order to make decisions. Senior (especially
C-suite) leadership and junior contributors alike should be able to read and intelligently act
on what analysts create.
2. Construct new data models from existing raw data. This is the bread and butter of analysts
as well as data literate individual contributors.
3. Explore new sources; construct predictive models. This is the purview of analysts and
data scientists.

Training will be essential to evangelizing the importance and benefits of data literacy as well as
creating a common language for learning.

Level 4: Business process automation


As your company comes to depend more on data for decision support, the next step is to close
the loop by feeding analytics data back into operational systems using processes such as reverse
ETL. There are many business processes that can be automated this way. One simple example is
copying sales analytics data back into payroll to automatically award commissions and bonuses.
Another is to embed dashboards and visualizations into operational systems to give employees
real-time insight into the effects of their activities.
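The commissions example can be sketched as a tiny reverse ETL flow. This is a hedged illustration only: the rep names, commission rate and data structures are invented, and a real implementation would write to the payroll system's API rather than a dictionary.

```python
# Hedged sketch of reverse ETL: commissions computed from warehouse analytics
# data are written back into an operational (payroll) system. The rep names,
# rate and data structures are all invented for illustration.
COMMISSION_RATE = 0.05

def compute_commissions(closed_deals):
    """Aggregate closed-deal revenue per rep, then apply the commission rate."""
    totals = {}
    for deal in closed_deals:
        totals[deal["rep"]] = totals.get(deal["rep"], 0.0) + deal["amount"]
    return {rep: round(total * COMMISSION_RATE, 2) for rep, total in totals.items()}

def push_to_payroll(payroll, commissions):
    """The 'load' half of reverse ETL: update the operational system in place."""
    for rep, commission in commissions.items():
        payroll.setdefault(rep, {})["commission"] = commission
    return payroll

deals = [{"rep": "ana", "amount": 10000.0},
         {"rep": "ana", "amount": 2000.0},
         {"rep": "ben", "amount": 5000.0}]
payroll = push_to_payroll({}, compute_commissions(deals))
print(payroll)  # {'ana': {'commission': 600.0}, 'ben': {'commission': 250.0}}
```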

As your organization continues to operationalize its data, you must ensure that as many data
operations as possible are automated. Make sure the technologies supporting your data
operations allow for automated reporting, quality control of data and the ability to push data
back into operational systems.



Level 5: AI/ML and hiring data scientists
Artificial intelligence and machine learning represent the most sophisticated business uses
of data, including predictive modeling, automated pattern recognition and autonomous
artificial agents. Making these pursuits production-ready depends on a mature foundation
of technologies, processes and talent. Many organizations hire data scientists prematurely,
long before they have the technologies or processes to properly leverage expertise in artificial
intelligence and machine learning.

Business applications of machine learning include revenue and profit projections, predictive
modeling to describe likely outcomes and tradeoffs of major decisions, recommendation systems
for customers and every manner of business process automation. The most sophisticated
applications of machine learning are products in their own right, such as self-driving cars.

Organizations aren’t born with mature data operations and a data-driven mandate from
leadership. So how do you know if you are ready to make a change and what do you do?



Signs you should start modernizing
your analytics
Analytics modernization is a process that will continue as long as your data needs continue to
grow in complexity. The following problems are signs that you urgently need to make serious
analytics modernization efforts. Be warned that they can easily coexist with each other!

Manual reporting is no longer tenable


Manual reporting (and data orchestration) is appropriate if your organization is at such an
early stage that you’re still figuring out product-market fit or you only have a single, very simple
data source. The moment you begin combining data from multiple tools, turnaround times for
analytics projects will markedly increase, delaying the positive business impacts of
your analytics.

Manual reporting becomes a time- and labor-intensive recipe for your analysts to start dropping
the ball in countless ways:

• Misnamed data fields and files


• Misplaced files
• Orchestration bottlenecks, i.e. a process completely stalls when a single person is away
• Long project turnaround
• Competing narratives from data models of unclear provenance

You will need to consider a more sustainable, longer-term strategy to avoid unnecessary work.



You have a large (and growing) number of data sources
Your company's data needs become more complicated as you add data sources through
mergers, expanded product lines, more complicated operations and growth in general.
Eventually, decision makers in your organization will ask important questions that you will only
be able to answer by combining and analyzing data from multiple sources.

“The Guardian has a huge number of different products with different data
assets, even in different clouds, all with different formats, different standards.
We realized that we really needed to build a holistic view of our consumer base.”
– Jonathan Rankin, Senior Product Manager, The Guardian

Excessive data engineering costs


High data engineering costs often accompany either the aforementioned manual reporting
or the extensive use of legacy technologies that require custom configuration and scripting.
You likely experience very high data engineering costs if you use on-premise data integration
technologies or if your data integration process follows the traditional extract-transform-load
(ETL) rather than extract-load-transform (ELT).



ETL vs ELT
Extract-transform-load (ETL) has long been the standard approach to data integration. When
ETL was first devised in the 1970s, most organizations operated under very stringent storage,
computation and bandwidth constraints. By transforming before loading, ETL limits the volume
of data that is warehoused and preserves storage, computation and bandwidth resources
throughout the entire workflow.

The downside is that, by including both extraction and transformation in the data pipeline, ETL
is fundamentally brittle. Any change to either upstream data sources or downstream analytics
models requires a rebuild of the entire data pipeline. It is a complicated, engineering-intensive
process with lengthy project turnaround.

By contrast, the extract-load-transform (ELT) approach to data integration moves all data from
source to destination first, allowing transformations to be performed within a data warehouse
environment. This means extraction and loading simply replicate all available data to the
destination, without consideration for downstream analytics models.



The main benefits of ELT are:
1. It becomes possible to automate and outsource extraction and loading as they no longer
have to be custom-tailored to the needs of analysts.
2. Transformations become an analyst-centric activity, as is appropriate. Analysts, after all, are
responsible for the analytic models used in visualizations, dashboards and reports.

A recent survey by Wakefield Research of large (2,500+ headcount) companies found an
average yearly cost of $520,000 to build and maintain data pipelines. You will know you are
experiencing this problem if your data engineering headcount keeps growing with no obvious
gains in analytics performance.

Reporting is underutilized
Your analysts may produce plenty of data assets such as dashboards and reports, but people in
your organization might not use them regularly. This can mean that people who are responsible
for knowing important metrics might struggle to answer relatively simple questions about
trends and summary findings, especially at more granular levels.

This is usually the result of a lack of awareness, distrust in the data’s provenance, unclear
ownership or a general lack of data literacy.



How you know you’ve succeeded
The process of modernizing your analytics will never be fully over. There will always be more data
sources to add and more data models to construct. However, there are several clear signs that
your organization is on the right track.

Time and money savings


Engineering time is scarce and best directed to higher value projects. A new report by Wakefield
Research estimates that an average data engineer spends 44 percent of their time maintaining
data pipelines instead of building critical infrastructure and production systems. You should have
much quicker project turnaround time after you build a modern data stack, sometimes to the
tune of orders of magnitude. Inasmuch as time is money, this should also lower the monetary
cost of your data operations.

Data utilization
The people in your organization should be using the aforementioned resources to inform and
justify decisions. This habit should be commonly practiced by leaders and individual contributors
alike. A good metric is the percentage of your organization that consults your BI tool at least
weekly. Aim for adoption rates over 50 percent.
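The adoption metric is simple to compute; here is a sketch with illustrative counts ("weekly active" meaning the person consulted the BI tool that week).

```python
# Sketch of the adoption metric described above; counts are illustrative.
def bi_adoption_rate(weekly_active_bi_users, total_employees):
    """Percentage of the organization consulting the BI tool at least weekly."""
    return 100.0 * weekly_active_bi_users / total_employees

rate = bi_adoption_rate(130, 200)
print(f"{rate:.0f}% weekly BI adoption")            # 65% weekly BI adoption
print("target met" if rate > 50 else "below 50%")   # target met
```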

New projects and data sources


Your analysts and engineers should be freed up to spend more time analyzing data and building
new tools, respectively. This will open up many opportunities to add new data sources, build new
data assets and expand the resources available to decision makers across your organization.
Ultimately, your organization should be empowered to pursue higher-value uses of data, such as
machine learning and artificial intelligence.

We will now cover, in detail, the steps required to ascend the hierarchy of data needs
outlined earlier.



Step 1: Setting up a modern
data stack
Enabling your organization to extract and load data is the first step to modernizing your
analytics. It can be accomplished with no more than an analyst or two, as the first changes
are really technological, and the barrier to entry is modest: it requires nothing more than
purchasing and implementing off-the-shelf tools. You have a choice of many automated,
fully managed cloud services. Automation is key because data engineering talent is scarce
and expensive and data engineering itself is complicated.

“A traditional ETL methodology takes time. It takes time to recruit engineers
and analysts. The value that I wanted to create at Princess Polly was the value
of automation in the analysis – I wanted to provide metrics, rather than spend
time cleaning and moving the data.”
– Anand Bhatt, Head of Business Analytics, Princess Polly

As a general rule, you are looking for tools that are easy to use and have a lot of out-of-the-box
functionality. The tools you choose should almost obviate the need for your organization to
perform complicated data engineering work, at least at first. Each tool should also complement
the others in the ecosystem.

Recall that a modern data stack includes the following tools and technology:
• Data pipeline
• Destination
• Transformation tool
• Business intelligence platform

Select these tools and technologies carefully. You want them all to be cloud-based and
compatible with each other. The following are some more specific selection criteria for each
type of tool.



For your data pipeline:
• Make sure it supports data sources your organization uses and is likely to use in the future.
• Favor ELT (extract-load-transform) rather than ETL (extract-transform-load). ELT enables a
far greater degree of automation than ETL.
• Find a replication interval that suits your reporting needs. An extract, load and update
interval that’s a few minutes long is considered “real-time.”
• If possible, ensure that data is organized into schemas upon loading to a destination. Raw
streams can be a headache for your analysts to interpret.
• Reliability and performance are critical. Make sure the tool has a good reputation for uptime.

For your destination:


• For most use cases, the importance of a cloud-based architecture bears repeating! Some
specific use cases do require on-premise hardware, but those are beyond the scope of this guide.
• Make sure it supports the data your organization uses. In the early stages, this will be
structured, tabular, relational data, so look for a platform that can support such data,
namely a data warehouse.
• It must be able to scale compute and storage as needed, to handle growing data needs and
many concurrent queries.

For your transformation tool:


• Consider your team’s existing skills. Different approaches to transformation may leverage
different skills such as SQL, Python and so on.
• Check for off-the-shelf data models for common metrics from common sources. You want to
get started as quickly as possible.
• Ensure it supports collaboration, testing, version control and other software development
best practices.
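One way to satisfy the testing and version-control criterion is to treat each transformation as ordinary, tested code. The following is a workflow sketch, not a feature of any particular tool; the model and its sample data are invented.

```python
# Workflow sketch (not a feature of any particular tool): a transformation
# written as a pure function with an assertion-based test, so it can live
# under version control and run in CI before deployment. Names are invented.
def sessions_per_user(events):
    """Toy data model: count distinct session ids per user from raw events."""
    sessions = {}
    for event in events:
        sessions.setdefault(event["user"], set()).add(event["session"])
    return {user: len(ids) for user, ids in sessions.items()}

# Unit test for the model, runnable on every change.
sample = [{"user": "a", "session": 1}, {"user": "a", "session": 1},
          {"user": "a", "session": 2}, {"user": "b", "session": 7}]
assert sessions_per_user(sample) == {"a": 2, "b": 1}
print("model test passed")
```

Because the model logic is a pure function, the same test can run on a teammate's laptop and in a CI pipeline before the model is deployed to the warehouse.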

Finally, for your business intelligence platform:


• Sometimes tools that are easier to use aren’t as easy to scale to larger numbers of users.
You may encounter tradeoffs between ease of use and scalability.
• Consider your team’s existing modeling skills and experience. What kinds of proprietary
languages or techniques will they have to learn to use the BI platform?
• Performance can be a major pain point – people will be frustrated with a tool that’s slow
and unresponsive.
• Automated reporting and notifications will make it easier to scale your data operations.

Modern tools tend to offer free trials. You should be able to rapidly assemble a proof of concept,
create an MVP and demonstrate its efficacy for stakeholders.



Step 2: Grow your data team
and establish data governance
standards
Building a mature data team involves both people and processes. A data team always starts
with analysts, but as your data modeling and transformation efforts grow in complexity, you will
have to create and fill new roles and develop a system to manage your data.

Grow your data team


Traditionally, building a data stack was a highly technical process involving a number of
specialized developer roles, ranging from ETL specialists to database administrators and
architects. The modern approach simplifies and condenses these roles.

Data analysts are the bread and butter of any data team. Their main responsibilities are to
produce visualizations, dashboards and reports using business intelligence platforms. They are
usually well-versed in SQL and may also know a scripting language like Python, R or Java. It’s
important to have intellectual and experiential diversity among analysts, especially people with
valuable domain knowledge in certain business functions, industries and other concerns.

With the help of a modern data stack and automation, your data team should be able to get by
with only analysts for quite some time.



As your infrastructure needs become more complicated, data engineers may become necessary.
A naive use of data engineers is to produce data pipelines; this is a task obviated by many
modern data integration tools. However, with the work of building and maintaining ELT
outsourced and automated, the role of a data engineer becomes a wider responsibility focused
on performance, reliability and enabling data teams and data consumers. This usually involves
building cloud-based data infrastructure. As such, data engineers tend to have deep computer
science or software engineering backgrounds.

Without the aid of a modern data stack, it is necessary to maintain a large team of data
engineers to build your own bespoke data stack. This can be extremely costly and time-intensive.



Data scientists are mainly responsible for producing machine learning models. This requires
heavy grounding in statistics, predictive modeling and scripting languages. Data scientists
often have advanced degrees in a quantitative field. Many organizations prematurely hire data
scientists and press them into work more appropriately within the purview of data analysts or
data engineers.

In addition to the main analytics roles, roles that emphasize coordination will become important
as your data operations mature. Data product managers perform the same role as other
product managers, but for data- and analytics-related assets. They are responsible for defining
what to build and guiding analysts through the creation of data assets. Last but certainly not
least, you will need to designate a data governor to own the task of data governance.

Organizational design
As your data needs grow, you will need organization and a clear division of labor. For
accountability purposes, data assets and workflows also need clear owners.

There are arguments in favor of both centralizing and decentralizing data teams, but a good
compromise is both, in the guise of a hub-and-spoke model.



Your “hub” is a centralized team that owns the overall data integration process and creates
and maintains the less specialized data models, reports and visualizations used by company
leadership and individual contributors alike. The “spokes” are small, functionally-aligned teams of
analysts who are embedded with specific departments and have the relevant domain expertise
in areas such as sales, finance and so forth. The hub and spokes alike should report to company
leadership. It is worth reserving a seat in the C-Suite for this – a chief analytics officer, chief data
officer or equivalent.

Establish data governance


The other major process consideration is data governance. Ownership of data assets also means
carefully documenting and cataloging all data assets. This becomes ever more important the
more data you ingest and the more products or lines of business you develop. An ungoverned
“data swamp” can render your data unusable. It can also make compliance with regulatory
standards (and basic ethical considerations) difficult or impossible.

To avert this problem, consider a cloud data catalog tool. It will help you take the
following actions:

• Document all models, tables and fields. This may be impractical if you have many data
sources; an alternative is to carefully build a dimensional schema, which is a simplified data
model that encompasses all major operations.
• Determine what metrics you need and where they come from.
• Make note of how frequently you need to refresh the data.
• Plan to address any data integrity issues.
• Identify the true data owners for the various models within the organization.
• Assign ownership and create incentives to keep the system healthy.
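To make the catalog concrete, here is a minimal sketch of what one entry might record, using a plain dataclass as a stand-in for a real catalog tool; all field names and values are invented.

```python
# Minimal sketch of a data catalog entry, using a plain dataclass as a
# stand-in for a real catalog tool. All field names and values are invented.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    table: str
    description: str
    owner: str       # accountable data owner, per the governance plan
    refresh: str     # how frequently the data needs to be refreshed
    columns: dict = field(default_factory=dict)  # column name -> definition

catalog = [
    CatalogEntry(
        table="analytics.monthly_churn",
        description="Canonical churn metric by month",
        owner="data-governor@example.com",
        refresh="daily",
        columns={"month": "First day of the month (UTC)",
                 "churn_rate": "Customers lost / customers at month start"},
    ),
]

# A simple governance check: every documented asset must have an owner.
assert all(entry.owner for entry in catalog)
print(f"{len(catalog)} asset(s) documented")
```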

The best time to make this effort is as you start fully implementing your modern data stack, as
you will need to take inventory of all data assets anyway.

Get data governance under control early in order to build trust. Without a clear provenance for
every data model, it will be difficult for the end users of your data to make sense of how metrics
are determined and resolve conflicting narratives.



Step 3: Apply product thinking and
promote data literacy
As your data team moves on to visualization and decision support, you will need to make a
concerted effort to bring product thinking to your analytics efforts. We previously mentioned
that a data PM would be among the new roles that you will need to fill. Your data PM will lead
the initial effort to create an MVP data asset.

Apply product thinking


In a nutshell, product thinking means understanding your users (i.e. individual contributors
and leadership in your organization) and rapidly, iteratively producing and refining products in
response to changing conditions. In a little more detail, the product process for data
assets involves:

Identify
• Understanding users
• Gathering requirements

Design
• Defining scope
• Managing expectations

Develop
• Rapid prototyping
• Productionizing

Launch
• Marketing and rolling out the product
• Training users via office hours and internal communications
• Driving adoption, including through self-service whenever possible

Assess
• Evaluating against expectations and KPIs



Your first MVP should concern a low effort, high impact matter. A common first step is
something related to sales or marketing analytics. It’s important to understand your audience.
Executives, analysts and other individual contributors all need varying levels of detail.

You will need to build a roadmap regarding how these data assets will improve decision-making
at your organization. List the milestones and end goals, as well as the information and insights
necessary to achieve them. Your roadmap is something you can bring to your leadership with the
possibility of being incorporated into your organization’s strategy.



There are some specific considerations when it comes to messaging a data-driven agenda.
People can be skeptical of data and lack expertise navigating the appropriate tools and metrics.
They may even have dystopian preconceptions of how data can be weaponized. Avoid buzzwords
and deep technical details in favor of emphasizing how desired, data-driven outcomes are
achieved. Data-based storytelling is key – you will have to seamlessly combine strategy,
visualizations and narratives. This will inspire people to become data literate and begin
using data.

Promote data literacy


As your data team promotes product thinking, you will also need to spread data literacy across
both leadership and individual contributors. Some people use the phrase “citizen analyst” to
describe distributed decision-making. It isn't reasonable to expect analysts and other data
professionals to interpret everything on behalf of decision-makers.

It can be useful to think of data literacy in terms of the Diffusion of Innovation Theory. Imagine
an S-shaped curve in which the leading 2.5 percent of people are innovators and trailblazers;
another 13.5 percent are early adopters; another 34 percent are the early majority. The key
challenge to getting started is to find an enthusiastic and, ideally, technically-savvy person in
a position of influence to evangelize to their team on your behalf. As teams become more data
literate and more capable, your efforts should eventually snowball and gain momentum of
their own.

Going forward, making data literacy a key criterion for hiring will make it easier to improve
the data capabilities of your teams. Remember, the baseline isn't writing SQL or building new
models but a general ability to interpret graphs and tables and make decisions accordingly.



How to tell if someone is data literate
Interview questions for data literacy don’t need to be tricky or complicated. The answers just
have to demonstrate an ability to recognize patterns and apply some reasoning. Include an
analyst on your hiring committee to see how candidates organize, explore and interpret data.

For instance, you could have a member of your data team ask an applicant to interpret a chart
like the following:

[Chart: monthly tourist visits to Bali over the course of a decade]

There are two obvious takeaways from this chart. One is that tourist visits to Bali are strongly
seasonal, with yearly peaks in January; not surprising, given Bali’s proximity to the equator.
Another is that tourism has grown steadily over the course of a decade.



Step 4: Build a robust data
architecture (and beyond)
Data architecture refers to a full arrangement of tools and processes used for data integration.
As your data operations mature, you will add new tools and technologies, use more features of
existing tools and create new workflows. It will become ever more important to “close the loop”
by operationalizing analytics data. This consists of ensuring that reports get to where they are
needed, maintaining good standards of data quality control, automating business processes
and, in the long term, setting your organization up to productionize machine learning models.

Set your team up to accommodate new data sources


Your ELT data integration workflow should make it easy to add new data sources and to
combine them with existing sources into new data models. Recall what we
discussed earlier about the hub-and-spoke model and assigning specialized analysts to handle
specific functional domains within your organization. This domain expertise includes familiarity
with the relevant data sources, which is necessary for turning both new and existing data
sources into useful data models. The ability to construct important metrics that track a single
entity across multiple stages of a funnel, like customer lifetime value, depends on the ability to
merge data sources.
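As a rough sketch of why merging sources matters, consider two hypothetical extracts – CRM accounts and billing invoices – joined on a shared customer key to compute a simple lifetime value metric. All table and column names here are illustrative, not from any particular source schema:

```python
import pandas as pd

# Hypothetical extracts from two different sources, already loaded
# into the warehouse by an ELT pipeline.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "acquisition_channel": ["ads", "organic", "referral"],
})
billing = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "invoice_amount": [100.0, 150.0, 80.0, 60.0, 60.0, 60.0],
})

# Aggregate revenue per customer, then join in CRM attributes on the
# shared key -- a naive "lifetime value to date" metric by channel.
ltv = (
    billing.groupby("customer_id", as_index=False)["invoice_amount"]
    .sum()
    .rename(columns={"invoice_amount": "lifetime_value"})
    .merge(crm, on="customer_id")
)
print(ltv)
```

A real lifetime value model would be more elaborate, but the core pattern is the same: neither source alone can produce the metric; the join is what makes it possible.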



Schedule automated reporting
You can save a great deal of analyst time and effort by automatically updating reports as new
data is available and notifying key decision-makers on a regular schedule or when important
data changes. This kind of reporting is typically supported by BI platforms. Make sure you
configure yours accordingly.
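The logic behind such a schedule can be sketched in a few lines. This is a minimal, illustrative example – the threshold, metric names and notification decision are assumptions, and a real setup would query your warehouse and push alerts through your BI platform or chat tool:

```python
from datetime import date

# Hypothetical rule: notify decision-makers only when a key metric
# moves more than 15% day over day.
ALERT_THRESHOLD = 0.15

def build_report(metric_today: float, metric_yesterday: float) -> dict:
    """Summarize a key metric and flag significant changes."""
    change = (metric_today - metric_yesterday) / metric_yesterday
    return {
        "date": date.today().isoformat(),
        "metric": metric_today,
        "change_pct": round(change * 100, 1),
        "notify": abs(change) >= ALERT_THRESHOLD,
    }

# An 8% dip: the report still refreshes, but no one is paged.
report = build_report(metric_today=920.0, metric_yesterday=1000.0)
print(report)
```

Running a function like this on a daily schedule gives decision-makers fresh numbers without an analyst manually re-running queries.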

Programmatically control your data pipeline


There are two main uses of automating your data pipeline. One is to facilitate data governance
and automate more complex data integration workflows by programmatically assigning
permissions and managing connections to specific data sources.

The other is for internal analytics. Programmatic control of your data pipeline can also be used
to establish reproducibility and trust in your data models through a system of automated alerts
to monitor the health of your data integration workflows.
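A health check of this kind might look like the following sketch. The connector records and failure policy are illustrative, not any vendor's real API response format; in practice you would poll your pipeline's management API for this status data:

```python
def unhealthy_connectors(connectors: list[dict],
                         max_hours_stale: int = 24) -> list[str]:
    """Return IDs of connectors that failed or haven't synced recently."""
    alerts = []
    for c in connectors:
        failed = c["last_sync_status"] == "failure"
        stale = c["hours_since_sync"] > max_hours_stale
        if failed or stale:
            alerts.append(c["id"])
    return alerts

# Hypothetical status records fetched from a pipeline API.
status = [
    {"id": "salesforce", "last_sync_status": "success", "hours_since_sync": 2},
    {"id": "postgres",   "last_sync_status": "failure", "hours_since_sync": 3},
    {"id": "zendesk",    "last_sync_status": "success", "hours_since_sync": 30},
]
print(unhealthy_connectors(status))  # -> ['postgres', 'zendesk']
```

Wiring the returned IDs into an alerting channel closes the monitoring loop: analysts learn about broken syncs before stakeholders notice stale dashboards.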



Automate business processes
In order to productionize your analytics data, you will need some way to feed it back from your
data models to operational and production systems. These include both off-the-shelf tools for
internal use, such as CRMs, and any number of customer-facing products. You can use reverse
ETL and similar replication methods to close the loop with off-the-shelf systems, though custom,
customer-facing products may take more intensive engineering effort.
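In essence, reverse ETL means pushing rows from a warehouse model into an operational tool, sending only what is new or changed. The sketch below illustrates that idea; the record shapes are hypothetical and `send_to_crm` stands in for a real CRM API client:

```python
def sync_to_crm(model_rows, crm_records, send_to_crm):
    """Push new or changed rows to the CRM, keyed on customer email."""
    synced = 0
    for row in model_rows:
        existing = crm_records.get(row["email"])
        if existing != row:          # skip rows the CRM already has
            send_to_crm(row)
            crm_records[row["email"]] = row
            synced += 1
    return synced

# Current CRM state and the latest warehouse model output.
crm_state = {"a@x.com": {"email": "a@x.com", "ltv": 250.0}}
model = [
    {"email": "a@x.com", "ltv": 250.0},  # unchanged -> skipped
    {"email": "b@x.com", "ltv": 80.0},   # new -> pushed
]
pushed = []
count = sync_to_crm(model, crm_state, pushed.append)
print(count)  # -> 1
```

Diffing before sending keeps API usage down and avoids spurious update events in the destination system, which is why most reverse ETL tools sync incrementally.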

Step 5: Prepare for artificial intelligence and machine learning
The final element of the hierarchy of data needs, the pursuit of productionized machine learning,
is a bit farther afield than most of the topics we have discussed in this guide. It builds directly
on the capabilities previously described – you will need mature infrastructure as well as deep
engineering expertise to support the efforts of your data scientists.



Cautionary notes and pitfalls
to avoid
In previous chapters, I outlined a progression of steps to modernizing your analytics. Given the
immense benefits of mastery over data, it is tempting to take shortcuts in the process. Here are
a number of actions to avoid.

Don’t prematurely pursue vanity projects with high costs and risks, especially if they involve
machine learning (and especially if it’s at the behest of a consultant!). Many organizations hire
data scientists well before they are ready to do any data science.

Don’t let data governance slide. Without clear ownership of data models and the utmost trust
in the provenance of data models, people in your organization will hesitate to use your data to
make decisions. It will also make compliance with laws, regulations and ethical standards
more difficult.



Don’t over-prioritize low-hanging fruit. The previous admonitions concerned prematurely
pursuing high-effort, high-impact projects. You should start with relatively low-effort, high-
impact work, but you will eventually reap all the gains from low-effort initiatives and will have
to make longer-term commitments to bigger projects.

Don’t make extra work for yourself. The whole point is to do what you used to do but better. This
shouldn’t be difficult if you choose the right tools — the “right tools” being the operative phrase.
Choose tools that are noted for labor-saving features.

Don’t emphasize negative, dystopian narratives about using data. Yes, data can be used for
purposes that are manipulative and frankly creepy. That is far from the whole story; insights can
also genuinely improve people's lives.

Finally, don’t delay getting started. The initial outlay of resources required to modernize your
analytics is very modest. You can, in principle, do it with a single analyst and several free trials of
technology. Start now!

Fivetran can help you with the first step of your analytics modernization journey.

Request a demo at https://get.fivetran.com/demo or start a free trial at
https://fivetran.com/signup today.



Fivetran is the global leader in modern data integration. Our mission is to make access
to data as simple and reliable as electricity. Built for the cloud, Fivetran enables data
teams to effortlessly centralize and transform data from hundreds of SaaS and on-prem
data sources into high-performance cloud destinations. Fast-moving startups to the
world’s largest companies use Fivetran to accelerate modern analytics and operational
efficiency, fueling data-driven business growth. For more info, visit Fivetran.com.

