
How to Use a Semantic Layer for Data and Analytics
A Practical Guide

Table of Contents

Introduction: What is a Semantic Layer?
Where the Semantic Layer Fits
Trends Driving the Need for a Semantic Layer
Top Challenges a Semantic Layer Solves
Industry Use Cases for a Semantic Layer
Considerations for Choosing a Semantic Layer
A Framework for Evaluating Semantic Layer Solutions
AtScale Cloud Data Performance Benchmarks
Making a Business Case for a Semantic Layer
Learn More


Introduction: What is a Semantic Layer?

You may have heard the term semantic layer before; it’s been around for some time. People invented semantic layers to mold relational databases and their SQL dialects into an approachable interface for business users. In 1992, Business Objects patented the term and formalized their implementation as the Business Objects Universe™. From that point on, the concept of measures and dimensions as an abstraction of SQL has become the preferred language for business users.

Primary Reasons to Adopt a Semantic Layer
• Establish a single source of business-vetted metrics and analysis dimensions.
• Ensure data consumers can access governed data with tools of their choice.
• Ensure performant access to data, including summary statistics.
• Keep costs low by managing cloud infrastructure smartly and keeping management and administration simple.

Until recently, however, the semantic layer was always closely tied to a business intelligence (BI) platform. As long as enterprises remained within the confines of their BI vendor of choice, everything worked well. Today, there are more ways than ever to analyze data. Long gone are the days when there was one BI platform to rule them all. Tightly coupling a semantic layer to one analytics consumption style no longer makes sense.

To expand on that, the explosion of self-service BI has freed business users from relying on IT-prepared analytics, but at the expense of data consistency and trust in analytics’ output. Business definitions and terms have become mutable, malleable, and subject to interpretation. While it’s great that business users now have self-service BI tools, they also need to be working off of consistent, high-quality data. The cost of bad data is enormous; according to IBM, poor data quality costs the U.S. economy a staggering $3.1 trillion annually.

Luckily, a semantic layer that’s decoupled from the point of consumption can help ease these problems with data quality and empower self-service analytics. A well-designed semantic layer can lead to better data-driven decisions. It’s a critical part of the modern analytics stack. Let’s learn more!

A Practical Guide to Using a Semantic Layer
www.atscale.com

Where the Semantic Layer Sits in Your Data Stack

As you can see in this diagram, the semantic layer sits between the point of analytics consumption and the data warehouse and data lake.

A semantic layer hides the physical complexity from end users and provides them with understandable business terms and user-friendly data, instead of raw SQL and database schemas. This level of data virtualization makes data access possible for any analytics consumer.
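
To make the idea of hiding physical complexity concrete, here is a minimal Python sketch of how a semantic layer might translate business terms into SQL. It is an illustration under stated assumptions, not a description of how AtScale or any other product works: the table `warehouse.fct_sales`, its columns, and the names `Measure`, `Dimension`, and `compile_query` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str        # business-facing name, e.g. "Total Revenue"
    expression: str  # SQL aggregate over physical columns


@dataclass(frozen=True)
class Dimension:
    name: str    # business-facing name, e.g. "Region"
    column: str  # physical column the name maps to


# Hypothetical model: table and column names are invented for illustration.
MEASURES = {
    "Total Revenue": Measure("Total Revenue", "SUM(f.sale_amt)"),
    "Order Count": Measure("Order Count", "COUNT(DISTINCT f.order_id)"),
}
DIMENSIONS = {
    "Region": Dimension("Region", "f.region_cd"),
    "Order Year": Dimension("Order Year", "f.order_yr"),
}
FACT_TABLE = "warehouse.fct_sales f"


def compile_query(measures: list[str], dimensions: list[str]) -> str:
    """Translate business terms into a single SQL statement."""
    dims = [DIMENSIONS[d] for d in dimensions]
    meas = [MEASURES[m] for m in measures]
    select = [f'{d.column} AS "{d.name}"' for d in dims]
    select += [f'{m.expression} AS "{m.name}"' for m in meas]
    sql = f"SELECT {', '.join(select)} FROM {FACT_TABLE}"
    if dims:
        # Group by every requested dimension so the aggregates roll up correctly.
        sql += " GROUP BY " + ", ".join(d.column for d in dims)
    return sql


print(compile_query(["Total Revenue"], ["Region"]))
```

A business user asks for "Total Revenue" by "Region" and never sees `sale_amt` or `region_cd`. Because every tool would go through the same definitions, the metric means the same thing wherever it is consumed.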

Trends Driving the Need for a Semantic Layer

Powerful cloud data platforms like Snowflake, Google BigQuery, Amazon Redshift, Databricks, and Microsoft Azure Synapse have become the standard for enterprise analytics stacks. According to the AtScale 2020 Big Data & Analytics Maturity Survey, 61% of respondents currently operate cloud data platforms, and 48% plan on deploying them soon.

As the volume of data in the cloud grows, data architects are becoming increasingly comfortable with data living in different locations and in different platform architectures. However, this gives rise to a new challenge for IT: managing data access and quality across multiple silos. A semantic layer becomes a critical piece in a cloud data platform strategy (or a blended cloud and on-prem strategy).

Both data scientists and BI users need access to clean, understandable data. Today’s self-service architectures often force analytics consumers to become data wranglers and data engineers. In fact, the average data scientist spends over 45% of their time preparing data rather than modeling it.

Asking business users and data scientists to design their own metrics and analysis dimensions every time they start a new project is both a massive waste of time and a recipe for chaos and inconsistency.

Finally, a semantic layer can serve as a central governance gateway across the enterprise, which is crucial as the number of silos and data access points explodes. A semantic layer serves as a single point of access so IT can secure data and control access across the organization. The same Big Data & Analytics Maturity Survey referenced above shows that nearly 80% of enterprises rank security and governance as critical to their success in the cloud.

The top five challenges a semantic layer can solve

There are common problems that crop up without a semantic layer facilitating decision-making in an organization. We can group these problems into five areas:

1. Different analytics tool preferences

2. Lack of data access

3. Slow data integration leading to siloed solutions

4. Inconsistent BI reports across different business units

5. Low data confidence

What follows is a deep dive into each of these challenges and an explanation of how a semantic layer can help solve it.

#1 Business units have preferences for different analytics tools

Larger organizations have a tougher time imposing a single analytics standard across the board. This can be because of the disruption of an acquisition, resistance to change, or factors that limit management’s ability to enforce unified standards.

Dresner reports that many enterprises use three or more BI tools, with each tool having its own source of truth. Throw in possibilities of inaccurate reports from business analysts or misleading predictions from data scientists, and it’s easy to see how multiple tools can lead to multiple truths, and that’s not a good thing!

And the pace of change in cloud data warehousing, BI, and AI/ML has resulted in a constant cycle of upgrades, replatforms, and re-factors across different organizations. From a time, cost, and business impact perspective, it’s hard to keep up with these changes.

A semantic layer neatly solves this problem by providing analytics-as-a-service (AaaS) to your business users and data scientists. This lets you grant data access to your end users via their tools of choice while maintaining data governance and semantic consistency.

#2 Users complain about a lack of access to data

Data is plentiful, but coherent data is another story. Business analysts and data scientists can’t rely on just any data. They need to understand the data in log files, relational tables, and other data stores through metadata. If that’s missing, it leads to time wasted on interpretation and even inaccurate results that can hurt business performance.

The research supports this, too: Gartner reports that 87% of organizations have low BI and analytics maturity. You might have abundant data, but your data consumers struggle to make sense of it, and that hampers the speed at which they can make accurate decisions. A semantic layer eases this pain by powering your data model with crucial context to aid decision-making.

#3 The slow pace of data integration drives businesses to DIY

Business today moves quickly, and waiting for a centralized data team to produce reports and dashboards for different departmental use cases is not a good option.

There’s a clear link between data-driven decision-making and business performance: MIT reports that companies in the top three spots in their industry who apply data-driven decision-making realized 5% more productivity and 6% more profit than their peers.

This move to the cloud and rise of big data have powered a BI revolution, leading to business users taking reporting and data engineering into their own hands. This is a positive shift. But it also has its drawbacks, with many data platforms and data marts proliferating everywhere and making data governance difficult. Such a situation shows the need for a semantic layer to simplify and streamline data access and use.

#4 Reports from different BI tools use similar terms but show different results

Of course, having multiple BI tools across the organization produces differing results for similar queries. Each BI tool comes with its own modeling layer, and all of them support custom calculations, so it’s easy enough to create wildly divergent reports off of the same data. That’s not even accounting for table join errors, flawed time-based calculations, or just simple formula mistakes. This leads to a common consequence:

#5 Business execs express doubt in the numbers

Experian reports that six in 10 companies believe that high-quality data increases business efficiency, 44% believe it raises consumer trust, 43% conclude it enhances customer satisfaction, 42% believe it drives more informed decision-making, and 41% report that good data cuts costs.

However, this isn’t the reality for most businesses today. Many companies cannot be sure of the reliability of their data. This introduces doubt and delays in decision-making, a significant drawback considering that trust in data is a major competitive advantage.

Using one source of truth naturally leads to more trust in the data, so if you find your business users employing different analytics tools to do their analyses, you may be suffering from a confidence crisis that a semantic layer could solve.

There are several approaches to implementing a semantic layer in your organization. Below is a table with the pros and cons for each:

A Cross-Industry Challenge

The potential for creating value with data, analytics, and enterprise AI is a cross-industry opportunity. While use cases may vary across industries, organizations, and departments, there are some fundamental challenges that span all industries. In this section, we’ll explore common use cases for a semantic layer drawn from different industries.


Healthcare

Many pharmaceutical and healthcare companies operate in highly complex and heavily regulated industries. As you might imagine, their businesses depend on data for success. Some teams choose to build their own data and analytics platforms or make use of pre-existing components. In either scenario, a semantic layer helps to democratize access to data across the company.

Among the many benefits of this approach is allowing healthcare companies to focus data and analytics efforts on activities that impact profit and loss. For a pharmaceutical company, even a single percentage point of efficiency improvement could have a tremendous impact on margins. The goal is to take a forward-facing, predictive approach to data, rather than simply looking back on reports of what has already happened.

This approach also dramatically simplifies maintaining data accuracy and reduces replication of data across multiple data stores. In addition, it provides common controls and a shared backlog so that business and IT teams can define work in big room planning sessions and pull work from a common backlog for sprints. Finally, a semantic layer provides crucial security and governance controls, so that sensitive information remains protected (but more on that later).

Retail and eCommerce

Retailers and eCommerce providers rely on their data and infrastructure to compete. With a plethora of options available to shoppers both online and in-store, the retailers with the best data-driven strategies can provide highly tailored recommendations and adapt to changing customer preferences.

This agility stems from the ability of everyone on the team to be a data analyst. When dealing with a high volume of traffic and the resulting mountain of data, the data team’s top priority is empowering business users to leverage whichever data tools they like best while enabling them to get reliable, accurate answers quickly. Adopting a semantic layer from AtScale helps teams accelerate time to insight from data, agnostic of their underlying infrastructure.

Second, data producers need a set of technologies in order to do their jobs well. This could include underlying dimensional models or training sets for a machine learning model. Finally, infrastructure powers the activities of both user groups (this includes compute engines and storage systems for data).

Many large retailers have undertaken a transformation to cloud-based infrastructure, which provides a perfect test case to use AtScale. The goal is to drive end user adoption of cloud technologies through the implementation of a semantic layer that democratizes data access.

Consumer Packaged Goods

For data teams at consumer packaged goods (CPG) companies, it’s a natural fit to institutionalize the idea of data as a product. In other words, their team treats data as a path to unlocking value for the business user. Smart CPG companies leverage concepts from engineering and product management in the software world and apply those approaches to data, with impressive results.

One major CPG company has successfully managed to reduce data silos and enable business users to consume data using a semantic layer. They have developed a logical model for the business that serves as a sort of “digital twin” for the physical business. This semantic, logical model makes it possible for business users to query data and get answers using terms that the team already understands.

Plus, by leveraging AtScale’s semantic layer, this company can separate the consumption of data from where that data actually lives and how it is stored. This way, data can live anywhere and in any format without slowing people down. Any business user can ask questions and feel confident that they are receiving correct, consistent answers.

Financial Services

With a semantic layer, financial services organizations can save millions in total cost of ownership from analytics, while simultaneously avoiding the risk of regulatory penalties. Due to federal reporting requirements, hundreds of analysts may need to drill down across thousands of business calculations to properly sign off and file reports on a regular basis.

Legacy data structures, however, can create excessive silos. As data volumes grow, business intelligence (BI), development, and database engineering teams spend significant time managing caches and manually joining data from various sources. Meanwhile, the business has to bear the risk associated with penalties due to inaccurate or late reporting.

Investing in a semantic layer can automate the management of data engineering previously done by busy BI teams. Automation can take the manpower out of data preparation by aggregating raw data based on end user behavior. All of this can be done while enhancing existing security and governance controls and mitigating the risk of inaccurate reporting.

As a result, analysts have performant access via a single source of truth, meeting regulatory requirements. This restores trust in analytics and allows engineering, BI, and data engineering teams to spend their time on more productive activities.

Key Considerations for Choosing a Semantic Layer

Now that you have a sense of how a semantic layer can solve common data challenges across different industries, let’s talk about how to go about selecting and implementing one. Determining a technology strategy to implement a semantic layer can be daunting, but there are eight key considerations to keep in mind as you pick the best approach for your organization.

Following these general considerations, we compare four alternative technology approaches that organizations may consider outside of a purpose-built semantic layer platform like AtScale.

#1 NOT TIED TO A SINGLE CONSUMPTION STYLE

As analytics have spread more within organizations, relying on one BI or AI/ML platform to meet everyone’s needs is becoming less realistic. Also, a semantic layer tied to one set of consumption tools is by design not “universal”. In a landscape of many tools and analytics user personas, it’s crucial to choose a semantic layer decoupled from a single consumption style or analytics tool.

#2 OFFERS TABULAR AND MULTIDIMENSIONAL VIEWS

Semantic layers come in two flavors: tabular and multidimensional.

The tabular (or relational) model became popular in the 1970s and 1980s and relied on concepts like fact and dimension tables. Tools based on this model were designed to make relational databases or data warehouses easier to query. Multidimensional data layers go one step further by defining relationships and aggregation rules and adding business-friendly context while negating the need for SQL.

It’s essential to choose a semantic layer tool that offers both views to cover a broader range of uses and consumption styles.

#3 SUPPORTS DATA PLATFORM VIRTUALIZATION

Data has lived in lots of different homes over the years. First it was the mainframe, then the relational database, followed by the data warehouse, the MPP database, the data lake, and back again to the (this time, cloud-hosted) data warehouse.

These evolutions have brought significant changes to how data is accessed and used, and savvy organizations hedge against data obsolescence through virtualization. Virtualization eliminates the cost of data migrations every time a new trend grips the industry. A semantic layer vendor should offer data virtualization to abstract away platform differences and minimize lock-in.

#4 EASY MODEL DEVELOPMENT AND SHARING

Raw data is near-useless, but adding a data model to it makes it consumable information. The ideal semantic layer vendor should enable easy authoring, sharing, and collaborating on data models. It should also allow the reuse of common objects and conformed dimensions, the ability to model data visually, and a code-based approach that’s compatible with your organization’s software development life cycle.

#5 ABILITY TO EXPRESS DIFFERENT BUSINESS CONCEPTS AND FUNCTIONS

Relational data is flexible and powerful, but it is often difficult to express high-level business constructs with it. These constructs include time-based calculations (e.g., period-over-period), semi-additive metrics, ancestor/predecessor functions, etc. Expressing these computations in SQL is challenging, so choose a semantic layer that supports business constructs and core analytics requirements around time intelligence and hierarchical roll-ups.

#6 QUERY PERFORMANCE AND CACHING

Query performance and caching are critical considerations in the selection process. A semantic layer needs to be consistent and performant to be of any use to its users, who expect blazingly fast performance from proprietary databases.

This isn’t easy considering that many of today’s queries include heterogeneous database joins that further tax query performance. To overcome this challenge, choose a semantic layer vendor that includes a comprehensive performance management system beyond simple caching techniques.

#7 SUPPORT FOR BI AND DATA SCIENCE WORKLOADS

The need for clean, usable data doesn’t end with just the business analyst. As referenced above, data scientists spend approximately 45% of their time just prepping data for use. A common data language and business terms are more likely to ensure business analysts and data scientists have the same context and produce consistent results and predictions. Choose a semantic layer that supports various workloads, including business intelligence and data science.

#8 SECURITY & GOVERNANCE

Because the semantic layer sits between the organization’s data and the analytics tools that access that data, the platform must integrate with your organization’s security infrastructure. This can happen in two ways: authentication and authorization.

First, the semantic layer must integrate with any existing single sign-on infrastructure to authenticate users, whether through Active Directory, LDAP, OAuth, or any other authentication platform. Second, the semantic layer must include the ability to mask sensitive columns, limit data rows based on user access rules, and, crucially, impersonate users when querying underlying sources. Choose a semantic layer that incorporates these two critical security and governance protocols.
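
The authorization half of the security consideration (masking sensitive columns and limiting rows by user access rules) can be illustrated with a small Python sketch. This is a simplified illustration, not any vendor’s mechanism: the `AccessPolicy` class, the `apply_policy` function, and the sample rows are all hypothetical. Authentication and user impersonation are not covered here, since they depend on the identity provider and data platform in use.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical per-user policy; field names are illustrative only."""
    allowed_regions: set[str]
    masked_columns: set[str] = field(default_factory=set)


def apply_policy(rows: list[dict], policy: AccessPolicy) -> list[dict]:
    """Return only the rows the user may see, with sensitive columns masked."""
    visible = []
    for row in rows:
        if row["region"] not in policy.allowed_regions:
            continue  # row-level security: drop rows outside the user's scope
        # Column masking: replace sensitive values, leave the rest untouched.
        visible.append({
            col: "***" if col in policy.masked_columns else value
            for col, value in row.items()
        })
    return visible


# Invented sample data for illustration.
rows = [
    {"region": "EMEA", "customer": "Acme", "ssn": "123-45-6789", "revenue": 100},
    {"region": "APAC", "customer": "Globex", "ssn": "987-65-4321", "revenue": 250},
]
analyst = AccessPolicy(allowed_regions={"EMEA"}, masked_columns={"ssn"})
print(apply_policy(rows, analyst))
```

Applying the policy in one central place, rather than separately in each BI tool, is what lets the same rules hold no matter which tool issues the query.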

BUSINESS INTELLIGENCE PLATFORMS
Traditional BI platforms that bundle data modeling, query management and visualization
PROS
+ No extra technology needed
+ Tight integration
+ Business user friendly
CONS
- Semantic layer specific to BI tool only (not reusable)
- Vendor lock-in
EXAMPLE VENDORS
Tableau, Power BI, IBM Cognos, SAP Business Objects, Looker

DATA VIRTUALIZATION PLATFORMS
Platforms that abstract away the physical source and location in a tabular format
PROS
+ Provides flexibility in how/where data is stored
+ Semantic layer can be used across a variety of tools
CONS
- Not friendly for business users (tables, columns)
- Data models need to be built before accessing data
- Query performance is not guaranteed and/or needs manual tuning
EXAMPLE VENDORS
Denodo, Dremio

DATA WAREHOUSE / DATA MARTS
A database of information from a variety of data sources
PROS
+ Single source of truth
+ Widest array of tool/query access
+ Easy to secure
CONS
- Not friendly for business users (tables, columns)
- Slow to integrate new data sources
- Dependence on IT
EXAMPLE VENDORS
Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse SQL Analytics

OLAP PLATFORMS
A conventional cube-based OLAP approach to accelerating analytics queries based on pre-defined dimensions and metrics
PROS
+ Accelerated query performance
+ Cost efficient
+ Good integration with Microsoft BI stack (Power BI and Excel)
CONS
- Complex data pipelines and difficult to make changes
- Data is extracted and disconnected from EDW
- Semantic layer is not exposed to other BI platforms
- Dependence on IT
EXAMPLE VENDORS
Microsoft SSAS or Power BI Premium, Kyligence, Kyvos

A Framework for Evaluating Semantic Layer Solutions

The following checklist includes detailed capabilities for you to
evaluate potential semantic layer solutions. There may be other
considerations unique to your organization, but this can serve as
a start as you plan trials and proof of value projects.

Checklist columns: Feature Category; Feature; Score (1-5, 5=best); Weight (1-5, 5=best); Weighted score (1-5, 5=best)

Use Cases
• Supports analytical workloads
• Supports data science workloads

Connectivity (northbound & southbound)
• Supports legacy, on-premise data warehouses
• Supports on-premise and cloud data lakes
• Supports SaaS data sources (Salesforce, Workday)
• Supports tools that speak SQL via JDBC or ODBC
• Supports tools that speak MDX or DAX and live Excel connections
• Supports custom applications via REST or Python interfaces
• Supports zero client install for data consumers

Development Environment
• Supports web based development (versus client application)
• Supports multiple, simultaneous editors for virtual view development
• Supports reusable objects and model component sharing
• Supports development lifecycle (dev/test/prod)

Calculations and Analytical Functions (OLAP)
• Supports time intelligence (period over period, period to date)
• Supports MDX, DAX, pre and post query calculations
• Supports aggregation functions (SUM, AVG, MAX, MIN)
• Supports non-additive metrics (Distinct Count, First, Last)
• Supports live Excel pivot tables and Excel CUBE functions

Query Performance & Caching
• Supports automated query performance management
• Supports dialect specific optimizations

Security & Governance
• Supports single sign on for all data consumers
• Supports user impersonation and delegated authorization
• Supports and respects native data platform security constructs
• Supports row level security for users and groups
• Supports column hiding and masking for users and groups

TOTAL
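
The checklist does not spell out how the weighted score or the total is derived. One plausible reading, assumed here, is a weighted average: multiply each feature’s score by its weight, sum the products, and divide by the sum of the weights so the total stays on the same 1-5 scale. The Python sketch below illustrates that assumption; the function name `weighted_total` and the sample scores are hypothetical.

```python
def weighted_total(features: list[tuple[str, int, int]]) -> float:
    """Weighted average of scores; each item is (feature, score, weight)."""
    if not features:
        raise ValueError("at least one feature is required")
    for name, score, weight in features:
        if not (1 <= score <= 5 and 1 <= weight <= 5):
            raise ValueError(f"{name}: score and weight must be between 1 and 5")
    total_weight = sum(weight for _, _, weight in features)
    # Dividing by the total weight keeps the result on the 1-5 scale.
    return sum(score * weight for _, score, weight in features) / total_weight


# Invented scores and weights for one candidate solution.
candidate = [
    ("Supports data science workloads", 4, 5),
    ("Supports row level security for users and groups", 5, 3),
    ("Supports live Excel pivot tables and Excel CUBE functions", 2, 2),
]
print(round(weighted_total(candidate), 2))
```

Scoring every candidate with the same weights makes the totals directly comparable. Features your organization does not care about can be given a low weight rather than removed.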

Performance Benchmarks: How AtScale Performs with Common Cloud Data Warehouses

Modern cloud data platforms have been a game changer for enterprise analytics programs. Cost-efficient cloud storage and powerful SQL-based query engines make it feasible to make petabytes of data available for analysis. When data teams are proactive in making a wide range of application data available in “analysis-ready” format, organizations are positioned to drive toward a faster rate of data-driven insight creation.

However, data availability and SQL access do not automatically mean data can be delivered at “speed of thought” or within truly interactive data dashboards. The raw scale of modern cloud data often results in multi-second or longer query times. Performance lags lead to workarounds and data extracts, which break semantic consistency. The key to maintaining a semantic layer strategy is to deliver consistently fast performance while managing costs.

AtScale was designed to deliver speed-of-thought performance while eliminating the need to extract data like conventional OLAP approaches.

To demonstrate this, we ran 20 queries both with and without AtScale, using the standard TPC-DS benchmark v2.11.0 from the Transaction Processing Performance Council (TPC) for our tests. AtScale’s Acceleration Structures showed major benefits in accelerating query performance, both raw query performance and when managing multiple concurrent users. We also present measures of simplification based on the reduction in SQL syntax.

The following tables summarize these results. You can download the full benchmark studies at https://www.atscale.com/resource/re-performance-benchmarks/
Improvement factor with AtScale

Google BigQuery
Query Performance: 7.7x faster
User Concurrency: 20x faster
Compute Cost: 10x cheaper
Complexity: 76% less complex SQL queries

Redshift
Query Performance: 12.5x faster
User Concurrency: 61x faster
Compute Cost: 2.6x cheaper
Complexity: 76% less complex SQL queries

Snowflake
Query Performance: 4x faster
User Concurrency: 14x faster
Compute Cost: 3.7x cheaper
Complexity: 76% less complex SQL queries

Azure Synapse Analytics SQL
Query Performance: 3x faster
User Concurrency: 9x faster
Compute Cost: 2x cheaper
Complexity: 76% less complex SQL queries

Making the Business Case for a Semantic Layer

Over the past several years, organizations have focused their analytics investments on embracing major cloud data platforms. While modern data platforms simplify operations and solve many problems, they do not address the fundamental problem of getting actionable data to the consumers that want it. Building a business case for investing in a semantic layer strategy should focus on three important value drivers:

1. Increase the value of data by making it easier to combine more data sets, large data sets, and wider windows of data in a way that makes them accessible to more data consumers.

2. Reduce the cost of delivering analysis-ready data by optimizing cloud spend, simplifying data engineering and data ops, and making data consumers more productive.

3. Increase the number of data consumers by making it easier to interact with data assets. Improving data literacy in an organization opens up the flow of data-driven insights.

Take the Next Step

REQUEST A DEMO

CONTACT US

LEARN MORE
Read the Technology Platform Overview for AtScale’s Semantic Layers Solution
Buyer’s Guide: https://www.atscale.com/resource/wp-buyers-guide-semantic-layer/
Benchmark reports: https://www.atscale.com/resource/re-performance-benchmarks/
Get advice from fellow data & analytics leaders on how to scale smarter data-driven decision-making
Stay current on analytics strategies with short articles and webinars on topics in analytics strategies

About AtScale
AtScale enables smarter decision-making by accelerating the flow of data-driven insights. The company’s semantic layer platform simplifies, accelerates, and extends business intelligence and data science capabilities for enterprise customers across all industries. With AtScale, customers are empowered to democratize data, implement self-service BI and build a more agile analytics infrastructure for better, more impactful decision making. For more information, please visit www.atscale.com and follow us on LinkedIn, Twitter or Facebook.
