Data Mesh – New paradigm

CRISTIAN NECULA
TECHNOLOGY ARCHITECT

November 2021
Agenda

• What is data mesh?


• Definition, characteristics, why?
• Data Mesh architecture and usability
• Base architecture patterns
• Use cases
• Oracle solutions
• DEMO
Our customers are innovating…
The challenge we have is to meet our customers where they are today “as-is”
and help them on their journey to where they want to be in the future “to-be.”

Common customer initiatives include:

• Digital / business transformation – organizational, process and technology improvements designed to help with efficiency and competitiveness
• Migration to cloud or multi-cloud – improving agility and generating cost savings via infrastructure and SaaS partnerships
• Application Modernization – may include SaaS adoption and/or a shift to microservices and service mesh for agile and rapid feature delivery (CI/CD)
• Data analytics – including investments into data lakes, data lakehouses, modern data warehouses, data science and cloud-native BI tools
• Business continuity – ensuring continuous operations in the face of planned or unplanned downtime scenarios

But these initiatives can be very difficult, and many will fail. Oracle and the emerging Data Mesh approach can help.
…facing difficult odds
Because the old ways are not working well. Most business transformation initiatives fail. Most of the time and costs for digital platforms are sunk into ‘integration’ efforts [4]. The old monolithic tech architectures of the past are cumbersome, expensive, and inflexible. Additionally:
• 70-80% of digital transformations fail
• Cloud lock-in is real, and can become very costly
• Data Lakes rarely succeed, and are only analytics focused
• Organizational silos exacerbate issues, causing unnecessary problems with cross-functional programs
• Everything is speeding up: the pace of innovation, the speed of IT events, your competition, etc.
• Cost of operational data outage is rising
[Figure label: Data mess!]

Data Mesh is no silver bullet, but the principles, practices and technologies have been aligned to focus on solving some of the most pressing, unaddressed data modernization objectives that can help improve the success rate of business transformations.
New concept for data
Data Mesh approach:

1. Emphasizes cultural change, as a mindset shift towards thinking of data ‘as a product’ – which in turn can prompt organizational and process changes to manage data as a tangible, real capital asset of the business.
2. Calls for alignment across operational and analytic data domains. A Data Mesh aims to link data producers directly to data consumers and remove the IT middleman from the processes that ingest, prepare and transform data resources.
3. Technology platform built for ‘data in motion’ is a key indicator of success – a two-sided platform that links enterprise data producers and consumers. Data Mesh core is a distributed architecture for on-prem and multi-cloud data.
Data Fabric & Data Mesh

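[Figure: chart positioning Data Mesh and Enterprise Data Fabric. Axes: Batch to Real-time, and Centralized Data to Decentralized Data. Labels: “Data Mesh” (marked “This is today’s topic!”), “Enterprise Data Fabric”, “In some cases, may not support strong consistency”, “Always supports trusted, strong consistency”.]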
7 Copyright © 2021, Oracle and/or its affiliates
Evolution towards Real-Time Data Mesh
[Figure: three architecture stages on a scale from Monoliths to Distributed; the diagram carries several “ETL” labels.]

Lambda: [Link]
[Link]
Kappa: [Link]

Industry 3.0: Hub and Spoke
This data pattern, popularized by Ralph Kimball and Bill Inmon, has been the foundation for enterprise data management since 1993. It is transaction consistent, can scale up nicely for most use cases, and is based on SQL, the lingua franca for most tools.

Transitional: Kappa Hub
By 2010, the Lambda (big data) pattern was common. In 2014, Jay Kreps (of LinkedIn) questioned the Lambda Architecture and spawned Kappa. The Kappa principles consider batch processing as a special case of stream processing: a historized event log is used for both real-time and batch processing.

Mature: Distributed Kappa mesh & microservice controls
By 2020, IT infrastructure has dramatically changed – networking, containers, cloud, compute, IoT etc. have all pushed data to the edge. A mature Kappa architecture is not a single instance “hub” but rather a distributed mesh of data logs, stream data processing, change events, and time series data.
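The Kappa principle above (batch as a special case of stream processing over a historized event log) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the in-memory list standing in for the log, the `Event` shape, and the names `apply_event`, `replay` and `StreamProcessor` are hypothetical and not taken from any product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """Hypothetical event shape: position in the log, a key, and an amount."""
    offset: int
    key: str
    amount: int


def apply_event(state: dict, event: Event) -> dict:
    """The single piece of processing logic shared by both paths."""
    new_state = dict(state)
    new_state[event.key] = new_state.get(event.key, 0) + event.amount
    return new_state


def replay(log: list, from_offset: int = 0) -> dict:
    """'Batch' path: fold the same logic over the historized log."""
    state = {}
    for event in log:
        if event.offset >= from_offset:
            state = apply_event(state, event)
    return state


@dataclass
class StreamProcessor:
    """'Real-time' path: apply the same logic one event at a time."""
    state: dict = field(default_factory=dict)

    def on_event(self, event: Event) -> None:
        self.state = apply_event(self.state, event)


# Assumed sample log (illustrative data only).
LOG = [Event(0, "orders", 5), Event(1, "returns", 1), Event(2, "orders", 7)]

live = StreamProcessor()
for e in LOG:
    live.on_event(e)

batch_state = replay(LOG)
```

Because both paths call `apply_event`, a full replay and the incremental processor arrive at the same state, which is the property that lets a Kappa design drop the separate batch code path that Lambda requires.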
Beware the hype…
Since Data Mesh is a rising hot topic and still in the early days of maturity,
there may be some marketing content that uses the words “data mesh” but
the described solutions do not actually fit the core approach.

A proper Data Mesh is a mindset, an organizational model and an enterprise data architecture approach…it should have some mix of data product thinking, decentralized data architecture, event-driven actions and a streaming centric ‘service mesh’ style of microservices design.

A Data Mesh is not a…
• Single Cloud Data Lake …even with ‘domains,’ catalogs & SQL access
• Data Catalog / Graph …a data mesh needs a physical implementation
• Point-Product …no vendor has a singular product for Data Mesh
• IT Consulting Project …strategy/tactics still require platforms and tools
• Data Fabric …which is broadly inclusive of monolithic data architectures
• Self-Service Analytics …easy-to-use UX can front a mesh or a monolith
As the popularity of Data Mesh continues to increase, there will be many
bandwagon vendors/consultants, so it’s important to beware of the hype!
What is a Data Mesh?
A trusted Data Mesh is a data architecture approach focused on outcomes (data products), IT agility in a multi-cloud world (service mesh) using a decentralized architecture, trusted data of all kinds (polyglot data streams), and faster business innovation cycles (event-driven integration).

[Figure: layered view of a Data Mesh. Labels: intuitive, self-service user interfaces for data engineers and data product managers; data products; streaming event ledgers (app / IoT events and ACID data events, each shown as a numbered sequence 1 to 7); systems of truth (apps, microservices); decentralized / distributed / multi-cloud / edge; API driven, secure service mesh (K8S, Docker etc) that runs anywhere; data product KPIs, SLAs | business domain modeling | provenance & explainability.]

What is a Data Mesh?
Data Mesh is a data-tier architecture to integrate and govern enterprise data assets across distributed multi-cloud environments.

Microservices-centric:
• For the administration, deployment and monitoring of the core frameworks of data movement and governance
• Aligns with Service Mesh frameworks (K8S, Istio, etc)
• “Sidecar Proxy” style pattern for Events and Data

Immutable event-logs for data integrations:
• Messaging and data store events are globally accessible via immutable event logs
• Logs may be used to drive Streaming or Batch integrations

Distributed data movement of all types of data:
• A data mesh moves data: Relational, NoSQL, JSON, Graph…
• Relational data consistency (ACID) during data movement
• Must work reliably with enterprise OLTP data sets

[Figure: Data Mesh at the center, surrounded by Microservice Patterns, Log-based Event Integrations, Streaming, Service Mesh “Sidecars”, Immutable Logs, Domain Driven Design, Data Replication, Edge / 5G Frameworks, Polyglot Persistence, Polyglot Data Movement.]
[Link]
Four Key Attributes of a Data Mesh

A Data Mesh should not be just a new buzz word on top of an old tech architecture.

As Data Mesh aims to bring unique value, it must have unique attributes that are distinctly different from commonplace solutions that have already been around for decades.

These are four key attributes to be aware of.

1 DATA PRODUCT THINKING
2 DECENTRALIZED DATA ARCHITECTURE
3 EVENT-DRIVEN DATA LEDGERS
4 POLYGLOT DATA STREAMING
Data Mesh Attribute
1.) DATA PRODUCT THINKING
A mindset shift is the most important first step towards a Data Mesh. The willingness to embrace the learned practices of innovation is the springboard towards successful modernization of data architecture.

These learned practice areas include:
• Design Thinking – for solving ‘wicked problems’
• Jobs to be Done Theory – customer focused innovation, and the Outcome-Driven Innovation process

Design Thinking methodologies bring proven techniques that help break down the organizational silos frequently blocking cross-functional innovation. The Jobs to be Done Theory is the critical foundation for designing data products that fulfil specific end-consumer goals, or jobs to be done – it defines the product’s purpose.

The data product approach initially emerged from the data science community but is now going mainstream, being applied to all aspects of the data management discipline. It keeps the focus on the business outcomes, the data consumers… rather than the IT tech.

Data product thinking can be applied to other data architectures, but it is an essential part of a data mesh.

[Figure labels: Data Attributes, Data Product, Business Needs, User Needs]
Data Mesh Attribute
1a.) DATA PRODUCTS
[Figure labels: Data Products, Data Assets, Business Data, Digital Noise]

Products of any kind, from raw commodities to items at your local store, are produced as assets of value, intended to be consumed and with a specific ‘job to be done.’

Data products can take a variety of forms, depending on the business domain or problem to be solved, and may include:
• Analytics – historic/real-time reports & dashboards
• Data Sets – data collections in different shapes/formats
• Models – domain objects, data models, ML features
• Algorithms – ML models, scoring, business rules
• Data Services & APIs – docs, payloads, topics, REST APIs…

A data product is created for consumers, requiring tracking of additional attributes such as:
• Stakeholder Map – who creates and consumes this product?
• Packaging, Documentation – how is it consumed?
• Purpose & Value – implicit/explicit value? depreciation?
• Quality, Consistency – KPIs and SLAs of usage?
• Provenance, Lifecycle & Governance – trust & explainability?
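One way to make the tracked attributes above concrete is a small descriptor record per data product. The sketch below is only an illustration under stated assumptions: the `DataProduct` class, its field names and the sample values are hypothetical and do not come from the deck or from any catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor covering the attributes listed on the slide."""
    name: str
    form: str                     # e.g. "analytics", "data set", "model", "data service"
    producers: list               # stakeholder map: who creates it
    consumers: list               # stakeholder map: who consumes it
    documentation: str            # packaging: how it is consumed
    purpose: str                  # purpose & value: the job to be done
    slas: dict = field(default_factory=dict)        # quality: KPIs and SLAs
    provenance: list = field(default_factory=list)  # lineage of upstream sources

    def missing_attributes(self) -> list:
        """Return the names of tracked attributes that are still empty."""
        checks = {
            "stakeholders": bool(self.producers and self.consumers),
            "documentation": bool(self.documentation.strip()),
            "purpose": bool(self.purpose.strip()),
            "quality": bool(self.slas),
            "provenance": bool(self.provenance),
        }
        return [name for name, ok in checks.items() if not ok]


# Assumed example values, for illustration only.
orders = DataProduct(
    name="orders-curated",
    form="data set",
    producers=["order-management"],
    consumers=["finance-analytics"],
    documentation="One row per order line",
    purpose="Daily revenue reporting",
    slas={"freshness_minutes": 15},
)
gaps = orders.missing_attributes()
```

Here `gaps` lists only `"provenance"`, since no lineage was recorded. A check like this could gate publication of a product into a catalog, though how an organization enforces it is a design choice outside the scope of the slide.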
Data Mesh Attribute
1b.) CROSS-FUNCTIONAL DATA DOMAINS
The ‘wicked problem’ is often in aligning different cross-functional teams to common data domains – domains that require shared data sets, data models, business policies and business rules.

[Figure: grid of Data Domain A, Data Domain B and Data Domain C across Zone 1, Zone 2, Zone 3 and Zone 4, with these annotations:]
• Zones: data refinement zones, levels of curation… eg; may be across clouds, object store buckets, DB schema, etc
• Data domains: business domains, logical boundaries… may be ontology categories, data catalog tags, DDD bounded contexts, etc.
• Data products may be sourced from any zone
• Data products may exist at different refinement levels (eg; raw, curated, master, etc)
Data Mesh Attribute
2.) DECENTRALIZED DATA ARCHITECTURE
Decentralized IT systems are a modern reality, and with the rise of SaaS applications and public cloud infrastructure (IaaS), decentralization of applications and data is here to stay.
Application software architectures are shifting away from centralized monoliths and towards distributed microservices (a service mesh).
Data architecture will follow the same trend towards decentralization, with data becoming more distributed across a wider variety of physical sites and across many networks. We call this a Data Mesh.
Distributed software is hard. Just as nobody does microservices architecture because it is easy, nobody should try Data Mesh believing it is simple. There are many good reasons and many benefits to having modular, decentralized data, but a monolithic and centralized data architecture is often simpler.
When the business benefits from decentralized data, Data Mesh patterns can keep the solution manageable.

[Figure annotations:]
• decentralization may be across physical sites, cloud networks, or edge gateways
• data zones may reside in different physical data stores (obj store, databases, etc)
• data consumers might consume data products from any site/zone in the mesh
Data Mesh Attribute
2a.) MESH
The word ‘mesh’ means something specific – in tech, it is a particular kind of network topology setup so that a large group of non-hierarchical nodes can collaboratively work together.
[Figure caption: defining pattern of a mesh is non-hierarchical, collaborative network]
Some common tech examples include:
• WiFi Mesh – many nodes working together for better coverage
• ZWave/Zigbee – low-energy smart home device networks
• 5G Mesh – more reliable and resilient cell connections
• Starlink – satellite broadband mesh at global scale
• Service Mesh – a way to provide unified controls over decentralized microservices (application software)
Data Mesh is aligned to these mesh concepts, and provides a decentralized way of distributing data across virtual/physical networks and across vast distances.
Legacy data integration monoliths (such as ETL tools, data federation tools etc.) and even more recent public cloud services (such as AWS Glue) require highly centralized infrastructure.
A complete Data Mesh solution should be capable of operating in a multi-cloud framework, potentially spanning from on-premises, multiple public clouds, and even to the edge networks.
Data Mesh Attribute
2b.) DISTRIBUTED SECURITY
In a world where data is highly distributed and decentralized, the role of information security is paramount. Unlike highly centralized monoliths, distributed systems must delegate out the activities necessary to authenticate and authorize various users to different levels of access. Securely delegating trust across networks is hard to do well.
Some considerations include:
• Encryption at rest – as data/events are written to storage
• Distributed authentication – for services and data stores
  • Eg; mTLS, Certificates, SSO, Secret stores and data vaults
• Encryption in motion – as data/events are flowing in-memory
• Identity management – LDAP/IAM type services, cross-platform
• Distributed authorizations – for service end-points to redact data
  • For example: Open Policy Agent (OPA) [15] sidecar to place Policy Decision Point (PDP) within the container/K8S cluster where the microservice end point is processing. LDAP/IAM may be any JWT capable service.
• Deterministic masking – to reliably and consistently obfuscate PII data
Figure 1: distributed authorizations using OPA sidecar in microservices
Figure 2: distributed mTLS authentication using secure certificates
Security within any IT system can be difficult, and it is even more difficult to provide high security within distributed systems. However, these are solved problems with known solutions.
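As an illustration of the deterministic masking item above, the following sketch uses a keyed hash (HMAC-SHA256 from the Python standard library) so that the same PII value always maps to the same token on every node that holds the key. This is one possible technique, not necessarily the one used by the products discussed here; the function name `mask_pii`, the normalization rule, the demo key and the 16-character truncation are all assumptions.

```python
import hashlib
import hmac


def mask_pii(value: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable, non-reversible token for a PII value.

    The same (value, key) pair always yields the same token, so masked
    columns can still be joined across data stores.
    """
    normalized = value.strip().lower()  # assumed normalization rule
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    # Truncation shortens the token but raises collision probability;
    # the length here is an illustrative choice.
    return digest.hexdigest()[:length]


# Demo key only; a real deployment would fetch this from a secret store.
KEY = b"demo-key-do-not-use-in-production"

token_a = mask_pii("Howard@example.com", KEY)
token_b = mask_pii(" howard@example.com ", KEY)
token_c = mask_pii("someone.else@example.com", KEY)
```

`token_a` and `token_b` are identical because the inputs normalize to the same string, while `token_c` differs. Distributing the key securely to each site is the hard part, which is why the slide pairs masking with secret stores and data vaults.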
Data Mesh Attribute
3.) EVENT-DRIVEN DATA LEDGERS
Ledgers are a fundamental component of making a distributed data architecture function. Just as with an accounting ledger, a data ledger records the transactions as they happen.
When we distribute the ledger, the data events become ‘replayable’ in any location. Some ledgers are a bit like an airplane flight recorder, used for high availability and disaster recovery.
Unlike centralized and monolithic data stores, distributed ledgers are purpose-built to keep track of atomic events and/or transactions that happen in other (external) systems.
A Data Mesh is not just one single kind of ledger, and can make use of different types of event-driven data ledgers, depending on the use cases and requirements:
• General Purpose Event Ledger – such as Kafka or Pulsar
• Data Event Ledger – distributed CDC/Replication tools
• Messaging Middleware – including ESB, MQ, JMS, and AQ
• Blockchain Ledger – for secure, multi-party transactions

General Purpose Event Ledger
• optimized for high volumes
• simple payload semantics
• pub/sub interfaces
Data Event Ledger
• optimized for DB transactions
• ACID level Tx semantics
• point-to-point / point-to-broker
Messaging Ledgers
• optimized for guaranteed Tx’s
• transaction processing system semantics
• pub/sub interfaces, transient payloads
Blockchain Ledger
• optimized for multi-party transparency
• immutable transaction semantics
• API based interfaces (differs by type)
*included for completeness, but not discussed in depth

Together, these ledgers can act as a sort of durable event log for the whole enterprise…providing a running list of data events happening on systems of record and systems of analytics.
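The ‘replayable in any location’ property can be sketched with an append-only ledger and consumers that each track their own read offset. This is a toy in-memory model under stated assumptions: `EventLedger`, `Consumer` and the account/delta payload are hypothetical names, and real ledgers such as Kafka add partitioning, durability and retention that are not modeled here.

```python
class EventLedger:
    """Append-only, in-memory ledger; entries are never updated or deleted."""

    def __init__(self):
        self._entries = []

    def append(self, payload: dict) -> int:
        """Record an event and return its offset in the ledger."""
        offset = len(self._entries)
        self._entries.append((offset, dict(payload)))
        return offset

    def read(self, from_offset: int = 0) -> list:
        """Return all entries at or after the given offset."""
        return list(self._entries[from_offset:])


class Consumer:
    """A site that rebuilds state by replaying the ledger from its own offset."""

    def __init__(self, ledger: EventLedger):
        self.ledger = ledger
        self.next_offset = 0
        self.balances = {}

    def poll(self) -> int:
        """Apply any unseen events; return how many were applied."""
        entries = self.ledger.read(self.next_offset)
        for offset, payload in entries:
            account = payload["account"]
            self.balances[account] = self.balances.get(account, 0) + payload["delta"]
            self.next_offset = offset + 1
        return len(entries)


ledger = EventLedger()
ledger.append({"account": "A", "delta": 100})
ledger.append({"account": "B", "delta": 50})

site_a = Consumer(ledger)
first_poll = site_a.poll()          # site A catches up on two events

ledger.append({"account": "A", "delta": -30})

site_b = Consumer(ledger)           # a site that joins later
late_poll = site_b.poll()           # replays the full history
second_poll = site_a.poll()         # site A applies only the new event
```

Both sites end with the same balances even though they read the ledger at different times, which is the behavior the flight-recorder analogy describes.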
Data Mesh Attribute
4.) POLYGLOT DATA STREAMS
Data streams may vary by event types, payloads and different transaction semantics. A Data Mesh should support the necessary stream types for a variety of enterprise data workloads.

Simple Events:
• Base64 / JSON – raw, schemaless events
• Raw Telemetry etc. – sparse events
Basic App Logging / IoT Events:
• JSON / Protobuf – may have schema
• MQTT etc. – IoT specific protocols
Application Business Process Events:
• SOAP/REST Events – XML/XSD, JSON etc.
• B2B etc. – exchange protocols & standards
Data Events / Transactions:
• Logical Change Records – LCR, SCN, URID etc.
• Consistent Boundaries – commits vs. operations

[Figure: example payloads by stream type]
Telemetry Events (devices & things): simple, flat & record at a time
    T3VyIG1pc3Npb24gaXMgdG8gaGVscCBw
    ZW9wbGUgc2VlIGRhdGEgaW4gbmV3IHdh
    eXMsIGRpc2NvdmVyIGluc2lnaHRzLCB1
    bmxvY2sgZW5kbGVzcyBwb3NzaWJpbGl0
    aWVzLg==
App/Process Events (biz process & logging): record at a time, records have simple schema
    syntax = "proto3";
    package moviecatalog;
    message MovieItem {
      string name = 1;
      double price = 2;
      bool inStock = 3;
    }
XML example: may be deeply nested, complex schemas
    <?xml version="1.0" encoding="utf-8"?>
    <Root xmlns="[Link]
      <Customers>
        <Customer CustomerID="GREAL">
          <ContactName>Howard</ContactName>
          <ContactTitle>Manager</ContactTitle>
Data Events (ACID transactions): follows DB log / transaction boundaries
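The difference between commits and individual operations can be shown with a short sketch that buffers change records per transaction and releases them only when a commit arrives. The record layout (`txn`, `op`, `table`) is a hypothetical simplification; real logical change records carry more detail (the slide mentions LCR, SCN and URID) and differ by database and tool.

```python
from collections import defaultdict


def committed_transactions(change_records):
    """Yield (txn_id, operations) only once a COMMIT record is seen.

    Operations of uncommitted or rolled-back transactions are never
    emitted, so downstream consumers see consistent boundaries.
    """
    pending = defaultdict(list)
    for record in change_records:
        txn, op = record["txn"], record["op"]
        if op == "COMMIT":
            yield txn, pending.pop(txn, [])
        elif op == "ROLLBACK":
            pending.pop(txn, None)
        else:
            pending[txn].append(record)


# Assumed sample change stream with two interleaved transactions.
RECORDS = [
    {"txn": 1, "op": "INSERT", "table": "orders"},
    {"txn": 2, "op": "UPDATE", "table": "stock"},
    {"txn": 1, "op": "INSERT", "table": "order_lines"},
    {"txn": 2, "op": "ROLLBACK"},
    {"txn": 1, "op": "COMMIT"},
]

applied = [
    (txn, [r["table"] for r in ops])
    for txn, ops in committed_transactions(RECORDS)
]
```

Only transaction 1 is emitted, with both of its inserts together; the rolled-back update to `stock` never reaches the consumer. A record-at-a-time stream without this grouping would have exposed the partial state.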


Data Mesh Attribute
4a.) STREAM DATA PROCESSING
Stream processing is how data is manipulated within an event stream. Unlike ‘lambda functions’ the stream processor maintains statefulness of data flows within a particular time window.
Basic Data Filtering:
• Thresholds, alerts, telemetry monitoring etc.
Simple ETL:
• RegEx functions, math/logic, concatenation
• Record-by-record, substitutions, masking
CEP & Complex ETL:
• Complex Event Processing (CEP)
• DML (ACID) processing, groups of tuples
• Aggregates, lookups, complex joins etc.
Stream Analytics:
• Time series analytics, custom time windows
• Geospatial, machine learning and embedded AI

[Figure: (1) systems of record produce raw data events; (2) data processing in one or more data pipelines / streams (1..n); (3) data loaded to data services, ledgers, storage or DWs (1..n). Side labels: Enterprise Data Ledgers; Systems of Analysis / Engagement. Capabilities shown: Filter, Aggregate, Correlate/Enrich, Thresholds, Joins; Queries, Data Patterns, Windowing, Data Policies, Business Rules; Time Series, Spatial Analytics, Anomalies, Classification, Scoring Models.]
Page 21 - Enterprise Data Mesh: Solutions, Use Cases and Case Studies
“By integrating real-time operational data and analytics, companies can make better operational and strategic decisions.” [11]

Seven Data Mesh Use Case Examples
A successful Data Mesh fulfills use cases for Operational as well as Analytic Data domains.
The following seven use cases illustrate the breadth of capabilities that a Data Mesh brings to enterprise data.

[Figure: data mesh use cases]
• App Modernization
• Stream Analytics
• Event Sourcing
• Streaming Ingest
• Data Availability
• Integration
• Data Pipelines
DATA MESH USE CASES APPLY TO OPERATIONAL & ANALYTIC SYSTEMS
[Figure: the seven use cases (App Modernization, Data Availability, Event Sourcing, Integration, Streaming Ingest, Data Pipelines, Stream Analytics) shown with the data mesh and three groups of systems.
Systems of Record (SoR): Sources of Truth, Data Providers / Producers, Core Business Processes, Systems of Engagement.
Systems of Interchange.
Systems of Analysis: Decision Support, Data Science, Predictive Analytics, Data Visualization.]

Use Case Examples from eBook – Operational
[Figure labels: Systems of Record (SoR), Sources of Truth, Data Providers / Producers, Core Business Processes, Systems of Engagement]

Use Case Examples from eBook – Integration & Analytics
[Figure labels: Systems of Interchange, Systems of Analysis, Decision Support, Data Science, Predictive Analytics, Data Visualization]
BENEFIT FROM A DATA MESH ON POINT-PROJECTS…
(Operational & Analytic use cases)

[Figure. Use cases: App Modernization, Streaming Ingest, Event Sourcing, Data Pipelines, Integration, Data Availability, Stream Analytics. Systems and data stores: Application Monoliths, Application Microservices, Geo-Distributed Data, Edge Devices, ODS, Data Lake (house), Data Warehouse, Marts. Consumer Interfaces: Data Visualization, Data / Event Services, SQL Data Access, Notebooks / ML.]

Data Mesh
PATTERN ARCHETYPE
[Figure: pattern archetype. Components shown: Operational Systems of Record & Microservices (IoT Events, Edge Platform; SaaS / App Events & Sys Logs; OLTP); General Purpose Event Ledger; MOM / IPaaS; Database Event Ledger; Stream Processing & Analytics; Systems of Analysis/Engagement ({ Data / Event Services }, Multi-model Database/s, Data Lake (house), Data Warehouse). Latency scale: milliseconds to seconds. Underlying layers: Governance – Security (distributed), Data Verification, Data Catalog, Registry, Policies; Serverless or Service Mesh (multi-cloud) Deployments.]


Data Mesh
USE CASE INSTANTIATION
App Modernization
Event Sourcing
[Figure: the pattern archetype diagram, repeated for these use cases]


Data Mesh
USE CASE INSTANTIATION
Data Availability
[Figure: the pattern archetype diagram, repeated for this use case]


Data Mesh
USE CASE INSTANTIATION
Stream Analytics
[Figure: the pattern archetype diagram, repeated for this use case]


Data Mesh
USE CASE INSTANTIATION
Streaming Ingest
[Figure: the pattern archetype diagram, repeated for this use case]


Data Mesh
USE CASE INSTANTIATION
Data Pipelines
[Figure: the pattern archetype diagram, repeated for this use case]


Data Mesh
CONCRETE EXAMPLES
[Figure: example products placed on the pattern archetype. Panel labels: Oracle & Hybrid, Oracle Cloud, (multi-cloud), Open Source. Names shown include API Platform, IoT Cloud, GoldenGate, GoldenGate Stream Analytics, ExaCS & ExaCC, Autonomous Data Warehouse, OCI Data Catalog, Oracle Cloud Infrastructure Compute or Container Services, Kinesis -or- Event Hub, Confluent, AWS SQS, Athena, Cosmos, EMR, ADLS, Delta Lake, Redshift, Synapse, Snowflake, PowerBI, Glue/Azure Catalog, OpenRemote, MySQL, Postgres, WSO2 / RabbitMQ, Apache Spark, Debezium, Apache Hive, Open Policy Agent, Egeria.]

Noteworthy Technology Layers:
• IoT/Edge – for gateways, edge nodes, telemetry collection
• Message-oriented Middleware – for event-driven business process integrations
• Data Events/CDC – DB transaction events, full ACID consistency etc.
• Event Streams – scale-out and partitioned event store
• Stream Processing, Analytics – stateful, windowed complex stream processing
• Security – distributed authentication/authorization across VCNs
• Data Catalog / Registry – semantic alignment of entities, schemas, and registries
• Data Verification – auditable verification of data consistency across data stores
• Serverless / Service Mesh – depending on public cloud or self-operated
Data Mesh
SINGLE CLOUD OR MULTI-CLOUD
Single Cloud:
• Simpler – fewer networks, identity domains, etc
• Serverless – more opportunities to use ‘pay per use’ services
• Homogeneous – major commitment to single vendor solutions
Multi-Cloud:
• Decentralized – introduces greater complexity (events, security, networking, etc)
• Service Mesh & Serverless – IT may have to operate ‘as a service’ containers & K8S
• Heterogeneous – empowers best-of-breed and greater portability, reuse of services

[Figure: single cloud vs. multi-cloud Data Mesh. Labels: Self-Service GUI, Stream Processing, Ingest, Data Lake House, Edge Analytics, Edge Gateways, Enterprise Applications, Athena, Exadata Cloud@Customer, RDS, Redshift.]

Data Mesh
INSTANTIATING DATA PRODUCTS
• Master / Prepared Data Service
• Raw Event Stream (App Events)
• Multi-model Data Sets (Databases)
• Analytic Dashboards (Streaming Data)
• Raw Data Sets (Historic/Archive)
• Curated Data Sets (Data Lake)
• Raw Event Stream (Data Events)
• Analytic Dashboards (Stored Data)
• Curated Data Sets (Data Warehouse)


Note the obvious: Data Mesh is not for everyone!

Centralized “Hub style” Solutions

Departmental Analytics (Departmental)
Move/create your mart in the cloud to optimize costs, be more agile, have less effort and be able to grow easily.
• Move to Cloud
• + cloud economics
• + cost optimization
• + scalability
• + governance
• + agility

Enterprise Data Warehouse (Modern DW)
Move/create your DW in the cloud to optimize costs, to have real time visibility into business and extract more value out of governed data.
• Move to Cloud or evolve Departmental Analytics
• + real time ingestion
• + Infused ML & AI
• + data virtualization
• + governance

Data Lake (Enterprise DL)
Augment DW or create a cloud native data lake that allows exploring new sources of data with larger volumes that can be processed at scale and in real time.
• Move to cloud or evolve EDW
• + new sources
• + scale out processing
• + stream analytics
• + real time data pipelines

Decentralized / “Multi”

Trusted Data Mesh (Enterprise Data Mesh)
Connect operational Systems of Record (SoR’s) to systems of Analysis. Can be multi-region, multi-cloud, multi-lake etc. Default architecture mode is event-driven, real-time, stream integration etc.
• Support data arch across many different cloud vendors
• + Data Fabric
• + Data Mesh
• + Real time data pipelines with infused intelligence
• + Edge computing ‘hooks’
• + Microservices design patterns

Which Oracle technology to lead with?

• Converged, multi-model Database
• Stream Analytics
• Apex & other self-service tools
• GoldenGate
• Integration Cloud / SOA
• Oracle Data Science Platform
• Oracle Cloud Infrastructure
• Oracle App-Dev Tools for Microservices
Differentiation
TOP 3 ORACLE DATA MESH STRENGTHS

1 DECENTRALIZED DATA EVENTS
Oracle highlights…
• #1 vendor for real-time data fabric
• comprehensive platform for all real-time and streaming data needs
• only vendor to focus on full Tx consistency
• first to fully shift to microservices
…the other guys
• Eg; open source tools like Kafka are not transactionally consistent
• Eg; other vendor tools for replication aren’t HA/DR solutions

2 HYBRID DATA MANAGEMENT
Oracle highlights…
• Oracle is only hyperscale cloud vendor with balanced commitment to on-premise, cloud at customer and support for other clouds
• Means you can run your data on any infra with full support from Oracle data mgmt. software or cloud services
…the other guys
• Eg; AWS is total data ‘lock-in’
• Eg; Azure & GCP are weak for on-prem, multi-cloud data mgmt.
• Eg; niche vendors can’t help with whole data estate

3 MULTI-MODEL DATABASE
Oracle highlights…
• Gartner recognized #1 database
• ideal for ‘data product’ consumption
• optimize for legacy or modern applications & microservices
• simplified management & lifecycle of data assets in many (data consumer) formats
…the other guys
• Eg; AWS requires 15+ different DB engines for multi-model (different security models, SQL semantics, admin, lifecycle, etc.)

Integration Events with GoldenGate

Event Stream Processing: Data Pipelines, Data Transformation, GoldenGate Integrations, Time Series Analysis, Geo-Fencing, Predictive Analytics
Non-Relational Integrations: Data Lake Ingest, Streaming Ingest, Cloud Ingest, Messaging Replication, NoSQL Replication, SaaS Replication
DB Event Replication: Unidirectional, Bi-Directional, Peer-to-Peer, Broadcast, Consolidation, Distribution

Case Study Examples from eBook – Operational
Demo

Useful links

Data mesh eBook
[Link]

LinkedIn Blog series
[Link]
pollock/?trackingId=et2xhDIRZtL7Uks9xPOqMg%3D%3D

Enterprise Data Mesh and GoldenGate
[Link]

GoldenGate Stream Analytics hands-on workshop
[Link]

YouTube videos
[Link]
[Link]
