Data Mesh - New Paradigm: Cristian Necula
CRISTIAN NECULA
TECHNOLOGY ARCHITECT
November 2021
Agenda
But these initiatives can be very, very difficult … and many will fail. Oracle
and the emerging Data Mesh approach can help.
…facing difficult odds
Because the old ways are not working well. Most business
transformation initiatives fail. Most of the time and costs for
digital platforms are sunk into ‘integration’ efforts 4. The old
monolithic tech architectures of the past are cumbersome,
expensive, and inflexible. Additionally:
Data mess!
• 70-80% of digital transformations fail
• Cloud lock-in is real, and can become very costly
• Data Lakes rarely succeed, and are only analytics focused
• Organizational silos exacerbate issues, causing
unnecessary problems with cross-functional programs
• Everything is speeding up: the pace of innovation, the
speed of IT events, your competition, etc.
• The cost of operational data outages is rising
[Figure: quadrant chart. Vertical axis: Batch to Real-time. Horizontal axis: Centralized Data to Decentralized Data. Labels: Data Mesh ("This is today’s topic!"), Enterprise Data Fabric. Annotations: "In some cases, may not support strong consistency"; "Always supports trusted, strong consistency".]
7 Copyright © 2021, Oracle and/or its affiliates
Evolution towards Real-Time Data Mesh
From Monoliths to Distributed, in three stages:
[Figure: architecture diagrams for the three stages; recoverable label: ETL]

Industry 3.0: Hub and Spoke
This data pattern, popularized by Ralph Kimball and Bill Inmon, has been the
foundation for enterprise data management since 1993. It is transaction
consistent, can scale up nicely for most use cases, and is based on SQL, the
lingua franca of most tools.

Transitional: Kappa Hub
By 2010, the Lambda (big data) pattern was common. In 2014, Jay Kreps (of
LinkedIn) questioned the Lambda Architecture and spawned Kappa. The Kappa
principles consider batch processing as a special case of stream processing:
use a historized event log for both real-time and batch processing.

Mature: Distributed Kappa mesh & microservice controls
By 2020, IT infrastructure had changed dramatically: networking, containers,
cloud, compute, IoT, etc. have all pushed data to the edge. A mature Kappa
architecture is not a single instance “hub” but rather a distributed mesh of
data logs, stream data processing, change events, and time series data.

Lambda: [Link] [Link]
Kappa: [Link]
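The Kappa principle described on this slide (batch as a special case of stream processing over a historized event log) can be illustrated with a minimal sketch. Everything here is a hypothetical, in-memory stand-in: `EventLog`, `Event`, `apply_event`, and the account keys are invented for illustration and do not correspond to any Oracle or Kafka API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Event:
    offset: int   # position of the event in the log
    key: str      # hypothetical business key, e.g. an account id
    amount: int   # hypothetical payload


@dataclass
class EventLog:
    """Append-only, historized log: events are never updated or deleted."""
    _events: List[Event] = field(default_factory=list)

    def append(self, key: str, amount: int) -> Event:
        event = Event(offset=len(self._events), key=key, amount=amount)
        self._events.append(event)
        return event

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        """Replay every event at or after `offset`."""
        return iter(self._events[offset:])


def apply_event(totals: Dict[str, int], event: Event) -> None:
    """The single processing step shared by batch and real-time paths."""
    totals[event.key] = totals.get(event.key, 0) + event.amount


log = EventLog()
log.append("acct-1", 100)
log.append("acct-2", 50)

# "Batch": replay the whole history through the stream processor.
totals: Dict[str, int] = {}
next_offset = 0
for event in log.read_from(0):
    apply_event(totals, event)
    next_offset = event.offset + 1

# "Real-time": a new event arrives and goes through the same function.
log.append("acct-1", -30)
for event in log.read_from(next_offset):
    apply_event(totals, event)
    next_offset = event.offset + 1

# A full replay from offset 0 reproduces the incrementally built view.
rebuilt: Dict[str, int] = {}
for event in log.read_from(0):
    apply_event(rebuilt, event)
```

The point of the sketch is that there is only one code path (`apply_event`); whether it runs over history or over newly arrived events is just a matter of which offset the reader starts from.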
Beware the hype…
Because Data Mesh is a hot topic that is still early in its maturity,
some marketing content uses the words “data mesh” for solutions that
do not actually fit the core approach.
[Figure: data mesh architecture. Recoverable labels: microservices; Apps; app / IoT events (event log, positions 1 to 7); ACID data events (event log, positions 1 to 7).]
API driven, secure service mesh (K8S, Docker etc) that runs anywhere
data product KPIs, SLAs | business domain modeling | provenance & explainability
1 DATA PRODUCT THINKING
4
different from commonplace solutions that
have already been around for decades.
Page 21 - Enterprise Data Mesh: Solutions, Use Cases and Case Studies
“By integrating real-time operational
data and analytics, companies can
make better operational and strategic
decisions.” 11
Use Case Examples from eBook – Integration & Analytics
DATA MESH USE CASES APPLY TO OPERATIONAL & ANALYTIC SYSTEMS
[Figure labels: Data Availability; Integration Data Pipelines; Sources of Truth; Systems of Engagement; Systems of Interchange; Systems of Analysis; Decision Support; Data Science; Predictive Analytics; Data Visualization]
BENEFIT FROM A DATA MESH ON POINT-PROJECTS…
(Operational & Analytic use cases)
[Figure labels: Application Monoliths; Application Microservices; Consumer Interfaces]
Page 26 - Enterprise Data Mesh: Solutions, Use Cases and Case Studies
Data Mesh
PATTERN ARCHETYPE
[Figure: pattern archetype diagram. Recoverable labels: General Purpose Event Ledger; { Data / Event Services }; Systems of Analysis/Engagement; IoT Stream; IoT Events, Edge Platform; Operational Systems of Record; SaaS / App Events & Sys Logs; Database Event Ledger; OLTP; Data Lake (house); Data Warehouse. Latency labels: milliseconds, seconds.]
Event Sourcing
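As a rough illustration of the Event Sourcing pattern named on this slide: current state is not stored directly but derived by replaying an append-only ledger of events. The sketch below is a generic, in-memory example; the order domain, the event kinds, and the names `OrderEvent`, `OrderState`, `evolve`, and `replay` are all hypothetical and are not taken from the deck or from any Oracle product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OrderEvent:
    seq: int      # position in the ledger
    kind: str     # "created", "item_added", or "shipped" (hypothetical)
    qty: int = 0


@dataclass
class OrderState:
    exists: bool = False
    items: int = 0
    shipped: bool = False


def evolve(state: OrderState, event: OrderEvent) -> OrderState:
    """Pure transition: return the next state, never touch the ledger."""
    if event.kind == "created":
        return OrderState(exists=True)
    if event.kind == "item_added":
        return OrderState(state.exists, state.items + event.qty, state.shipped)
    if event.kind == "shipped":
        return OrderState(state.exists, state.items, True)
    return state  # unknown event kinds are ignored in this sketch


def replay(ledger: List[OrderEvent], upto: Optional[int] = None) -> OrderState:
    """Fold the ledger into a state, optionally stopping at sequence `upto`."""
    state = OrderState()
    for event in ledger:
        if upto is not None and event.seq > upto:
            break
        state = evolve(state, event)
    return state


ledger = [
    OrderEvent(0, "created"),
    OrderEvent(1, "item_added", qty=2),
    OrderEvent(2, "item_added", qty=1),
    OrderEvent(3, "shipped"),
]

current = replay(ledger)               # state after all events
as_of_seq_1 = replay(ledger, upto=1)   # state as it was after event 1
```

Because the ledger is the source of truth, any earlier state can be reconstructed by stopping the replay at the desired sequence number, which is what `upto` does here.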
[Figure: multi-cloud deployment diagram. Recoverable labels: OCI Data Catalog & GoldenGate Stream Analytics; Glue/Azure Catalog & GoldenGate Stream Analytics; Oracle Cloud Infrastructure Compute or Container Services; Enterprise Applications; Stream Processing; Ingest; Athena; Exadata Cloud@Customer; Data Lake House; RDS; Redshift.]
Page 34 - Enterprise Data Mesh: Solutions, Use Cases and Case Studies
Data Mesh
INSTANTIATING DATA PRODUCTS
• Master / Prepared Data Service
• Converged, multi-model Database
• Stream Analytics
• Integration Cloud / SOA
• Oracle Data Science Platform
• Oracle Cloud Infrastructure
• Oracle App-Dev Tools for Microservices
Differentiation
TOP 3 ORACLE DATA MESH STRENGTHS
1 DECENTRALIZED DATA EVENTS
Oracle highlights…
• #1 vendor for real-time data fabric
• comprehensive platform for all real-time and streaming data needs
• only vendor to focus on full Tx consistency
• first to fully shift to microservices
…the other guys
• e.g., open source tools like Kafka are not transactionally consistent
• e.g., other vendor tools for replication aren’t HA/DR solutions

2 HYBRID DATA MANAGEMENT
Oracle highlights…
• Oracle is the only hyperscale cloud vendor with balanced commitment to on-premise, cloud at customer and support for other clouds
• Means you can run your data on any infra with full support from Oracle data mgmt. software or cloud services
…the other guys
• e.g., AWS is total data ‘lock-in’
• e.g., Azure & GCP are weak for on-prem, multi-cloud data mgmt.
• e.g., niche vendors can’t help with the whole data estate

3 MULTI-MODEL DATABASE
Oracle highlights…
• Gartner recognized #1 database
• ideal for ‘data product’ consumption
• optimize for legacy or modern applications & microservices
• simplified management & lifecycle of data assets in many (data consumer) formats
…the other guys
• e.g., AWS requires 15+ different DB engines for multi-model (different security models, SQL semantics, admin, lifecycle, etc.)
Page 38 - Enterprise Data Mesh: Solutions, Use Cases and Case Studies
Integration Events with GoldenGate
Event Stream Processing: Data Pipelines; Data Transformation; GoldenGate Integrations; Time Series Analysis; Geo-Fencing; Predictive Analytics
Non-Relational Integrations: Data Lake Ingest; Streaming Ingest; Cloud Ingest; Messaging Replication; NoSQL Replication; SaaS Replication
Replication
Video youtube
[Link]
[Link]