You are on page 1of 37

Build modern

applications with
purpose-built databases
Justin Thomas
AWS Principal Neptune Specialist

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda

• New data requirements for modern applications


• Why purpose-built databases?
• Picking the right tool for the job
• Resources

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
New data requirements
for modern applications

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Rapid expansion of data requirements

Explosion of data Microservices change data Accelerated rate of change


and analytics requirements driven by DevOps

Dev Ops

Data grows 10x every 5 years Microservices architecture Transition from IT to


driven by network- decreases need for “one size fits all’ DevOps increases rate of
connected smart devices databases and increases need for change
real-time monitoring and analytics

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building modern applications with purpose-built
databases

Relational Key-value Document In-memory Graph Time-series Ledger Wide Column

Referential High Store documents Query by key Quickly and easily Collect, store, Complete, Scalable, highly
integrity, ACID throughput, and quickly with create and and process immutable, and available, and
transactions, Low latency access querying microsecond navigate data sequenced verifiable history managed Apache
schema- reads and writes, on any attribute latency relationships by time of all changes to Cassandra-
on-write endless scale between data application data compatible service

AWS
Service(s) ElastiCache &
Aurora RDS DynamoDB DocumentDB Neptune Timestream QLDB Keyspaces
MemoryDB Managed Cassandra

Lift and shift, Real-time Content Caching, session Fraud detection, IoT Systems Build low-latency
Common
Use ERP, CRM, bidding, management, management, social applications, of record, applications,
Cases finance shopping cart, personalization, gaming networking, event tracking supply chain, leverage open
social, product mobile leaderboards, recommendation health care, source, migrate
catalog, geospatial engine registrations, Cassandra to the
customer applications financial cloud
preferences

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The Walt Disney Company leverages AWS’s proven global
infrastructure to improve performance and reliability for
Disney+, as one of the largest online streaming services
continues to scale globally.


AWS has been our preferred cloud provider for
years, and its proven global infrastructure and “
• Preferred cloud provider: The Walt Disney Company relies
on AWS as its preferred public cloud infrastructure provider
to support the explosive growth of Disney+, which quickly
expansive suite of services has contributed reached its five-year goal of registering 100 million
subscribers in 59 countries, 9 months after launch
meaningfully to the incredible success of Disney+.
• Purpose-built: Disney+ uses more than 50 AWS technologies,
—Joe Inzerillo like machine learning, database, storage, content delivery,
Executive Vice President & CTO, Direct-to-Consumer serverless, and analytics, including Amazon Kinesis,
The Walt Disney Company Amazon DynamoDB, and Amazon Timestream, for
quality streaming experiences

• Performance at scale:” Disney+ handled three billon


requests for Disney+ content in the first three days post
launch. It also successfully managed huge peaks in demand
for premium content, like “Hamilton,” “Mulan,” and “The
Mandalorian

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Picking the right
tool for the job

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Relational data
Patient
* Patient ID
First name
Divide data among tables Last name
Gender
Highly structured DOB *
Visit
Visit ID

Relationships established via * Doctor ID * Patient ID

* Hospital ID
keys enforced by the system Date
Doctor
Data accuracy and consistency * Doctor ID * Treatment ID

First name
Last name
Medical specialty Medical Treatment
* Hospital affiliation * Treatment ID
Procedure
How performed
Hospital Adverse outcome
* Hospital ID
Contraindication
Name
Address
Rating

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Aurora

MySQL and PostgreSQL-compatible relational database built for the cloud

Performance Availability Highly Fully


& scalability & durability secure managed
5x throughput of standard Fault-tolerant, self-healing Network isolation, Managed by Amazon RDS:
MySQL and 3x of standard storage; 6 copies of encryption at On your part, no server provisioning,
PostgreSQL; scale out up data across 3 AZs; rest / in transit software patching, setup,
to 15 read replicas continuous backup to Amazon S3 configuration, or backups

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
A leader in 3D design, engineering, and entertainment
software, Autodesk makes software for people who make
things

Challenge:
Their ACM (Access Control Management) app connected to a
single RDS MySQL database instance, limiting the available
scaling options

Solution:
They chose Amazon RDS for MySQL first, and migrated to
Aurora for improved scalability, availability, and performance

Result:
• Aurora provides up to five times the throughput of
standard MySQL databases
• Aurora scales up to 64 TB, with no need to scale manually
• Up to 15 low-latency Aurora Replicas improve availability

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Key–value data

PUT {
Simple key–value pairs
TableName:"Gamers",
Partitioned by keys Item: {
Gamers
Resilient to failure "GamerTag":"Hammer57",
Primary key Attributes
High-throughput, low- "Level":21,
Gamer tag Level Points High score Plays
"Points":4050,
latency reads and writes Hammer57 21 4050 483610 1722
"Score":483610,
Consistent performance "Plays":1722
FluffyDuffy 5 1123 10863 43
at scale } }
Lol777313 14 3075 380500 1307
Jam22Jam 20 3986 478658 1694
ButterZZ_55 7 1530 12547 66
… … … … …

GET {
TableName:"Gamers",
Key: {
"GamerTag":"Hammer57“,
“ProjectionExpression“:”Points”
} }

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon DynamoDB
Fast and flexible key value database service for any scale

Performance Serverless architecture Enterprise Global replication


at scale security

Consistent, single-digit No hardware provisioning, Encrypts all data by Build global applications
millisecond response times at software patching, or upgrades; default and fully integrates with fast access to local data
any scale; build applications scales up or down automatically; with AWS Identity and by easily replicating tables
with virtually unlimited continuously backs up your data Access Management for across multiple
throughput robust security AWS Regions

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Dropbox saves millions by building
scalable metadata store on AWS

Challenge:
Dropbox had a capacity crunch in its on-premises metadata
store due to fast data growth in some partitions. It had three
options: double its on-premises storage, delete swaths of
metadata, or find a new solution.

Solution:
Dropbox turned to Amazon DynamoDB and Amazon Simple
Storage Service for its new managed storage system, which
saved the company millions of dollars and has room for
virtually unlimited user metadata.

Results:
• Launched metadata storage system on AWS in 1 year
• Cut cost per user gigabyte by a factor of 5.5
• Ingests data at 4,000–6,000 queries per second

Amazon
© 2021, Amazon
Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB S3
Wide column: Apache Cassandra

• Open-source, wide-column data store


• Large scale applications that require fast read and write performance
• Use cases:
• User profiles
• Device metadata
• Time-series data
• Transaction logging

• Cassandra Query Language (CQL)

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Keyspaces (for Apache Cassandra)
Scalable, highly available, and managed Apache Cassandra-compatible database
service

Single-digit
Apache Cassandra- No servers to millisecond Highly available and
compatible manage performance at scale secure

Use the same Cassandra No need to provision, Scale tables up and down 99.99% availability SLA
drivers and tools configure, and operate automatically within an AWS Region
large Cassandra
clusters Virtually unlimited throughput Data encrypted at rest;
and storage integrated with IAM

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Check-In: A PwC product

Pseudonymized
contact tracing
PwC
Links to
employees only
to determine
potential
exposure

Mobile application Backend tech Dashboard

Users either install a mobile app Associations between signals Authorized users input name of
on their phones or carry a observed by anonymized impacted employee; ACT
hardware device such as a phones enables proximity instantly provides a list of
beacon tracing between employees potentially at-risk colleagues

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Why use a document database?

The JSON document model maps naturally to application data

Each document can have a different data structure and


is independent of other documents

Index on any key in a document, and run ad hoc and


aggregation queries across your data set

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon DocumentDB
Fast, scalable, and fully managed MongoDB-compatible database service

Fast Scalable Fully managed MongoDB compatible

Millions of requests per second Separation of compute and Managed by AWS: Compatible with MongoDB 3.6
with millisecond latency storage scales both no hardware provisioning; and 4.0; use the same SDKs,
independently; scale out to auto patching, quick setup, tools, and applications with
15 read replicas in minutes secure, and automatic Amazon DocumentDB. Migrate
backups workloads with AWS DMS.

Purpose-built document database


engineered for the cloud
© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
FINRA better manages millions of documents with DocumentDB

FINRA processes 4 million filings and filing


attachments per year. The regulator’s existing
solution to store millions of documents as
XML in a relational database was consuming
too much storage, required custom tooling,
and was difficult to manage.

FINRA chose Amazon DocumentDB (with


MongoDB compatibility) as a managed JSON
document store, making it simpler to query


Amazon and index regulatory documents, reduce
One of the reasons that we picked a managed service is we DocumentDB development cycles, and extend usability of
are big in requirements around security, audit, versioning,
data.
records management. All of these are very important to us,
so by going to a managed service like using
DocumentDB…we get all of these things free for us so that
we do not have to worry about building a lot of this by Unlike in a relational DB, the storage and


ourselves. compute are decoupled in DocumentDB,
which enables FINRA’s databases to scale
- Ranga Rajagopal, Senior Director, Enterprise Data Platforms, automatically, independently and quickly,
FINRA regardless of the size of their data.

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
In-memory databases: usage patterns

Caching Real-time Gaming Geospatial Media


analytics store leaderboards streaming store

Session Chat apps Job Machine learning


store pub/sub queue real-time model scoring

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The Pokémon Company migrates
to AWS purpose-built databases

Challenge:
The Pokémon Company International wanted to address the
complexity of managing NoSQL and memory-caching systems
for a database of more than 300 million users.

Solution:
The company uses Amazon Aurora as its main user database,
Amazon DynamoDB to reduce bot login attempts, and Amazon
ElastiCache to enable dynamic caching of user logins.

Results:
• Cuts monthly costs by tens of thousands of dollars
• Reduces number of nodes from 300 to 30
• Experiences zero hours of downtime or performance
degradation after migration

Amazon Amazon Amazon


Aurora DynamoDB ElastiCache

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is a Graph?

• Graph databases are purpose-built to store and navigate relationships.

• Nodes represent real-world objects

• Edges store relationships between objects

• Properties and Labels can be added to both Nodes and Edges

Node Node
Edge

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
A Social Network is a Graph

• Nodes: People

• Edges: Friend Relationships

• Edges can have direction

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Graph Query languages- Designed to move
through data
Graphs query languages are optimized to use connections to move through a network

Jack Annie Sam


Friend Friend

Relational queries work by combining sets of data, via joins or recursive CTE
Person Friends Friends of a Friend
Perso
Perso Name Person Person Person
Friend Person Person Person n Friend Person Person Person
nID Name Name Name
ID ID1 Name1 ID2 Name ID ID1 ID2 ID3
P1 Jack X 2 = 1 2 3

E1 P1 Jack P2 Annie E4 P1 Jack P2 Annie P3 Sam


P2 Annie
E2 P2 Annie P3 Sam
P3 Sam
© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Why use a graph?

Social Knowledge Fraud Life Network & IT


Recommendations operations
networking graphs detection Sciences

With connected datasets


• Where relationships between data matter just as much as the data itself
• When results are dependent on the basis of the strength, weight, or quality of relationships

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Neptune
Fast, reliable graph database built for the cloud

OPEN FAST RELIABLE EASY

Supports Apache Query billions of 6 replicas of your data Build powerful


TinkerPop, relationships with across 3 AZs with full queries easily with
openCypher, W3C millisecond latency backup and restore Gremlin and SPARQL
RDF graph models

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Siemens is a global powerhouse focusing on the areas
of electrification, automation, and digitalization

Challenge:
They were faced with isolated data silos from different
departments that resulted in data inaccessibility,
inefficient workflows, and low data quality

Solution:
• Industrial knowledge graphs for capturing
Siemens Domain Knowledge
• Providing knowledge graphs as a service

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Ledgers and audit tables have
been around for a while

Banking and finance Manufacturing Ownership


Keeping track of transactions, Recording components Maintaining records of asset
trades, and accounts used in manufacturing ownership or provenance

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Quantum Ledger Database
Fully managed ledger database
Track and verify history of all changes made to your application’s data

Immutable and transparent Cryptographically Highly scalable Easy to use


verifiable

Append-only, immutable All changes are Executes 2 – 3X as many Flexible document model,
journal tracks history of all cryptographically transactions than ledgers in query with familiar SQL-
changes which cannot be chained and verifiable common blockchain like interface
deleted or modified. Get frameworks
full visibility into entire
data lineage

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
BMW
Digital Vehicle Passport
Challenge
BMW needs to track trusted, verifiable
automotive data in order to get full
transparency on transactions across
multiple entities.

Solution
Started building a BMW Digital Vehicle
Passport App that provides a transparent and
complete history of vehicle data such as
fueling, inspection, oil changes, diagnostics,
repairs, tire changes, and sales across multiple
partners. Amazon QLDB is at the core of this
solution and provides BMW the verified data
with a centralized trust

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Time series data

Time-series data is
a sequence of data
points recorded over
a time interval for
measuring events that
change over time

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Timestream
Fast, scalable, fully managed time series database

1,000x faster and 1/10th the Trillions of


cost of relational databases daily events Time-series analytics Serverless

Collect data at the rate of Adaptive query processing Built-in functions for Automated setup,
millions of inserts per engine maintains steady, interpolation, smoothing, configuration, server
second (10M/second) predictable performance and approximation provisioning, software patching

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Customer use case

Trimble Inc., is a leading technology


provider of productivity solutions for the
construction, agriculture, geospatial, and
transportation industries.

“Whenever possible, the Trimble Cloud Platform


team tries to leverage AWS’ managed service
offerings. We are excited to now use Amazon
Timestream as a serverless time series database
supporting our IoT monitoring solution.
Timestream is purpose-built for our IoT-
generated time series data, and will allow us to
reduce management overhead, improve
performance, and reduce cost for our existing
monitoring system.” David Kohler, Engineering
Director – Trimble
© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Conclusion

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Our approach

Architect services Offer a portfolio Help you innovate Provide services that
ground-up for the of purpose-built faster through help you migrate existing
cloud and for the services, optimized managed services apps and databases
explosion of data for your workloads to the cloud

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Accelerate modernization with continuous learning
Training created by the experts at AWS

Free digital courses, including: Earn an industry-recognized credential:


Architecting Serverless Solutions AWS Certified Developer – Associate
Getting started with DevOps on AWS AWS Certified DevOps – Professional

Hands-on classroom training Create a self-paced learning roadmap


(available virtually) including: AWS Developer Ramp-up guide
Running Containers on Amazon Elastic AWS DevOps Ramp-up guide
Kubernetes Service (Amazon EKS)
Advanced Developing on AWS

Learn more about


Take Developer
training for you and your
and DevOps
team
training today

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you!

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.

You might also like