
Cloud Architecture Best Practices:

Using the Right Tools

KENTIK WHITEPAPER
Building a cloud application is somewhat like building a house. You want a design that’s unique and
beautiful, but you also want to follow best practices to make sure your infrastructure is correctly built
— e.g. the foundation and framing are solid; plumbing and electrical systems are properly planned; and
the roofing, windows, and doors are all installed in the right way. In the construction industry, principles
to ensure a building is constructed effectively are documented in the International Building Code. In
the realm of cloud infrastructure, Amazon has pioneered the AWS Well-Architected Framework. The
Framework is something like a building code, established as a set of best practices for cloud architecture.

If you’re building a cloud application, the Well-Architected Framework is a great place to start. It’s also a
reminder that if you want a secure, high-performing, resilient, and efficient infrastructure, you must have
the right tools in place. Otherwise, many of the cost, simplicity, and agility benefits of cloud applications
won’t be achieved. That’s why we created this white paper. In it, we will:

• Outline the AWS Well-Architected Framework: Principles, “Five Pillars,” and best practices
• Discuss how the Framework applies to networking — a critical part of cloud architecture
• Highlight how Kentik can help you effectively implement these best practices

What is the AWS Well-Architected Framework?


The AWS Well-Architected Framework is a series of design principles, best practice recommendations,
and questions to ask when designing, building, and evolving cloud architectures in order to consistently
deliver the best possible customer experience. The target audience for the Framework is CTOs, architects,
developers, and operations team members — all the stakeholders involved in organizations’ cloud
infrastructure.

One common mistake during cloud infrastructure development is failing to consider your organization’s
unique requirements (e.g. business priorities, budget constraints, expertise, etc.) and instead blindly
copying from best practice examples. The fact is, there are a variety of good solutions, even within the
same vertical or application type. You must factor in your organization’s real needs, review all available
options on the table, and customize a solution based on those facts. Using the AWS Framework, you can:

• Ask all the right questions with a checklist format so you won’t miss anything critical
• Understand the pros and cons of architectural options from industry experts
• Weigh the tradeoffs and make final decisions

The AWS Well-Architected Framework provides a checklist that cloud stakeholders can follow, in whole or
in part.

(We will use “the Framework” to refer to the “AWS Well-Architected Framework” for the rest of this article.)

© 2019 KENTIK • CLOUD ARCHITECTURE BEST PRACTICES
General Design Principles
A principle is a concept that guides a behavior or evaluation. For anything that involves complexity, it’s
important to understand some basic principles, which are the summarized knowledge and experiences
of expert practitioners. Therefore, the very first thing the Framework presents is a set of six high-level
principles that facilitate overall good design in the cloud:

• Stop guessing your capacity needs
• Test systems at production scale
• Automate to make architectural experimentation easier
• Allow for evolutionary architectures
• Drive architectures using data
• Improve through game days/simulations

In addition to these general principles, the Framework also breaks best practices down into Five Pillars,
with more specific principles for each.

The Five Pillars


The core of the Framework is the Five Pillars, which are the foundational building blocks of well-
architected cloud infrastructure: Operational Excellence, Security, Reliability, Performance Efficiency, and
Cost Optimization.

For each pillar, the Framework provides design principles, followed by a list of questions to ask in order
to implement the best practices. The Five Pillars are a starting point to build cloud applications using a
structured approach.

We will go through each pillar and discuss how Kentik can help in certain key areas.

The AWS Well-Architected Framework: Five Pillars


PILLAR 1 - OPERATIONAL EXCELLENCE: The ability to run and monitor systems to deliver business value
and to continually improve supporting processes and procedures.

PILLAR 2 - SECURITY: The ability to protect information, systems, and assets while delivering business
value through risk assessments and mitigation strategies.

PILLAR 3 - RELIABILITY: The ability of a system to recover from infrastructure or service disruptions,
dynamically acquire computing resources to meet demand, and mitigate disruptions such as
misconfigurations or transient network issues.

PILLAR 4 - PERFORMANCE EFFICIENCY: The ability to use computing resources efficiently to meet system
requirements, and to maintain that efficiency as demand changes and technologies evolve.

PILLAR 5 - COST OPTIMIZATION: The ability to avoid or eliminate unneeded cost or suboptimal resources.

PILLAR 1

Operational Excellence
The operational excellence pillar is about how to run and
monitor systems that deliver business value and how to
continuously improve supporting processes and procedures.

BEST PRACTICE QUESTIONS TO ASK

PREPARE
• How do you determine what your priorities are?
• How do you design your workload so that you can understand its state?
• How do you reduce defects, ease remediation, and improve flow into production?
• How do you mitigate deployment risks?
• How do you know that you are ready to support a workload?

OPERATE
• How do you understand the health of your workload?
• How do you understand the health of your operations?
• How do you manage workload and operations events?

EVOLVE
• How do you evolve operations?

How Kentik supports the OPERATIONAL EXCELLENCE pillar:
• The Kentik platform can deliver the insightful historical and real-time data that operations teams need
throughout the full cycle of Prepare → Operate → Evolve.
• Kentik provides network observability and raises operations teams’ situational awareness, increasing their
understanding of how operational choices impact workloads.
• Kentik provides the network metrics and views that allow ops teams to understand business and
customer needs and to better measure their achievements and business outcomes.
• Kentik enables operations as code. With Kentik’s “API first” philosophy, you can perform provisioning,
query actions, and many other operations as code.

Dashboard showing an ops view of cloud traffic trends and overview

PILLAR 2

Security
The security pillar is about protecting information,
systems, and assets while delivering business value
through risk assessment and mitigation strategies.

BEST PRACTICE QUESTIONS TO ASK

IDENTITY AND ACCESS MANAGEMENT
• How do you manage credentials and authentication?
• How do you control human access?
• How do you control programmatic access?

DETECTIVE CONTROLS
• How do you detect and investigate security events?
• How do you defend against emerging security threats?

INFRASTRUCTURE PROTECTION
• How do you protect your networks?
• How do you protect your compute resources?

DATA PROTECTION
• How do you classify your data?
• How do you protect your data at rest?
• How do you protect your data in transit?

INCIDENT RESPONSE
• How do you respond to an incident?

How Kentik supports the SECURITY pillar:
Kentik enables traceability, which is defined as “monitoring, alerting, and auditing actions and changes to your
environment in real time.” Kentik’s unique big data engine retains and baselines months of unsummarized
network traffic data and continuously compares historical traffic with current activity to detect behavioral
changes, such as those caused by potentially compromised instances. This detection is automated rather than
manual.
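
As an illustration of the baselining idea, the following minimal sketch flags a current measurement that deviates sharply from its own history. It is not Kentik’s detection logic, which this paper does not describe. The function name `is_anomalous`, the three-standard-deviation threshold, and the sample byte counts are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_anomalous(history, current, k=3.0):
    """Return True when `current` lies more than k standard deviations
    from the mean of `history` (a simple illustrative baseline test)."""
    if len(history) < 2:
        return False  # too little history to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) > k * sigma


# Hypothetical hourly outbound traffic (in MB) for a single instance.
baseline = [100, 110, 95, 105, 102, 98, 107, 101]

print(is_anomalous(baseline, 104))  # within the normal range
print(is_anomalous(baseline, 900))  # far outside the baseline
```

A production system would typically account for time-of-day and day-of-week patterns rather than using a single mean, but the principle of comparing current activity against retained history is the same.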

• To protect data in transit, Kentik can detect insecure protocols (e.g. discover applications using
HTTP instead of HTTPS).
• In order to keep people away from data, Kentik can detect if any unusual source or user is accessing
sensitive resources, such as databases.
• For detective controls, Kentik policies can compare past network behavior, including traffic type,
sources, and destinations, to identify emerging security threats.
• Kentik consumes VPC Flow Logs including ACCEPT & REJECT firewall actions. This information can be
used to understand whether the infrastructure is well protected.
• Kentik’s powerful ad hoc querying capability provides instant insight into what talks to what, enabling
fast and accurate incident response.
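
To make the VPC Flow Logs points above concrete, here is a minimal sketch that parses records in the default (version 2) flow log format, counts REJECT actions per source address, and lists accepted TCP flows to port 80 as a rough proxy for plaintext HTTP. It does not represent how Kentik ingests or analyzes flow logs. The sample records, the helper names, and the port-80 heuristic are assumptions made for the example.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Log record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def parse_record(line):
    """Split one space-separated flow log line into a dict; None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def summarize(lines):
    """Count REJECTed flows per source and collect accepted flows to TCP/80."""
    rejects = Counter()
    plaintext_http = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec["log_status"] != "OK":
            continue  # skip malformed lines and NODATA/SKIPDATA records
        if rec["action"] == "REJECT":
            rejects[rec["srcaddr"]] += 1
        elif rec["protocol"] == "6" and rec["dstport"] == "80":
            # Heuristic only: TCP (protocol 6) to port 80 suggests HTTP, not HTTPS.
            plaintext_http.append((rec["srcaddr"], rec["dstaddr"]))
    return rejects, plaintext_http


# Made-up sample records for illustration.
SAMPLE = [
    "2 123456789010 eni-0a1b2c3d 203.0.113.12 172.31.16.21 44321 22 6 4 240 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-0a1b2c3d 172.31.16.139 172.31.16.21 20641 80 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d 203.0.113.12 172.31.16.21 44322 3389 6 1 40 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-0a1b2c3d - - - - - - - 1431280876 1431280934 - NODATA",
]

rejects, plaintext_http = summarize(SAMPLE)
print(rejects.most_common())
print(plaintext_http)
```

A high reject count from a single external source, or any accepted port-80 flow to a service expected to use HTTPS, is the kind of signal worth investigating further.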

GEO view for rejected flows shows whether the global infrastructure is well protected

PILLAR 3

Reliability
The reliability pillar is about the ability of a system to recover
from infrastructure or service disruptions, dynamically acquire
computing resources to meet demand, and mitigate disruptions
such as misconfigurations or transient network issues.

BEST PRACTICE QUESTIONS TO ASK

FOUNDATIONS
• How do you manage service limits?
• How do you manage your network topology?

CHANGE MANAGEMENT
• How does your system adapt to changes in demand?
• How do you monitor your resources?
• How do you implement change?

FAILURE MANAGEMENT
• How do you back up data?
• How does your system withstand component failures?
• How do you test resilience?
• How do you plan for disaster recovery?

How Kentik supports the RELIABILITY pillar:
• Reliability depends on a well-planned foundation, monitoring, and mechanisms for handling change
and failure. Kentik proactively monitors for new applications consuming internet, cloud interconnect,
and inter-region VPC traffic, allowing teams to nip misrouted traffic in the bud before it causes
application performance, reliability, or cost problems.
• Kentik allows teams to better manage network topology. By understanding how traffic flows between
services in real time, teams can make better architecture decisions.
• Kentik eliminates the guesswork during incident response. By showing how traffic volume or
distribution has changed after an incident has started, teams can quickly understand whether other
services are affected and rapidly determine the root cause.
• During change management, teams can look at traffic flow before and after the change to verify
there was no impact (or an expected impact) on traffic flow.
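
The before-and-after comparison described above can be sketched in a few lines. The example below computes each destination port’s share of total traffic before and after a change and reports the ports whose share moved by more than a threshold. It is an illustration only, not a Kentik feature or API. The function names, the 10-point threshold, and the byte counts are assumptions.

```python
def traffic_shares(bytes_by_key):
    """Convert absolute byte counts into each key's fraction of the total."""
    total = sum(bytes_by_key.values())
    if total == 0:
        return {key: 0.0 for key in bytes_by_key}
    return {key: value / total for key, value in bytes_by_key.items()}


def share_shifts(before, after, threshold=0.10):
    """Return keys whose share of total traffic moved by more than `threshold`."""
    b, a = traffic_shares(before), traffic_shares(after)
    shifts = {}
    for key in set(b) | set(a):
        delta = a.get(key, 0.0) - b.get(key, 0.0)
        if abs(delta) > threshold:
            shifts[key] = round(delta, 3)
    return shifts


# Hypothetical bytes per destination port, before and after a deployment.
before = {443: 800, 5432: 150, 53: 50}
after = {443: 500, 5432: 450, 53: 50}

print(share_shifts(before, after))
```

Comparing shares rather than raw volumes keeps the check meaningful when overall traffic rises or falls for unrelated reasons, such as time of day.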

Real-time traffic for top destination ports eliminates the guesswork

PILLAR 4

Performance Efficiency
The performance efficiency pillar is about using computing
resources efficiently to meet system requirements and to maintain
that efficiency as demand changes and technologies evolve.

BEST PRACTICE QUESTIONS TO ASK

SELECTION
• How do you select the best performing architecture?
• How do you select your compute solution?
• How do you select your storage solution?
• How do you select your database solution?
• How do you configure your networking solution?

REVIEW
• How do you evolve your workload to take advantage of new releases?

MONITORING
• How do you monitor your resources to ensure they are performing as expected?

TRADEOFFS
• How do you use tradeoffs to improve performance?

How Kentik supports the PERFORMANCE EFFICIENCY pillar:
• To assure performance efficiency, Kentik can monitor service, application and network
performance and generate alarms so that you can remediate any issues before customers are
impacted.
• Different applications have different latency and throughput requirements. Kentik can measure
these metrics using enriched flow data to help cloud adopters choose the right network components
based on each application’s requirements.
• Kentik can also answer performance questions, allowing teams to monitor resources to ensure they
are performing as expected. Understanding the network behavior of cloud-deployed applications is
still a critical part of ensuring their availability and performance.
• Understanding performance at the level of individual microservices can pinpoint the spot where
changes are needed to optimize applications.

A Sankey graph of all communications among individual microservices in real time

PILLAR 5

Cost Optimization
The cost optimization pillar is about the ability to run
systems to deliver business value at the lowest price point.

BEST PRACTICE QUESTIONS TO ASK

EXPENDITURE AWARENESS
• How do you govern usage?
• How do you monitor usage and cost?
• How do you decommission resources?

COST-EFFECTIVE RESOURCES
• How do you evaluate cost when you select services?
• How do you meet cost targets when you select resource type and size?
• How do you use pricing models to reduce cost?
• How do you plan for data transfer charges?

MATCHING SUPPLY & DEMAND
• How do you match supply of resources with demand?

OPTIMIZING OVER TIME
• How do you evaluate new services?

How Kentik supports the COST OPTIMIZATION pillar:
• Kentik can break down major contributors to bandwidth costs — whether it’s internet egress, cloud
interconnects, or inter-region traffic. Expenditure Awareness is one of the critical best practices for
this pillar.
• It’s important to match the supply of resources with demand and make cost-aware architecture
decisions. With Kentik, teams can spot underutilized resources (e.g. you are paying for a 10 Gbps Direct
Connect link with only 1 Gbps average utilization). Identifying and remedying these situations can significantly
lower your cloud bill.
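
The underutilization example above comes down to simple arithmetic: 1 Gbps of average traffic on a 10 Gbps link is 10% utilization. The sketch below applies that check across several links. The link names, the traffic figures for the second link, and the 25% threshold are assumptions for illustration. In practice, peak utilization should be reviewed as well before downsizing a link.

```python
def utilization(avg_gbps, capacity_gbps):
    """Average utilization of a link as a fraction of its capacity."""
    if capacity_gbps <= 0:
        raise ValueError("capacity must be positive")
    return avg_gbps / capacity_gbps


def underutilized(links, threshold=0.25):
    """Return names of links whose average utilization is below `threshold`."""
    return [name for name, (avg, cap) in links.items()
            if utilization(avg, cap) < threshold]


# Hypothetical links: name -> (average Gbps, provisioned Gbps).
links = {
    "dx-primary": (1.0, 10.0),  # the 10 Gbps link with 1 Gbps average traffic
    "dx-backup": (0.6, 1.0),
}

print(utilization(1.0, 10.0))
print(underutilized(links))
```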

“Inter-Region Flows” view, used to associate cloud traffic with inter-region cost

Conclusion
The AWS Well-Architected Framework is a great starting point for cloud adopters to learn, measure,
and improve cloud infrastructure. These principles serve as cornerstones for delivering a consistent and
premium user experience. Understanding the Framework can help teams to think “cloud-natively,” remind
them to design architecture with a consistent approach, and provide the visibility to understand potential
impacts and risks.

Kentik enables all five pillars of the Framework with a powerful analytics platform that provides the data,
visualization, and insight that teams need to design well-architected applications and manage them for
successful outcomes.

For more information on how to leverage the data from your infrastructure and put it into application and
business context, read our cloud visibility solution brief, reach out to us, or sign up for a free trial.

References:
• AWS Well-Architected: https://aws.amazon.com/well-architected/
• AWS Well-Architected Framework (November 2018): https://d1.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf

FOR MODERN CLOUD VISIBILITY


Kentik is the modern analytics platform that understands the network behavior of your applications
and services across hybrid and multi-cloud environments. Kentik gives you the details and the big-
picture insight you need, at any scale, so that your cloud applications stay performant, reliable and
secure, and the promises of cloud infrastructure are delivered. For more information on our cloud
visibility offerings, download our Cloud Solution Brief or visit www.kentik.com.

Products from Kentik have patents pending in the US and elsewhere.
