
DEV302

Monitor All Your Things: Amazon CloudWatch in Action with BBC
Brian Dennehy, Director of Engineering, AWS
Christopher Darlaston, Development Lead, BBC

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Monitoring matters because …

Operational:
• Visibility
• Real-time troubleshooting

Business:
• #Customer experience
• Applications = $$

• Monolithic to microservice
• Short-lived resources
• Full stack visibility
• ^Devices
• ^Data
• Faster release velocity
• Highly scalable
• Cloud native defaults
• Single solution for metrics and logs
• Monitor with automation

Metrics, Logs, Events, Alarms, Agent & APIs, Dashboards
Collect Monitor Act Analyze

and Log analytics

Christopher Darlaston—BBC

• Development lead in interactive TV
• Seven years in interactive TV on BBC iPlayer, Sport, News and Frameworks
• Previous 13 years working at Sun Microsystems in their web teams

BBC Interactive TV overview

Giving users access to additional TV programming.

Press the red button on your TV remote control to enjoy additional coverage from the big events:

• Glastonbury Festival (Music)
• Wimbledon (Tennis, Grand Slam)
• Olympic Games

Simplified architecture—Unconnected Red Button

[Architecture diagram]
Stages shown: Carousel Creation, Carousel Storage, Carousel Injection
AWS services shown: Amazon EC2, Amazon DynamoDB, Amazon Kinesis, AWS Lambda, Amazon EFS, Amazon S3, AWS Direct Connect, Amazon CloudWatch
Other labels: Private, Public, Main, Data Playout
Collecting metrics and logs via CloudWatch agent
{
  "metrics": {
    "aggregation_dimensions": [ ["AutoScalingGroupName", "InstanceId"], ["AutoScalingGroupName"] ],
    "append_dimensions": { "InstanceId": "${aws:InstanceId}", "AutoScalingGroupName": "${aws:AutoScalingGroupName}" },
    "metrics_collected": {
      "mem": { "measurement": ["mem_used", "mem_cached", "mem_used_percent", "mem_available_percent"] },
      "processes": { "measurement": ["running", "sleeping", "dead"] },
      "disk": { "resources": ["/"], "measurement": ["free", "used_percent"] },
      "netstat": { "measurement": ["tcp_established"] },
      "cpu": { "totalcpu": false, "resources": ["*"], "measurement": ["cpu_usage_iowait", "cpu_usage_idle", "cpu_usage_nice"] }
    },
    "namespace": "live-broadcast-red-button-linkmanager-api"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [{
          "file_path": "/var/log/broadcast-red-button-linkmanager-api/output.log",
          "log_group_name": "live-broadcast-red-button-linkmanager-api-infrastructure-ApplicationLog-J8FGOWKDFOE8",
          "log_stream_name": "{instance_id}-{ip_address}-output.log"
        }]
      }
    },
    "log_stream_name": "{instance_id}-{hostname}"
  },
  "agent": { "logfile": "/var/log/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log", "metrics_collection_interval": 60 }
}
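
The agent reads this file as strict JSON, so typographic quotes picked up from slides or documents, or a missing section, will stop it from loading. The following is a minimal pre-deployment check sketched in Python. The helper name `check_agent_config` and the list of expected sections are assumptions made for illustration; they mirror the config on this slide and are not an official schema.

```python
import json

# Top-level sections used by the config on this slide (illustrative only,
# not an official list of required keys).
EXPECTED_SECTIONS = ("metrics", "logs", "agent")


def check_agent_config(text):
    """Return a list of problems found in a CloudWatch agent config string."""
    problems = []
    # Heuristic lint: curly quotes usually mean the text was pasted from a document.
    for ch in ("\u201c", "\u201d"):
        if ch in text:
            problems.append(f"typographic quote {ch!r} found; use plain double quotes")
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        problems.append(f"invalid JSON: {err.msg} at line {err.lineno}")
        return problems
    if not isinstance(config, dict):
        problems.append("top level must be a JSON object")
        return problems
    for section in EXPECTED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")
    return problems
```

An empty list means the text parsed and contains all three sections; it does not prove the agent will accept every key inside them.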
Collecting metrics from log extraction
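
The screenshot for this slide is not preserved in the text, so the filter the BBC team actually used is unknown. As a hedged sketch of the general technique, the Python below builds the arguments for a CloudWatch Logs metric filter that counts log events containing a term. The metric name `ErrorCount` and the pattern `ERROR` are hypothetical; the log group and namespace are taken from the agent config shown earlier in the deck.

```python
def build_metric_filter(log_group, namespace, metric_name, pattern):
    """Build keyword arguments for the CloudWatch Logs PutMetricFilter call."""
    return {
        "logGroupName": log_group,
        "filterName": f"{metric_name}-filter",
        "filterPattern": pattern,
        "metricTransformations": [{
            "metricName": metric_name,
            "metricNamespace": namespace,
            "metricValue": "1",    # publish 1 for each matching log event
            "defaultValue": 0.0,   # publish 0 when ingested events do not match
        }],
    }


params = build_metric_filter(
    log_group="live-broadcast-red-button-linkmanager-api-infrastructure-ApplicationLog-J8FGOWKDFOE8",
    namespace="live-broadcast-red-button-linkmanager-api",
    metric_name="ErrorCount",  # hypothetical metric name
    pattern="ERROR",           # hypothetical pattern: events containing ERROR
)

# With boto3 installed and AWS credentials configured, the filter could be created with:
#   boto3.client("logs").put_metric_filter(**params)
```

Keeping the parameter construction separate from the API call makes the filter definition easy to review and test without touching an AWS account.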

Monitoring view—Typical day

Alerting on issues using CloudWatch alarms
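
The alarm definitions shown on this slide are not preserved in the text. The sketch below illustrates one way an alarm could be defined on a metric that the agent config earlier in the deck collects (`mem_used_percent`, rolled up by `AutoScalingGroupName`). The threshold, evaluation window, Auto Scaling group name, and SNS topic ARN are all hypothetical values chosen for illustration.

```python
def build_memory_alarm(namespace, asg_name, threshold_percent, topic_arn):
    """Build keyword arguments for the CloudWatch PutMetricAlarm call."""
    return {
        "AlarmName": f"{asg_name}-mem-used-high",
        "Namespace": namespace,
        "MetricName": "mem_used_percent",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 60,               # seconds; matches the agent's 60 s collection interval
        "EvaluationPeriods": 3,     # alarm after three consecutive breaching periods
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


alarm = build_memory_alarm(
    namespace="live-broadcast-red-button-linkmanager-api",
    asg_name="example-asg",                                        # hypothetical
    threshold_percent=90,                                          # hypothetical
    topic_arn="arn:aws:sns:eu-west-1:123456789012:example-alerts"  # hypothetical
)

# With boto3 installed and AWS credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Requiring several consecutive breaching periods is a common way to avoid alerting on a single noisy datapoint; the right values depend on the service.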

Monitoring view—Day of trouble

Diagnosing: Is it downstream or on premise?

Diagnosing: Is it upstream of us?

Flexibility—Dashboard created during incident

Monitoring view—Full day

What did we learn?

• Split the problem space
• Log everything
• Do you have the right dashboards?

Why do we use CloudWatch?
“Our interactive services, just like picking up your phone and making a call, needs to just work at all times. We deliver journalistic content and news, which are fundamental services that our users expect in real-time and on-demand without failure.”

1. End-to-end visibility for on-premise and cloud
   Log analytics for both on-premise & Amazon Web Services (AWS)
2. Monitoring with automation
   Resource optimization, snapshot graphs
3. Correlate & investigate issues in real time
   CloudWatch agent & dashboards
4. More time back to focus on BBC innovation

What’s new

Reinvent & simplify: Lessons learned inform our future

NEW: CloudWatch Automatic Dashboards
CloudWatch simplifies infrastructure monitoring with a default getting-started experience

Dynamic, self-updating AWS infrastructure dashboards

Building operational dashboards takes time & experience

“I just want a quick, summary view …”

“I just want some default recommendations …”

“Oh, not all statistics and visualizations are created equal …”

“I create dashboards one by one and someone always forgets …”

• Automatic: Explore account & resource-based views of health and performance metrics
• Smart: Browse defaults with built-in AWS best practices, including metrics, statistics, and visualizations
• Dynamic: Auto-scrub metrics of resources that no longer exist to reduce stale views via resource-aware updates
• Granular: Easily drill down for troubleshooting with AWS or resource group filtering

Session key takeaways

• Collect everything with ease using defaults for building operational visibility
• Correlate metrics and logs for faster troubleshooting and understanding root cause
• Automate monitoring with new CloudWatch automated operational dashboards

What else is new:
• Metric Math alarms
• Logs Insights
• CloudWatch agent with collectd and StatsD integration
• Snapshot graphs
• Events support for AWS Organizations

More sessions:
• AWS booth for demos
• DEV375 “Amazon CloudWatch Logs Is Making an Exciting Announcement!”
• DEV311 “Breaking Observability Chaos: Best Practices to Monitor AWS Cloud Native Apps”
• DEV301R “AIOPs: Find Your Needle in the Haystack”
• DEV306R1 “Monitoring for Operational Outcomes and Application Insights: Best Practices Workshop”
• DEV303R “Instrumenting Kubernetes for Observability Using AWS X-Ray and Amazon CloudWatch”
• WIN202L “Leadership Session: Learn about 10 Years’ of Windows and .NET Innovation on AWS with 10 New Launches”
Thank you!
Brian Dennehy
Christopher Darlaston
