Goals
• Perform basic monitoring with Amazon CloudWatch and Amazon CloudWatch Logs
• Explain how AWS features can be integrated with third-party monitoring solutions
• Demonstrate how AWS CloudTrail and AWS Config can complement each other
• Provide a brief introduction of Amazon GuardDuty
Get the Most, from the Best!!
Topics
Amazon CloudWatch
◦ Monitoring
◦ Events
◦ Logging
AWS CloudTrail
AWS Config
Amazon GuardDuty
[Figure: Amazon CloudWatch (Monitor, Analyze)]
CloudWatch Monitoring
Amazon CloudWatch dashboards enable you to create re-usable graphs and visualize
your cloud resources and applications in a unified view. You can graph metrics and
logs data side by side in a single dashboard to quickly get the context and go from
diagnosing the problem to understanding the root cause. For example, you can
visualize key metrics, like CPU utilization and memory, and compare them to capacity.
You can also correlate the log pattern of a specific metric and set alarms to be
proactively alerted about performance and operational issues. This gives you system-
wide visibility into operational health and the ability to quickly troubleshoot issues,
reducing Mean Time to Resolution (MTTR).
CloudWatch Logs
Amazon CloudWatch Logs lets you monitor and troubleshoot your systems and
applications using your existing system, application and custom log files. With
CloudWatch Logs, you can monitor your logs, in near real time, for specific phrases,
values or patterns. For example, you could set an alarm on the number of errors that
occur in your system logs or view graphs of latency of web requests from your
application logs. You can then view the original log data to see the source of the
problem. Log data can be stored and accessed indefinitely in highly durable, low-cost
storage so you don’t have to worry about filling up hard drives.
CloudWatch Events
Amazon CloudWatch Events delivers a near real-time stream of system events that
describe changes in Amazon Web Services (AWS) resources. Using simple rules that
you can quickly set up, you can match events and route them to one or more target
functions or streams. CloudWatch Events becomes aware of operational changes as
they occur. CloudWatch Events responds to these operational changes and takes
corrective action as necessary, by sending messages to respond to the environment,
activating functions, making changes, and capturing state information.
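CloudWatch provides a powerful mechanism for monitoring the state and utilization
of most of the resources you are managing under AWS. The key concepts of
CloudWatch, such as metrics, namespaces, dimensions, periods, and alarms, are
described in the sections that follow.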
Note that you can also read CloudWatch metrics programmatically using the CLI or
API, enabling you to build your own proactive alerting system using your own custom
scripts or applications.
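As a rough illustration of reading metrics through the API, the sketch below builds the arguments for a GetMetricStatistics request on the EC2 CPUUtilization metric. It assumes the boto3 SDK is installed and credentials are configured; the helper names and the one-hour window are illustrative choices, not part of the course material.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_stats_request(instance_id, hours=1, period=300, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cpu_stats(instance_id):
    """Send the request; needs boto3 and AWS credentials (assumed)."""
    import boto3  # imported lazily so the builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.get_metric_statistics(**build_cpu_stats_request(instance_id))
```

A custom alerting script could call fetch_cpu_stats on a schedule and compare the returned data points with its own thresholds.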
You can store and view the metrics you collect with the CloudWatch Agent in
CloudWatch just as you can with any other CloudWatch metrics. The default
namespace for metrics collected by the CloudWatch agent is CWAgent, although you
can specify a different namespace when you configure the agent. The logs collected
by the unified CloudWatch Agent are processed and stored in CloudWatch Logs.
[Figure: CPU Utilization (standard metric)]
CloudWatch alarms can notify users and downstream systems, typically through an
Amazon SNS topic, when a metric breaches its threshold.
CloudWatch Monitoring also offers integration with many 3rd party tools.
Metrics are grouped first by namespace, and then by the various dimension
combinations within each namespace. For example, you can view all Amazon EC2
metrics, Amazon EC2 metrics grouped by instance, or Amazon EC2 metrics grouped
by Auto Scaling group.
Only the AWS services that you're using send metrics to Amazon CloudWatch.
To view available metrics by namespace, dimension, or metric using the AWS CLI, use
the list-metrics command. For a list of all service namespaces, see AWS Namespaces.
For lists of the metrics and dimensions for each service, see Amazon CloudWatch
Metrics and Dimensions Reference.
You can publish your own metrics to CloudWatch using the AWS CLI, API or
CloudWatch Agent. You can view statistical graphs of your published metrics with the
AWS Management Console.
CloudWatch stores data about a metric as a series of data points. Each data point has
an associated time stamp. You can even publish an aggregated set of data points
called a statistic set.
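To make the idea of a statistic set concrete, here is a minimal sketch that pre-aggregates raw samples into the four values a statistic set carries. The dictionary follows the shape of one MetricData entry for the PutMetricData API; the metric name RequestLatency and the sample values are hypothetical.

```python
def to_statistic_set(metric_name, samples, unit="Milliseconds"):
    """Aggregate raw samples into one MetricData entry with StatisticValues."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "MetricName": metric_name,
        "StatisticValues": {
            "SampleCount": float(len(samples)),
            "Sum": float(sum(samples)),
            "Minimum": float(min(samples)),
            "Maximum": float(max(samples)),
        },
        "Unit": unit,
    }


# Hypothetical latency samples collected by an application over one minute.
datum = to_statistic_set("RequestLatency", [120, 80, 100, 300])
```

With boto3, an entry like this would be passed as MetricData=[datum] together with a Namespace argument to put_metric_data. Sending one aggregated entry instead of many individual data points reduces the number of API calls.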
Metric
◦ Name and Value
Namespace
◦ Group related metrics together
Dimensions
◦ Name/value pairs that further categorize metrics
◦ Example: InstanceId is a dimension of CPUUtilization
◦ Metric Name + Dimension = a new, unique metric
Period
◦ Specified time (in seconds) over which metric was collected
This is a partial list of all of the possible parameters for a CloudWatch metric. For
more detailed information on basic CloudWatch concepts, see the Developer Guide
at
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/cloudwatch_concepts.html.
Metrics
Metrics are the fundamental concept in CloudWatch. A metric represents a time-
ordered set of data points that are published to CloudWatch. Think of a metric as a
variable to monitor, and the data points represent the values of that variable over
time. For example, the CPU usage of a particular EC2 instance is one metric provided
by Amazon EC2. The data points themselves can come from any application or
business activity from which you collect data.
AWS services send metrics to CloudWatch, and you can send your own custom
metrics to CloudWatch. You can add the data points in any order, and at any rate you
choose. You can retrieve statistics about those data points as an ordered set of time-
series data.
Metrics exist only in the region in which they are created. Metrics cannot be deleted,
but they automatically expire after 15 months if no new data is published to them.
Data points older than 15 months expire on a rolling basis; as new data points come
in, data older than 15 months is dropped.
Metrics are uniquely defined by a name, a namespace, and zero or more dimensions.
Each data point has a time stamp, and (optionally) a unit of measure. When you
request statistics, the returned data stream is identified by namespace, metric name,
dimension, and (optionally) the unit.
Namespaces
A namespace is a container for CloudWatch metrics. Metrics in different namespaces
are isolated from each other, so that metrics from different applications are not
mistakenly aggregated into the same statistics.
There is no default namespace. You must specify a namespace for each data point
you publish to CloudWatch. You can specify a namespace name when you create a
metric. These names must contain valid XML characters, and be fewer than 256
characters in length. Possible characters are: alphanumeric characters (0-9A-Za-z),
period (.), hyphen (-), underscore (_), forward slash (/), hash (#), and colon (:).
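The naming rules above can be checked before publishing a metric. The sketch below encodes only the character set and length limit stated in this section; the function name is an assumption, and the service may enforce additional rules.

```python
import re

# Characters listed above: 0-9 A-Z a-z . - _ / # :
_NAMESPACE_RE = re.compile(r"[0-9A-Za-z.\-_/#:]+")


def is_valid_namespace(name):
    """Return True if name uses only the allowed characters and is under 256 long."""
    return len(name) < 256 and _NAMESPACE_RE.fullmatch(name) is not None
```

For example, is_valid_namespace("MyApp/Orders") passes, while a name containing a space does not.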
The AWS namespaces use the following naming convention: AWS/service. For
example, Amazon EC2 uses the AWS/EC2 namespace.
Dimensions
A dimension is a name/value pair that uniquely identifies a metric. You can assign up
to 10 dimensions to a metric.
Every metric has specific characteristics that describe it, and you can think of
dimensions as categories for those characteristics. Dimensions help you design a
structure for your statistics plan. Because dimensions are part of the unique identifier
for a metric, whenever you add a unique name/value pair to one of your metrics, you
are creating a new variation of that metric.
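One way to picture this is to treat a metric's identity as a key built from its namespace, name, and dimensions. The sketch below is only a mental model, not how CloudWatch stores metrics; the function name and the instance IDs are hypothetical.

```python
def metric_identity(namespace, name, dimensions=None):
    """Return a hashable key: namespace, metric name, and the set of dimensions."""
    return (namespace, name, frozenset((dimensions or {}).items()))


# The same metric name with and without a dimension is two distinct metrics.
fleet_cpu = metric_identity("AWS/EC2", "CPUUtilization")
one_instance_cpu = metric_identity(
    "AWS/EC2", "CPUUtilization", {"InstanceId": "i-0123456789abcdef0"}
)
```

Because the key includes every name/value pair, adding or changing a single dimension value yields a different key, which is why it counts as a new metric.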
AWS services that send data to CloudWatch attach dimensions to each metric. You
can use dimensions to filter the results that CloudWatch returns. For example, you
can get statistics for a specific EC2 instance by specifying the InstanceId dimension
when you search for metrics.
For metrics produced by certain AWS services, such as Amazon EC2, CloudWatch can
aggregate data across dimensions. For example, if you search for metrics in
the AWS/EC2 namespace but do not specify any dimensions, CloudWatch aggregates
all data for the specified metric to create the statistic that you requested. CloudWatch
does not aggregate across dimensions for your custom metrics.
Period
A period is the length of time associated with a specific Amazon CloudWatch statistic.
Each statistic represents an aggregation of the metrics data collected for a specified
period of time. Periods are defined in numbers of seconds, and valid values for period
are 1, 5, 10, 30, or any multiple of 60. For example, to specify a period of six minutes,
use 360 as the period value. You can adjust how the data is aggregated by varying the
length of the period. A period can be as short as one second or as long as one day
(86,400 seconds). The default value is 60 seconds.
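The period rules in this paragraph are easy to encode. The following sketch checks a candidate period against the values listed above (1, 5, 10, 30, or any multiple of 60, up to one day); the function names are illustrative.

```python
ONE_DAY_SECONDS = 86400


def is_valid_period(seconds):
    """Return True for 1, 5, 10, 30, or a multiple of 60, up to one day."""
    if not 1 <= seconds <= ONE_DAY_SECONDS:
        return False
    return seconds in (1, 5, 10, 30) or seconds % 60 == 0


def minutes_to_period(minutes):
    """Convert whole minutes to a period value in seconds."""
    return minutes * 60
```

For instance, minutes_to_period(6) gives 360, the six-minute example from the text.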
Alarm states: OK, ALARM, INSUFFICIENT_DATA
Note that “ALARM” is just a name given to the state, and does not necessarily signal
an emergency condition requiring immediate attention. It merely means that the
monitored metric has met the alarm's condition: it is equal to, greater than, or less
than a specified threshold value, depending on how the alarm is defined.
You could, for example, define an alarm that tells you when your CPUCreditBalance
for a given T2 instance is running low. You may process this notification
programmatically to suspend a CPU-intensive job on the instance until your T2 credit
balance is once again full.
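A sketch of that programmatic handling follows. It assumes the alarm publishes to an Amazon SNS topic and that the notification body is JSON containing NewStateValue and Trigger.MetricName fields; the function name and the suspend/resume actions are hypothetical.

```python
import json


def decide_job_action(sns_message):
    """Map a CloudWatch alarm notification (JSON text) to a job action."""
    alarm = json.loads(sns_message)
    if alarm.get("Trigger", {}).get("MetricName") != "CPUCreditBalance":
        return "ignore"  # not the alarm this handler cares about
    state = alarm.get("NewStateValue")
    if state == "ALARM":
        return "suspend"  # credit balance is low: pause the CPU-heavy job
    if state == "OK":
        return "resume"  # balance has recovered
    return "ignore"  # for example INSUFFICIENT_DATA
```

In practice this logic might run in an AWS Lambda function subscribed to the topic, with the returned action driving whatever job scheduler is in use.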
[Figure: metric value (Units) plotted over Time Periods against a Threshold Value of 3.
Callouts: "After 3 periods over threshold, an action is invoked" and "Only one period
over threshold, no action is invoked."]
In the figure, the alarm threshold is set to 3 and the minimum breach is 3
periods. That is, the alarm invokes its action only when the threshold is breached for
3 consecutive periods. In the figure, this happens with the third through fifth time
periods, and the alarm's state is set to ALARM. At period six, the value dips below the
threshold, and the state reverts to OK. Later, during the ninth time period, the
threshold is breached again, but not for the necessary three consecutive periods.
Consequently, the alarm's state remains OK.
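The evaluation rule described here can be simulated in a few lines. This is a simplified model (it ignores INSUFFICIENT_DATA and other evaluation options the service offers), and the data values are made up to mirror the figure's shape: a breach in periods three through five and a single breach in period nine.

```python
def alarm_states(values, threshold, periods_to_alarm):
    """Return the alarm state after each period for a 'greater than' alarm."""
    states = []
    for i in range(len(values)):
        window = values[max(0, i - periods_to_alarm + 1): i + 1]
        breached = len(window) == periods_to_alarm and all(
            v > threshold for v in window
        )
        states.append("ALARM" if breached else "OK")
    return states


# Hypothetical metric values for periods 1 through 10.
values = [1, 2, 4, 5, 6, 2, 1, 2, 5, 2]
states = alarm_states(values, threshold=3, periods_to_alarm=3)
```

Only the fifth entry of states is ALARM: that is the first point at which three consecutive periods are over the threshold. The single breach in period nine never accumulates three in a row, so the state stays OK.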
You can get aggregated views of the health and performance of all AWS resources
through CloudWatch Automatic Dashboards. This enables you to quickly get started
with monitoring, explore account and resource-based view of metrics and alarms,
and easily drill-down to understand the root cause of performance issues.
Automatic Dashboards are pre-built with AWS service recommended best practices,
remain resource aware, and dynamically update to reflect the latest state of
important performance metrics. You can now filter and troubleshoot to a specific
view without adding additional code to reflect the latest state of your AWS resources.
Once you have identified the root cause of a performance issue, you can quickly act
by going directly to the AWS resource.
The upper left shows a list of AWS services you use in your account, along with the
state of alarms in those services. The upper right shows two or four alarms in your
account, depending on how many AWS services you use. The alarms shown are those
in the ALARM state or those that most recently changed state.
These upper areas enable you to assess the health of your AWS services, by seeing
the alarm states in every service and the alarms that most recently changed state.
This helps you monitor and quickly diagnose issues.
Below these areas is the custom dashboard that you have created and
named CloudWatch-Default, if any. This is a convenient way for you to add metrics
about your own custom services or applications to the overview page, or to bring
forward additional key metrics from AWS services that you most want to monitor.
If you use six or more AWS services, below the default dashboard is a link to the
automatic cross-service dashboard. The cross-service dashboard automatically
displays key metrics from every AWS service you use, without requiring you to choose
what metrics to monitor or create custom dashboards. You can also use it to drill
down to any AWS service and see even more key metrics for that service. If you use
fewer than six AWS services, the cross-service dashboard is shown automatically on
this page.
From this overview, you can focus your view to a specific resource group or a specific
AWS service. This enables you to narrow your view to a subset of resources in which
you are interested. Using resource groups enables you to use tags to organize
projects, focus on a subset of your architecture, or just distinguish between your
production and development environments.
By default, your instance is enabled for basic monitoring with data available
automatically in 5-minute periods as part of the free tier.
You also have the option of enabling detailed monitoring. After you enable detailed
monitoring, the Amazon EC2 console displays monitoring graphs with a 1-minute
period for the instance.
Amazon CloudWatch Events delivers a near real-time stream of system events that
describe changes in Amazon Web Services (AWS) resources. Using simple rules that
you can quickly set up, you can match events and route them to one or more target
functions or streams. CloudWatch Events becomes aware of operational changes as
they occur. CloudWatch Events responds to these operational changes and takes
corrective action as necessary, by sending messages to respond to the environment,
activating functions, making changes, and capturing state information.
You can also use CloudWatch Events to schedule automated actions that self-trigger
at certain times using cron or rate expressions. For more information, see Schedule
Expressions for Rules.
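As a small illustration of rate expressions, the helper below formats one from a value and a unit. It relies on the rule that a value of 1 takes a singular unit and any larger value takes the plural; the function name is an assumption.

```python
def rate_expression(value, unit):
    """Format a schedule rate expression such as 'rate(5 minutes)'."""
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour', or 'day'")
    if not isinstance(value, int) or value < 1:
        raise ValueError("value must be a positive integer")
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"
```

The resulting string is what a scheduled rule would carry as its schedule expression.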
Before you begin using CloudWatch Events, you should understand the following
concepts:
Events—An event indicates a change in your AWS environment. AWS resources can
generate events when their state changes. For example, Amazon EC2 generates an
event when the state of an EC2 instance changes from pending to running, and
Amazon EC2 Auto Scaling generates events when it launches or terminates instances.
AWS CloudTrail publishes events when you make API calls. You can generate custom
application-level events and publish them to CloudWatch Events. You can also set up
scheduled events that are generated on a periodic basis. For a list of services that
generate events, and sample events from each service, see CloudWatch Events Event
Examples From Each Supported Service.
Targets—A target processes events. Targets can include Amazon EC2 instances, AWS
Lambda functions, Kinesis streams, Amazon ECS tasks, Step Functions state machines,
Amazon SNS topics, Amazon SQS queues, and built-in targets. A target receives
events in JSON format.
Rules—A rule matches incoming events and routes them to targets for processing. A
single rule can route to multiple targets, all of which are processed in parallel. Rules
are not processed in a particular order. This enables different parts of an organization
to look for and process the events that are of interest to them. A rule can customize
the JSON sent to the target, by passing only certain parts or by overwriting it with a
constant.
In this example, a CloudWatch Events rule is created so that every time a new
instance is launched, an AWS Systems Manager (SSM) Run Command script is executed
on the instance.
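To show how a rule decides whether an event is of interest, the sketch below implements a small subset of event-pattern matching: every field named in the pattern must be present in the event, and its value must be one of the listed values. The real service supports more matching options than this; the matcher itself and the sample instance ID are illustrative.

```python
def matches(pattern, event):
    """Return True if every field in pattern matches the event (subset only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Pattern for EC2 instances entering the running state.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}

# Abbreviated sample event with a hypothetical instance ID.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
```

Here matches(pattern, event) is true, so a rule with this pattern would forward the event to its targets, such as a Run Command script.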
1. Configure
Decide what information to save for each service. For custom services, choose file
format, delimiters, and fields.
2. Collect
Upload log files from disparate cloud resources into a central location (Amazon S3,
Amazon EMR, third-party solution).
3. Analyze
Examine your data using analytics tools, like CloudWatch Logs Insights, to report on
errors, investigate problem reports, new release monitoring, and trend measurement.
The process of log analysis can be thought of as having three distinct phases:
1. Configure: In the Configure stage, you decide what information you need to
capture in your logs, and where and how it will be stored. Log information is
typically saved in a plain text format, with distinct fields in an entry separated by a
delimiter (spaces, tabs, commas, etc.). At this point, you also need to decide on
the format of each field. For example, what is the canonical format of dates in
your log file? Have you taken care to ensure that it is consistent from field to field,
and across all server instances?
2. Collect: As we discussed in detail in the last module, instances come and go in a
cloud environment; you need a strategy for periodically uploading a server’s log
files so that this valuable information is not lost when an instance is eventually
terminated.
3. Analyze: After all of the data is collected, it’s time to analyze. Using log data gives
you greater visibility into the daily health of your systems. It can also provide
information on upcoming trends in customer behavior, and insight into how
customers are currently using your system. You can also use CloudWatch Logs
Insights to analyze your logs.
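For the Configure phase, one possible convention is a tab-delimited entry with a single canonical timestamp format. The sketch below uses ISO 8601 in UTC; the field layout and function names are assumptions chosen for illustration, not a format the course prescribes.

```python
from datetime import datetime, timezone

FIELDS = ("timestamp", "level", "message")


def format_entry(when, level, message):
    """Format one tab-delimited log entry with a UTC ISO 8601 timestamp."""
    # A naive datetime is interpreted as local time by astimezone().
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Tabs inside the message would break the delimiter, so replace them.
    return "\t".join((stamp, level, message.replace("\t", " ")))


def parse_entry(line):
    """Split an entry produced by format_entry back into named fields."""
    return dict(zip(FIELDS, line.rstrip("\n").split("\t", 2)))
```

Using one formatter on every server keeps dates consistent across instances, which simplifies the Analyze phase.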
Back in the Dark Ages (that is, around July 2014), the only way to push log files off of
an Amazon EC2 instance was to write custom scripts and scheduling jobs that
uploaded these logs to Amazon S3 or a third party service. The CloudWatch Logs
feature simplifies this process by providing the ability to automatically upload any
logs from an Amazon EC2 instance to AWS. You can then configure filter patterns to
generate custom CloudWatch metrics that measure the number of occurrences of a
particular string or set of strings within a log group. For example, you can screen for a
specific HTTP error code, count the number of occurrences of a single line or item
within a line, or look for a specific element within a fixed position in a space-
delimited string.
With CloudWatch, you can aggregate data from a number of different Amazon EC2
instances into what are called log groups. Each log group should represent a specific
type of log with a set format. The individual reports of logs coming in from instances
are referred to as log streams. A program called the CloudWatch Logs Agent sits on
each instance, gathers data from application logs, and sends it to the appropriate log
group. An administrator can then create filters on a log group to look for specific
strings. Each match is assigned a numeric value, which is then used to increment a
custom CloudWatch metric. Administrators can then use that metric like they would
any other custom CloudWatch metric – create alarms, send notifications, etc.
[Figure: log streams feed the HttpAccessLog log group; a filter increments the
404Count custom CloudWatch metric, which can generate an IT support alert or
trouble ticket.]
Example:
Apache httpd logs configured using substitution string in httpd.conf
Result:
A space-delimited string containing information on each HTTP/HTTPS
request
Case-sensitive
Filter pattern terms are ANDed together
Designate complex conditions by defining the field patterns in the
awslogs.conf log file
Example:
Create a metric for all results with the string html anywhere in the
request, and any HTTP 400 (client) error
Filter pattern terms are ANDed together. If you have a filter pattern of “ERROR
Exception”, it will find all log entries with both “ERROR” and “Exception” anywhere in
the string.
To perform more complex tasks, you can define all of the fields of a space-delimited
log entry using brackets, and define tests for each bracket. Tests may use any of the
standard comparison operators (>, <, >=, <=, =, and !=). Tests may also use an asterisk (*)
to specify a block of arbitrary content before, after, or in the middle of the field.
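The two matching styles can be emulated locally to get a feel for them. The sketch below is an approximation written for illustration, not the CloudWatch Logs matching engine: matches_terms ANDs case-sensitive terms, and matches_fields applies wildcard tests (such as request = *html* and status_code = 4*) to named fields of an access-log line. The field names and the regular expression for the log layout are assumptions.

```python
import re
from fnmatch import fnmatchcase

# Assumed layout: ip id user [timestamp] "request" status_code size
ACCESS_LINE = re.compile(r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')
FIELDS = ("ip", "id", "user", "timestamp", "request", "status_code", "size")


def matches_terms(line, pattern):
    """True if every space-separated term occurs in the line (case-sensitive)."""
    return all(term in line for term in pattern.split())


def matches_fields(line, tests):
    """True if each named field matches its wildcard test, e.g. {'status_code': '4*'}."""
    m = ACCESS_LINE.fullmatch(line.strip())
    if m is None:
        return False
    entry = dict(zip(FIELDS, m.groups()))
    # fnmatchcase also gives '?' and '[' special meaning; this sketch uses only '*'.
    return all(fnmatchcase(entry[name], glob) for name, glob in tests.items())
```

Counting the lines for which matches_fields returns true is the local analogue of a metric filter incrementing a custom metric.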
AWS CloudTrail is an AWS service that generates logs of calls to the AWS API. Because
the AWS API underlies both the Command Line Interface (CLI) and the AWS
Management Console, AWS CloudTrail can record all activity against the services it
monitors.
AWS CloudTrail can help you answer questions requiring detailed analysis.
Using AWS CloudTrail, you can store logs on API usage in an Amazon S3 bucket, and
then later analyze those logs to answer a number of compelling questions, such as:
• Why was a long running instance terminated, and who terminated it?
(Organizational traceability and accountability)
• Who changed a security group configuration? (Accountability and security
auditing)
• Is any activity coming from an unknown IP address range? (Potential external
attack against the public facing network)
• What activities were denied due to lack of permissions? (Potential internal or
external attack against the network)
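Questions like these map to simple filters over CloudTrail records once the log files have been downloaded and parsed. The sketch below assumes records is a list of dictionaries with the usual eventName, userIdentity, sourceIPAddress, and errorCode fields; the helper names, the two error codes, and the known-network list passed in are illustrative.

```python
import ipaddress

# Example error codes that indicate a permissions failure (not exhaustive).
DENIED_CODES = {"AccessDenied", "Client.UnauthorizedOperation"}


def who_called(records, event_name):
    """Return (time, caller ARN) for each call to the named API action."""
    return [
        (r.get("eventTime"), r.get("userIdentity", {}).get("arn"))
        for r in records
        if r.get("eventName") == event_name
    ]


def denied_calls(records):
    """Return records rejected for lack of permissions."""
    return [r for r in records if r.get("errorCode") in DENIED_CODES]


def from_unknown_ips(records, known_networks):
    """Return records whose source IP is outside every known network."""
    unknown = []
    for r in records:
        try:
            addr = ipaddress.ip_address(r.get("sourceIPAddress", ""))
        except ValueError:
            continue  # the field can hold a service name instead of an IP
        if not any(addr in net for net in known_networks):
            unknown.append(r)
    return unknown
```

For example, who_called(records, "TerminateInstances") answers the first question, and from_unknown_ips(records, [ipaddress.ip_network("10.0.0.0/8")]) flags calls from outside a hypothetical corporate range.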
You can optionally configure the following settings when you create or update a trail
with the CloudTrail console or the AWS Command Line Interface (AWS CLI). Both
methods follow the same steps:
• Start by creating a trail. By default, when you create a trail in a region in the
CloudTrail console, the trail applies to all regions.
• Create an Amazon S3 bucket or specify an existing bucket where you want the log
files delivered. By default, log files from all regions in your account are delivered to
the bucket that you specify.
• Configure your trail to log read-only, write-only, or all management and data
events. By default, trails log all management events.
• Create an Amazon SNS topic to receive notifications when log files are delivered.
Delivery notifications from all regions are sent to the topic that you specify.
• Configure CloudWatch Logs to receive your logs from CloudTrail so that you can
monitor for specific log events.
• Turn on log file encryption. This encrypts your files for added security.
• Turn on integrity validation for log files. This enables the delivery of digest files that
you can use to validate the integrity of log files after CloudTrail has delivered them.
• Add tags (custom key-value pairs) to your trail.
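The settings listed above correspond to parameters of the CloudTrail CreateTrail API. The following sketch collects them into a request, assuming the boto3 SDK; the helper names are illustrative, and any trail, bucket, topic, or key names passed in are the caller's own.

```python
def build_trail_request(name, bucket, sns_topic=None, kms_key_id=None, tags=None):
    """Build keyword arguments for CloudTrail CreateTrail."""
    request = {
        "Name": name,
        "S3BucketName": bucket,
        "IsMultiRegionTrail": True,       # apply the trail to all regions
        "EnableLogFileValidation": True,  # deliver digest files
    }
    if sns_topic:
        request["SnsTopicName"] = sns_topic  # delivery notifications
    if kms_key_id:
        request["KmsKeyId"] = kms_key_id     # log file encryption
    if tags:
        request["TagsList"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return request


def create_trail(**settings):
    """Create the trail and start logging; needs boto3 and credentials (assumed)."""
    import boto3  # imported lazily so the builder works without boto3

    cloudtrail = boto3.client("cloudtrail")
    request = build_trail_request(**settings)
    cloudtrail.create_trail(**request)
    cloudtrail.start_logging(Name=request["Name"])
```

Keeping the request builder separate from the API call makes the chosen settings easy to review before anything is created.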
{ "eventVersion" : "1.01",
"userIdentity" : {
"type" : "IAMUser",
"principalId" : "AIDAyyyyyyyyyyyyyyyy",
"arn" : "arn:aws:iam::xxxxxxxxxxxx:user/tests3user",
"accountId" : "xxxxxxxxxxxx",
"userName" : "tests3user"
},
"eventTime" : "2018-09-23T22:41:38Z",
"eventSource" : "signin.amazonaws.com",
"eventName" : "ConsoleLogin",
"awsRegion" : "us-east-1",
"sourceIPAddress" : "54.240.217.10",
"userAgent" : "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0",
The text on this slide shows the head of a typical CloudTrail log—in this case, a login
request. Note that the request fully identifies the user, the time of the event, the
source of the request, user agent string, and the event that occurred (ConsoleLogin).
"responseElements" : {
"ConsoleLogin" : "Success"
},
"additionalEventData" : {
"MobileVersion" : "No",
"LoginTo" : "https://console.aws.amazon.com/console/home?state\u003dhashArgs%23\u0026isauthcode\u003dtrue",
"MFAUsed" : "No"
},
"eventID" : "b31716e9-13e5-4fb5-9c60-1c723c59f5a6"
}
The text on this slide is a continuation of the previous event that displays a successful
outcome—i.e., the user was able to log in successfully.
The text on this slide shows the bottom portion of a log that captures an unsuccessful
login attempt for the same user. An errorMessage field is included with all the failed
events.
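A short sketch of spotting such failures in parsed CloudTrail records follows. It treats a ConsoleLogin event as failed when its responseElements outcome is anything other than "Success" or when an errorMessage field is present; the function name is an assumption.

```python
from collections import Counter


def failed_console_logins(records):
    """Count failed ConsoleLogin events per source IP address."""
    counts = Counter()
    for record in records:
        if record.get("eventName") != "ConsoleLogin":
            continue
        outcome = (record.get("responseElements") or {}).get("ConsoleLogin")
        if outcome != "Success" or "errorMessage" in record:
            counts[record.get("sourceIPAddress", "unknown")] += 1
    return counts
```

Sorting the result with most_common() surfaces the addresses with the most failed attempts.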
Examples:
◦ Failed AWS Management Console sign-in attempts; sign-ins from
suspicious IPs
◦ Unauthorized access to services using the API
◦ Suspicious launches of resources
AWS Config is a fully managed service that provides you with an AWS resource
inventory, configuration history, and configuration change notifications to enable
security and governance. Config Rules enables you to create rules that automatically
check the configuration of AWS resources recorded by AWS Config.
With AWS Config, you can discover existing and deleted AWS resources, determine
your overall compliance against rules, and dive into configuration details of a
resource at any point in time. These capabilities enable compliance auditing, security
analysis, resource change tracking, and troubleshooting.
AWS Config Rules is a new set of cloud governance capabilities that allow IT
Administrators to define guidelines for provisioning and configuring AWS resources
and then continuously monitor compliance with those guidelines. AWS Config Rules
lets you choose from a set of pre-built rules based on common AWS best practices or
custom rules that you define.
For example, you can use rules to ensure that Amazon EBS volumes are encrypted, EC2
instances are properly tagged, and Elastic IP addresses (EIPs) are attached to
instances. AWS Config Rules can continuously monitor configuration changes to your
AWS resources and provide a new dashboard to track compliance status. Using Config
Rules, an IT Administrator can quickly determine when and how a resource went out of
compliance.
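The core of a custom rule is a function that inspects a recorded configuration and returns a compliance verdict. The sketch below checks EBS volume encryption; it assumes the configuration item carries a resourceType field and an encrypted flag under configuration, and it omits the surrounding Lambda and reporting code a real custom rule needs.

```python
def evaluate_volume_encryption(configuration_item):
    """Return a compliance verdict for one recorded resource configuration."""
    if configuration_item.get("resourceType") != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"  # this rule only covers EBS volumes
    encrypted = configuration_item.get("configuration", {}).get("encrypted", False)
    return "COMPLIANT" if encrypted else "NON_COMPLIANT"
```

A volume recorded with encrypted set to false would be reported as NON_COMPLIANT, which is the kind of result the compliance dashboard tracks.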
AWS Config is integrated with AWS CloudTrail. CloudTrail captures all API calls from
the AWS Config console or from the AWS Config API. Using the information collected
by CloudTrail, you can determine what request was made to AWS Config, the source
IP address from which the request was made, who made the request, when it was
made, and so on.
Cloud security at AWS is the highest priority. As an AWS customer, you will benefit
from a data center and network architecture built to meet the requirements of the
most security-sensitive organizations.
An advantage of the AWS cloud is that it allows customers to scale and innovate,
while maintaining a secure environment. Customers pay only for the services they
use, meaning that you can have the security you need, but without the upfront
expenses, and at a lower cost than in an on-premises environment.
Amazon GuardDuty threat detection identifies activity that can be associated with
account compromise, instance compromise, and malicious reconnaissance. For
example, GuardDuty detects unusual API calls, suspicious outbound communications
to known malicious IP addresses, or possible data theft using DNS queries as the
transport mechanism. GuardDuty delivers more accurate findings using machine
learning enriched by threat intelligence, such as lists of malicious IPs and domains.
With a few clicks in the AWS Management Console, Amazon GuardDuty can be
enabled and customers can have a more intelligent and cost-effective option for
threat detection in the AWS Cloud.
Account
Data source
Finding
Trusted IP list
Threat list
Account is a standard AWS account that contains your AWS resources. You can sign in
to AWS with your account and enable GuardDuty.
You can also invite other accounts to enable GuardDuty and become associated with
your AWS account in GuardDuty. If your invitations are accepted, your account is
designated as the master GuardDuty account, and the added accounts become
your member accounts. You can then view and manage those accounts' GuardDuty
findings on their behalf.
Data source is the origin or location of a set of data. To detect unauthorized and
unexpected activity in your AWS environment, GuardDuty analyzes and processes
data from the following data sources:
• AWS CloudTrail event logs
• VPC Flow Logs
• DNS logs
Enable GuardDuty: With a few clicks in the console, monitor your AWS accounts
without additional security software or infrastructure to deploy or manage.
Leverage actionable alerts: Review detailed findings in the console, integrate into
event management or workflow systems, or trigger AWS Lambda for automated
remediation or prevention.
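As an illustration of the Lambda integration, the handler below triages a GuardDuty finding delivered as an event. It assumes the finding arrives under the event's detail key with id, type, and severity fields; the severity cut-off of 7.0 and the action names are hypothetical choices for this sketch.

```python
HIGH_SEVERITY = 7.0  # hypothetical cut-off for automated remediation


def lambda_handler(event, context):
    """Decide what to do with one GuardDuty finding event."""
    detail = event.get("detail", {})
    severity = float(detail.get("severity", 0))
    action = "remediate" if severity >= HIGH_SEVERITY else "notify"
    return {
        "findingId": detail.get("id"),
        "type": detail.get("type"),
        "action": action,
    }
```

A real handler would replace the returned action with actual work, such as isolating an instance or posting to a ticketing system.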