Architecting and Deploying Splunk 6.4
Listen to your data.
Copyright © 2015 Splunk, Inc. All rights reserved | 13 May 2016
Document Usage Guidelines
• Should be used only for enrolled students
• Not meant to be a self-paced document, an instructor is required
• Please do not distribute

Course Prerequisites
• Required
–  Using Splunk
–  Creating Splunk Knowledge Objects
–  Splunk Administration

• Strongly Recommended
–  Searching and Reporting with Splunk
–  Advanced Dashboards and Visualizations

Course Goals
• Understand best practices for architecting and documenting Splunk
Enterprise distributed deployments
• Provide a framework and methodology for Splunk deployments

Course Outline
1.  Introduction
2.  Initial Requirements Definition
3.  Apps and Index Design
4.  Infrastructure Planning
5.  Data Collection
6.  Forwarders and Deployment Management
7.  Data Comprehension
8.  Search Considerations
9.  Integration
10.  Operations and Management
Module 1:
Introduction

Module Objectives
• Introduce the Splunk deployment planning process and tools
• Review the structure of this course

Deployment Scaling
A deployment plan creates a solid foundation
• to scale Splunk deployments as they evolve
• to implement large, enterprise deployments
(Diagram: deployments typically grow from Download to Workgroup to Enterprise Deployment to Expansion)
–  Download: initial user, specific use case
–  Workgroup: specific users
–  Enterprise Deployment: enterprise standard, large number of users, many different users, many different use cases
–  Expansion: more sites, more geographies, more data sources, more data volume
Splunk Deployment Methodology
• Provides a structured path to implement Splunk in an enterprise
• Identifies the people and resources required, along with responsibilities
• Provides a task checklist and ordering
•  It consists of six phases:
1.  Infrastructure - ensure Splunk has the proper resources (servers, sizing, topology, disk, security, retention)
2.  Collection - ingest the correct data into Splunk
3.  Comprehension - tell Splunk about the data
4.  Querying - get information out of Splunk
5.  Integration - provide search results to a wider audience
6.  Operation - keep it operating, grow it, evolve it


What is in a deployment plan?
• A deployment plan should include
–  Deployment Goals
–  User Roles / Staffing List
–  Existing Topology Diagrams
ê  physical environment
ê  logging
–  Splunk Deployment Topology
–  Data Source Inventory
–  Data Policy Definition
–  Suggested Splunk Apps
–  Report Definitions
–  Education / Training Plan
–  Deployment Schedule

Module 1 Lab Exercise
Time: 10 minutes
• Task: Read the lab document for Module 1 (pages 1-4)
–  DeployingSplunk64_Labs.pdf
–  If you have difficulty seeing the graphics in the lab document, they also appear on
the following three pages
• Task: Download planning tools (optional)
–  Data Source inventory (data_inventory.rtf)
–  Index Planning and Sizing (data_sizing.xlsx)
–  Splunk Icon Library (splunk_icons.pptx)
–  Use Case checklist (use_case.pdf)

Current IT Environment

Current Logging Environment - Corporate

Current Logging Environment - CoLo

Module 2:
Initial Requirements
Definition

Module Objectives
• Identify the information that is needed for deployment decisions
• Provide lists and resources to aid in collecting requirements

Key Planning Information
• Identifying requirements is a first step in the deployment
–  More information will be uncovered throughout the phases of deployment

• Gather raw material for the deployment plan at the beginning, if possible
–  Overall goals for the deployment
–  Key users, including their goals and use cases
–  Current environment
ê  physical environment
ê  monitoring tools in use
–  Data sources

Overall Goals
• Who is the sponsor?
•  In addition to the sponsor, who else must be satisfied?
• What are the goals for the deployment?
• How will success be measured?

Use Cases
• What will the users do with Splunk?
–  What tasks will they perform?
–  What reports will they generate?

• What other use cases may apply?


• For ideas and inspiration:
–  Why Splunk? Customer success stories, whitepapers, etc.
http://www.splunk.com/view/SP-CAAAG2Q
–  Use Case Checklist handout
use_case.pdf
–  Splunk apps
http://splunkbase.splunk.com
Current IT Environment
• What is the current overall IT topology?
–  Data centers
–  Network zones
–  Number and type of servers
–  Location of users

• Obtain a network diagram


–  Are there security restrictions between data centers and network zones?
–  What is the bandwidth between data centers and network zones?

• Authentication
–  What system is in current use?

Current Logging Environment
• Are any logs or other data sources collected or centralized today?
–  Logged to SAN / NAS devices
–  Centralized via syslog, syslog-ng, rsyslog, Kiwi, Snare, etc.
–  Parsed and stored in a SQL database

• What tools are in use?


–  Are there log parsing / scraping tools or scripts in place?
–  Are there any query tools in place? Who uses each tool?
–  Are logs a source for a monitoring system?
–  What ticketing system is in place?

• Is there an expectation that Splunk will integrate with or replace existing tools?
General Requirements
• Security
–  What security policies may affect the collection, retention, and reporting of data?
–  What approvals will be needed?

• Regulatory
–  Are there laws, regulations, or policies that affect the data collection, reporting,
etc.?
• High Availability or Disaster Recovery
–  Is data replication required?

Data Sources
• Data Source Inventory
–  What is the superset of data sources needed by all users?
–  How much data is generated per day?

• Data Policy
–  How long should each data source be retained?
–  Who can see particular data elements?
–  What data needs protection against tampering?
–  What proof of integrity needs to be provided?
–  Will Splunk be the primary repository for data?

• Handout
–  data_inventory.rtf
Example Data Source Inventory
Data Source: Firewall logs
Location and Type: UDP:514 (options include file input (name and path), API, network port, device, etc.)
Ownership and Access: Network team owns; Security can also access
Data Volume
–  Average per day: 500 MB
–  Peak: 6000 EPS at ~100 bytes/event
Retention: 6 months
Format or sourcetype: syslog?
Collection method: How will this data get to the indexers? Is there a collection point at each location?
Module 2 Lab Exercise
Time: 15 minutes
• Task: Complete the abbreviated data source inventory
–  You may not know enough to supply answers for all data sources
–  Part of the inventory has been completed for you
ê  Do you agree with the completed portion?
ê  Do you have questions for the sponsor?

Module 3:
Apps and Index Design

Module Objectives
• Understand index implementation
• Design indexes
• Understand how to estimate storage requirements for indexes
• Identify relevant apps
–  Document impact on inputs and indexes

Splunk Indexing Review
• Optimized for quickly indexing and persisting unstructured data
• Minimal schema for persisted data
–  An event consists of raw text, timestamp, source, source type and host
–  This is called "rawdata" and is stored in a compressed file

• Once data enters Splunk, it is


–  Parsed for line-breaking, timestamps and other metadata fields
–  Persisted in its raw form
–  Indexed by the metadata fields, along with all the keywords in the raw event text
ê  Splunk uses an "inverted index" structure for the .tsidx files

Splunk Indexing Review (cont.)
• Additional processing on raw events is deferred until search time
• This serves 4 important goals
–  indexing speed is increased: minimal processing is performed during indexing
–  bringing new data into the system requires less effort: no schema planning is needed
–  the original data is persisted for easy inspection
–  the system is resilient to data change: parsing problems do not require reloading or re-indexing of the data

Disk Storage Details – Files
• As Splunk indexes your data, it creates two main types of files:
–  rawdata contains the original data in a compressed form
ê  for "syslog-like" data, this may be 10% to 15% of the raw pre-indexed data
ê  size is dependent on how well data compresses
–  Index files (.tsidx) which point to the unique terms
ê  these files range in size from approximately 10% to 110% of the rawdata size
ê  size is strongly affected by the number of unique terms in the data
ê  adding indexed field extractions will increase the size of the .tsidx files
• Buckets contain both rawdata and index files
–  Exception: replicated buckets may have only rawdata

Types of Index Files
•  Inverted index (*.tsidx)
•  Metadata (*.data)
–  Information about sources, sourcetypes and hosts of the events contained in each bucket
•  Bloom filters (bloomfilter)
–  Efficient data structure that authoritatively rules out buckets
ê  Provides 100% certainty that a search term is not present in a bucket
–  Consulted by every search
–  Very fast (1-2 IO)
https://en.wikipedia.org/wiki/Bloom_filter

Inverted Index
• Index data structure that maps from content (such as words or numbers) to its locations
en.wikipedia.org/wiki/Inverted_index
• An inverted index allows fast full text searches
• Splunk .tsidx uses an inverted index structure to map keywords to their locations in the rawdata
(Illustration: terms such as "red", "blue", "sweater", and "pants" map to the documents that contain them. Source: Wikipedia)
Splunk Index Files
(Diagram: layout of an index on the indexers. The home path holds hot/warm buckets (e.g. hot_v1_100), each containing the compressed rawdata, *.tsidx files whose lexicon and postings map keywords to events in the rawdata, and *.data metadata files for source, sourcetype, and host. Older buckets (db_<latest>_<earliest>_<id>) roll to the cold path; restored buckets live in the thawed path.)

Managing disk usage with tsidx reduction
• Full index files are normally kept with the rawdata until the bucket is frozen
• To save disk space, you can enable tsidx reduction in indexes.conf
–  Based on an age setting
ê  timePeriodInSecBeforeTsidxReduction = 7776000 (90 days)
–  Replaces some index files with mini-index files
–  Removes merged_lexicon.lex
–  Overall, the bucket size is typically reduced by 1/3 to 2/3

• Searches will take much longer over reduced buckets


ê  Searches for rare terms are particularly affected
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Reducetsidxdiskusage
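As a sketch only (the index name and paths are hypothetical; the 90-day threshold matches the setting above), tsidx reduction is enabled per index in indexes.conf:

# Hypothetical index stanza; buckets older than 90 days are reduced to mini-tsidx files
[web]
homePath   = $SPLUNK_DB/web/db
coldPath   = $SPLUNK_DB/web/colddb
thawedPath = $SPLUNK_DB/web/thaweddb
enableTsidxReduction = true
timePeriodInSecBeforeTsidxReduction = 7776000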
Estimating Indexing Volume
• Estimate Input Volume (Data Capacity)
- Verify raw log sizes
- Daily, peak, retained, future volume
- Total number of data sources
- Total number of hosts
- Add volume estimates to data source
inventory / spreadsheet
• A rule of thumb for syslog-type data is that once it has been compressed
and indexed in Splunk, it occupies approximately 50% of its original size:
–  15% for the rawdata file
–  35% for associated index files

Testing Indexing Compression
• Confirm estimates with actual data
–  Create a baseline with real or simulated data
–  Find compression rates
–  Leverage _internal index metrics to determine actual input volumes

• To test specific types of data


–  Index a sample of data, then check the sizes of the resulting directories in
defaultdb
–  See documentation for step-by-step space estimation method:
http://docs.splunk.com/Documentation/Splunk/latest/Capacity/Estimateyourstoragerequirements
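For example, a search along these lines against the _internal index (run where the license master's logs are searchable) reports actual daily indexed volume by sourcetype; b and st are the byte-count and sourcetype fields in license_usage.log:

index=_internal source=*license_usage.log type=Usage
| eval GB = b/1024/1024/1024
| timechart span=1d sum(GB) by st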

Sizing Indexes Example
Assuming
•  No replication
•  Raw data compression is 15% of daily volume
•  Index file compression is 35% of daily volume

Input          Daily Volume  Retention            Visibility                  Estimated Size on Disk  Index
access.log     350 GB        60 days              Web, Security               10.5 TB                 web
secure.log     175 GB        90 days              Security, Ops               7.9 TB                  ops
ecommerce.log  125 GB        6 months (180 days)  Security, Customer Service  11.3 TB                 ecommerce
system.log     250 GB        90 days              Security, Ops               11.3 TB                 ops
Total          900 GB                                                         41.0 TB
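Worked arithmetic for one row, using the assumptions above: access.log produces 350 GB/day x (15% + 35%) = 175 GB/day on disk, and 175 GB/day x 60 days = 10,500 GB ≈ 10.5 TB. The 41.0 TB total is simply the sum of the four estimates (10.5 + 7.9 + 11.3 + 11.3).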

Splunk Apps
• Which Splunk apps are appropriate?
–  Apps are often chosen based on
ê  devices or technologies in the production environment
ê  use cases
ê  inputs
• Apps often have particular infrastructure requirements
• Most apps are free: http://www.splunkbase.com
• While apps can always be added later, plan up front if possible
Apps and Indexes
• Many apps define inputs and indexes
• Check apps against your initial index design and data source inventory
–  You may need to acquire new data sources to support the app
ê  add them to the inventory
–  Identify the indexes created by the app
–  Identify which inputs will map to which indexes
ê  there should be no overlap or conflicts
ê  each data source goes to a single index
ê  include any sourcetype information that is defined in the app
Apps must be integrated with the overall index and inputs

Key Criteria for Index Design
• When to partition data into different indexes
–  Retention
ê  How long should the data be kept? Online? Offline?
ê  All retention settings apply on a per-index basis
ê  All data sources within an index should have the same retention
–  Access
ê  Who can search against the data?
ê  All data sources within an index will have the same visibility
–  access to indexes is controlled by role
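A minimal sketch of retention-driven partitioning in indexes.conf (index names and periods are illustrative; frozenTimePeriodInSecs is the age at which buckets are frozen, i.e. deleted or archived):

# Hypothetical split: ops data kept ~90 days online, ecommerce data kept ~180 days
[ops]
frozenTimePeriodInSecs = 7776000

[ecommerce]
frozenTimePeriodInSecs = 15552000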

Additional Design Criteria
• Search Performance
–  Less common to differentiate here, but may be important
–  Mixed data = mixed lexicon, larger number of keywords, therefore larger index
–  Things searched together can be indexed together
–  Example:
ê  Both a high-volume / high-noise data source and a low-volume data source feed into
the same index
ê  If you search mostly for events from the low-volume data source, the search speed
will be slower than necessary
ê  To improve performance, create a separate index for the high-volume data source

Module 3 Lab Exercise
Time: 15 minutes
• You may want to use the spreadsheet data_sizing.xlsx
• Task: Estimate the disk space required
• Task: Index Design
–  Identify the indexes
–  Estimate the Splunk license required for indexing

• Challenge: Identify potential apps


–  Identify apps for the environment
–  Add any index and input information from the apps to your solution

Module 4:
Infrastructure Planning

Module Objectives
• Define environment requirements
• Understand sizing factors for servers
• Understand reference servers
• Understand the planning impact of clustering

Foundation for Splunk Deployment
• A low latency network
–  1 Gb is the minimum bandwidth
–  Indexers and search heads need high bandwidth, low latency links for distributed search; forwarders can connect over a standard Internet connection
• A solid enterprise wide time infrastructure
–  NTP for time synchronization
• A solid Domain Name Service (DNS)
–  Splunk can significantly increase load on DNS
• Turn off Transparent Huge Pages (THP)
–  THP is a feature on some Linux distros
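One common way to disable THP at runtime is shown below (an assumption: the sysfs paths vary by distribution, and the change must also be made persistent across reboots, e.g. via a boot script or tuned profile):

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag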

Basic Sizing Considerations

•  Amount of incoming data
•  Amount of indexed (stored) data
•  Number of concurrent users
•  Number of scheduled searches
•  Types of searches
•  Specific Splunk apps
–  Can affect incoming and indexed data
–  Can affect number of concurrent searches
How Apps Affect Infrastructure Sizing
•  Review the documentation for each app
–  Identify any impact on the Splunk deployment
•  For example
–  Splunk App for Enterprise Security places a heavy load on
search heads
–  Splunk App for VMware and Checkpoint LEA app both require a
heavy forwarder for data collection
–  More details about the Splunk App for Enterprise Security and
the Splunk App for VMware are on the following pages

Splunk App for Enterprise Security (ES)
• Infrastructure Impacts
–  A dedicated search head or search head cluster
–  One indexer per 100GB data indexed per day
ê  Assumes 15 correlation searches running
ê  Additional data storage for accelerated data models, estimated as
data indexed per day * 3.4

http://docs.splunk.com/Documentation/ES/latest/Install/Overview

Splunk App for VMware
• Infrastructure Impacts
–  This app adds approximately 300 MB of data per ESXi host per day to Splunk
–  Requires data collection nodes running as VMs in the VMware environment
ê  the data collection node is distributed as an Open Virtual Appliance (OVA) file
ê  each data collection node can collect data from 40 or fewer ESXi hosts, with a ratio
of 25 to 30 virtual machines per ESXi host

http://docs.splunk.com/Documentation/VMW/latest/Installation/Aboutthismanual

Splunk App for IT Service Intelligence (ITSI)
• Infrastructure Impacts
–  A dedicated search head or search head cluster
–  Additional infrastructure may be needed, depending on the number of Key
Performance Indicators (KPIs) that are tracked
ê  The documentation has recommendations

http://docs.splunk.com/Documentation/ITSI/latest/Configure/Performancerecommendations

Splunk User Behavior Analytics (UBA)
• Hardware requirements
–  50GB disk space for Splunk UBA installation
–  500GB additional disk space for metadata
–  16 CPU cores
–  64 GB RAM

http://docs.splunk.com/Documentation/UBA/latest/Install/Requirements

Splunk Components
What are the requirements for each type of Splunk instance?

• Splunk Enterprise instances: Indexer (search peer), Search Head, Deployment Server, License Master, Heavy Forwarder, Cluster Master, Search Head Cluster Deployer
• Universal Forwarder / Deployment Client
The heavy forwarder and universal forwarder are discussed in a later module. Sizing and other considerations for all other components are discussed here.

Component Types – Indexer
• The Splunk component that indexes data, transforming raw data into
events and placing the results into an index
• It also searches the indexed data in response to search requests

(Diagram: data flows into the index at index time; a search head queries the indexer at search time.)

Indexer – Reference Server
•  One reference indexer per 250 GB / day ingestion rate in a distributed deployment
•  Assumes:
–  Peak load is about 3x daily average
–  Light searching for 2 or 3 concurrent users
•  Need additional servers for:
–  Increased reporting, searching, users
•  Indexing alone can use 4 full cores at full load
•  Each concurrent search needs a full core
–  More servers reduce search duration and increase search throughput

Reference indexer
Hardware: Intel 64-bit Xeon or equivalent chip architecture
CPU: 2 x 6-core (12 cores total) at 2+ GHz per core
Memory: 12 GB RAM
Disk: Disk subsystem capable of 800 IOPS (e.g. 8x15K RPM SAS drives in RAID 1+0 configuration)
Network: Standard 1 Gb Ethernet NIC; optional 2nd NIC for management network
OS: Linux or Windows - 64-bit version

Horizontal Scaling
• Adding indexers scales capacity
–  Allows a greater daily indexing volume
–  Speeds searches
• When using multiple indexers
–  Use Splunk’s built-in forwarder load balancing (see the outputs.conf sketch below)
–  Use distributed search
• When in doubt, the first rule of scaling is to add another commodity indexer
(Diagram: auto load-balanced forwarders send data to multiple indexers, which are searched by a search head.)
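A minimal outputs.conf sketch for the forwarder side (indexer host names and port 9997 are placeholders; useACK is optional and relates to indexer acknowledgement, covered later):

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# The forwarder automatically load balances across the listed indexers
server = idx1.example.com:9997, idx2.example.com:9997
useACK = true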

Increasing Indexer Utilization
• Normally, the indexer uses approximately 4-6 cores for indexing
–  Possibly underutilizing the indexer hardware
• You can configure multiple pipeline sets to increase hardware utilization (see the server.conf sketch below)
–  Available since 6.3
(Diagram: a single splunkd with one pipeline set vs. a splunkd running multiple pipeline sets.)
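For illustration, pipeline sets are configured in server.conf on the indexer; a sketch, not a recommendation, since each additional pipeline set consumes more cores and memory and should be tested first:

[general]
parallelIngestionPipelines = 2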

Disk Storage Details – About IOPS
• Input/Output Operations Per Second (IOPS) measures disk throughput
–  Hard drives read and write at different speeds
ê  average IOPS is the blend between read and write speed
–  To get the most IOPS, choose drives with
ê  high rotational speeds
ê  low average latency and seek times
• Splunkbase has an app to analyze disk performance using Bonnie++
https://splunkbase.splunk.com/app/3002/
• Links to additional information on determining IOPS
www.symantec.com/connect/articles/getting-hang-iops-v13
www.cmdln.org/2010/04/22/analyzing-io-performance-in-linux
Disk Storage Details
• Splunk indexing is read / write intensive
• Recommended RAID setup is RAID 10 (1 + 0)
–  Provides fast reads and writes with the greatest redundancy
–  Do NOT use RAID 5 since the duplicate writes make it too slow for high
performance indexing

Device Type IOPS Interface


7,200 RPM SATA drives HDD ~75 - 100 SATA 3 Gb/s
10,000 RPM SATA drives HDD ~125 - 150 SATA 3 Gb/s
10,000 RPM SAS drives HDD ~140 SAS
15,000 RPM SAS drives HDD ~175 - 210 SAS

Source: Wikipedia

Disk Storage Details – SAN and NAS
• Storage Area Networks (SAN) and Network-attached Storage (NAS)
–  Not a good choice for hot and warm buckets (db)
–  Suitable for cold buckets (colddb)
–  searches over long time periods that utilize colddb will be slower
• High performance SAN can be used in environments with high speed networking and low latency
• IOPS and reliability are the definitive factors
(Diagram: defaultdb with db (hot and warm buckets) on local disk and colddb (cold buckets) on SAN/NAS.)
Storage Requirements
• The disk storage needed on an indexer is primarily determined by the size
of all the indexes, plus base storage for the OS and configuration files
• Additional disk storage is needed for
–  Indexer Clustering
ê  data replication
ê  discussed later in this module
–  Summarization and Acceleration
ê  data models
ê  report acceleration
ê  summary indexing
ê  discussed in the Search Considerations module
Virtualizing Splunk
• You can virtualize any Splunk instance if you meet the minimum resource
requirements
• Tips for virtualization
–  Use locally attached volumes; direct access to a dedicated volume is best
–  Separate search head VMs from indexer VMs, as they burst CPU together
–  Ensure that sufficient resources are reserved – do not overcommit CPU or
memory for Splunk VMs
–  Expect virtualization to reduce performance by 10% or more

• Virtualization tech brief contains details


www.splunk.com/content/dam/splunk2/pdfs/technical-briefs/splunk-deploying-vmware-tech-brief.pdf

Component Types – Search Head
• The search head handles search management functions
–  Directs search requests to a set of search peers and then…
–  Federates or merges the results and presents them to the user
(Diagram: the search head sends requests to its search peers (indexers) and receives their responses.)

Search Head – Reference Server
• 1 per 10 - 12 concurrent jobs / queries
–  1 active job per core
• Search heads mostly aggregate results
–  Some types of searches may create bottlenecks
• Scale by adding indexers
• Indexers do the majority of the work
–  More indexers => jobs run faster => fewer concurrent jobs => better search throughput

Reference search head
Hardware: Intel 64-bit Xeon or equivalent chip architecture
CPU: 4 x 4-core (16 cores total) at 2+ GHz per core
Memory: 16 GB RAM
Disk: 4 x 300 GB, 10K RPM SAS hard disks, configured in RAID 1+0
Network: Standard 1 Gb Ethernet NIC; optional 2nd NIC for management network
OS: Linux or Windows - 64-bit version


Sizing Factors for Searching
• Use cases determine search needs
–  Number of searches (concurrent & total)
–  Number of users (concurrent & total)

• Types of searches
–  Reporting, dashboard, ad-hoc

• Jobs
–  Summarization, Alerting, Reporting

• Apps may increase the search load


–  Real time and/or scheduled searches may be features of apps

• Plan for expansion as adoption grows


Component Types – DMC
• Distributed Management Console (DMC)
–  Splunk Enterprise's monitoring tool
–  Provides detailed topology and performance information
• A specialized search head
–  For sizing: follow the reference server guidelines for search heads
–  Should not be a production search head
–  Can run on a shared instance with caveats
–  Only admins should access
(Diagram: an admin uses the DMC to monitor the search head cluster, cluster master, deployment server, forwarders, and indexers.)


http://docs.splunk.com/Documentation/Splunk/latest/DMC/DMCoverview
Additional Component Types
License Master: Allocates licensing capacity and manages licensing usage of all instances
•  All indexers, search heads, cluster masters, deployers and deployment servers should be slaves to a license master
•  Contacted over network frequently by license slaves
Deployment Server: Manages Splunk configuration files for deployment clients
•  Operates on a "pull" model – clients "phone home" at intervals over network
•  Can handle 2000 phone-homes/minute at default settings
Job Server: A search head that only runs scheduled searches
•  Search head component sizing applies
Cluster Master: Regulates the functioning of an indexer cluster
Deployer: Distributes configurations to search head cluster members

Additional Components - Sizing
Component          CPU   Memory  Disk  Network  Factors
License Master     low   low     low   1 Gb     number of slaves
Deployment Server  med*  med*    low   1 Gb     number of clients and number of apps + config files
Cluster Master     med   med     low   1 Gb     required for indexer cluster
Deployer           low   low     low   1 Gb     number of SHC members
*CPU & memory can spike during downloads

•  Splunk recommends that you dedicate a host for each role
–  You can enable multiple Splunk server roles on a server with caveats
•  Cluster Master and Deployer are discussed in more detail in the Clustering section
Staging & Testing Environments
• Part of your Splunk deployment should include a separate "sandbox" or testing / staging environment
–  It should have the same version of Splunk as production
• Sizing the testing environment depends on the nature of your tests
–  Test Inputs: a standalone indexer with minimal performance and capacity
–  Test Configurations: a minimum set of components (one Search Head, one Indexer, one Deployment Server, …)
–  Test Performance: an accurate duplication of the production environment
Best Practice
Authentication / Authorization
• LDAP (Lightweight Directory Access Protocol) / AD (Active Directory) for
authentication and group management
• SSO (Single sign-on) allows a user to log in once and gain access to all
systems without being prompted to log in again
–  Protocols: Kerberos, Security Assertion Markup Language (SAML)
–  Requires a proxy with trusted connection
–  Splunk Professional Services help may be required to install

• In scripted authentication, a user-generated Python script serves as the middleman between the Splunk server and an external authentication system such as PAM or RADIUS
Access Control
• Integrate authentication / access control with LDAP and Active Directory
• Map LDAP and AD groups to flexible Splunk capability or data-access roles
–  Define any search as a filter

(Diagram: LDAP / AD groups map to Splunk roles, which grant capabilities and filters such as Manage Indexes, Manage Users, Share Searches, Save Searches, ERP App Access, and PCI Data Access.)
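A rough sketch of this mapping (role name, indexes, filter, LDAP strategy name, and group DN are all hypothetical): the role is defined in authorize.conf and an LDAP group is mapped to it in authentication.conf:

# authorize.conf
[role_security_analyst]
importRoles = user
srchIndexesAllowed = web;security
srchFilter = NOT tag=pci

# authentication.conf ("corpLDAP" is an example LDAP strategy name)
[roleMap_corpLDAP]
security_analyst = CN=SecurityTeam,OU=Groups,DC=example,DC=com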

Security, Privacy, Integrity Measures
• HTTPS transport is available end-to-end
–  Create your own certificates or acquire from a valid 3rd party
–  Distributed search (enabled by default between search head and indexer)
–  Forwarder to indexer over TCP
–  Web browser access to Splunk Web
docs.splunk.com/Documentation/Splunk/latest/Security/AboutsecuringyourSplunkconfigurationwithSSL
• Indexer Acknowledgement
• Index Splunk’s configurations and logs to track changes
• Event auditing
docs.splunk.com/Documentation/Splunk/latest/Security/AuditSplunkactivity
Security, Privacy, Integrity Measures (cont.)
• Disable Splunk Web when it is not required
–  For example, indexers in a distributed deployment do not require Splunk Web

• Always change the default password


–  This is often overlooked on forwarders

• Hardening Splunk host


–  Follow best practices for hardening your operating system
–  docs.splunk.com/Documentation/Splunk/latest/Security/WhatyoucansecurewithSplunk
–  benchmarks.cisecurity.org/downloads/

Clustering

Clustering Overview
Splunk offers two different, independent clustering capabilities
• Indexer clustering
–  replicates data (buckets) based on configured policies
–  provides increased data integrity and availability

• Search head clustering


–  replicates knowledge objects and search artifacts across a set of search heads
–  provides increased search accessibility and improved resource scheduling

• Discussed in detail in Splunk Cluster Administration class

About Indexer Clusters
• Indexer clusters – A group of Splunk Enterprise nodes that, working in
concert, provide a redundant indexing and searching capability
–  docs.splunk.com/Documentation/Splunk/latest/Indexer/Aboutclusters

• Multisite indexer cluster – An indexer cluster that spans multiple sites, such
as data centers
–  Each site has its own set of peer nodes and search heads
–  Each site also obeys site-specific replication and search factor rules
–  docs.splunk.com/Documentation/Splunk/latest/Indexer/Multisiteclusters
–  The cluster administrator defines the "sites"
ê  can represent a physical location (city, data center, rack) or something else

Single-site Indexer Cluster Overview
•  Cluster Master
–  Controls and manages index replication
–  There can only be one master
•  Peer node
–  Indexes data from inputs/forwarders
–  Replicates data to other peer nodes as instructed by the master
•  Search head
–  Works the same as any Splunk search head
–  Required component of indexer cluster
•  Forwarders
–  Send data to peer nodes
(Diagram: forwarders with autoLB & useACK enabled send data to the peer nodes; the cluster master coordinates data replication among the peers; a search head runs distributed searches across the cluster.)
Multisite Indexer Cluster
•  Allows for an extra layer of data partitioning
–  Indexers are grouped in "sites" (defined by the Splunk Admin)
•  Multisite clusters offer two key benefits:
–  Disaster recovery
ê  stores index copies at multiple sites
ê  provides automatic site-failover capability
ê  In a disaster, indexing and searching continue on the surviving sites
–  Search affinity
ê  limits searches to assigned site as much as possible
ê  greatly reduces WAN network traffic
(Diagram: a multisite cluster with sites HQ (site1), NY (site2), UK (site3), and JP (site4) replicating buckets to each other under a single cluster master.)
Cluster Master
• Cluster Master manages an indexer cluster
–  Coordinates the replicating activities of the peer nodes
–  Tells search heads where to find data
–  Manages the configuration of peer nodes
–  Orchestrates remedial activities if a peer becomes unavailable

• There can be only one cluster master, even in a multisite cluster


–  A stand-by cluster master should exist offline for manual cluster master
failover
–  Note that the cluster will continue to operate while the cluster master is offline

Replication and Search Factors
•  Peer nodes copy buckets to each other (replication)
–  The copied buckets may be complete buckets or contain only rawdata
–  The replication factors apply to the entire cluster
–  Replication must be enabled for each index
ê  the default is not to replicate indexes
•  Replication factor (RF)
–  Specifies how many total copies of rawdata the cluster should maintain
–  This sets the total failure tolerance level
•  Search factor (SF)
–  Specifies how many copies will be searchable
ê  searchable buckets have both rawdata and index files
–  Determines how quickly you can recover the search capability
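As a configuration sketch (factor values and the pass4SymmKey are placeholders), RF and SF are set in server.conf on the cluster master, and each index to be replicated sets repFactor = auto in indexes.conf on the peer nodes:

# server.conf on the cluster master
[clustering]
mode = master
replication_factor = 3
search_factor = 2
pass4SymmKey = changeme

# indexes.conf on the peer nodes
[web]
repFactor = auto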
Indexer Clustering Requirements
• A cluster master
–  Plus a "stand-by" cluster master, powered off

• Indexers
–  Single site mode
ê  at least "replication factor" peer nodes in single-site mode
–  that is, for a replication factor (RF) of 3, three peer nodes are required
ê  best practice: at least RF + 1 nodes as a minimum
–  Multisite mode
ê  at least 2 peer nodes per site in multi-site mode

Sizing Guidelines
• Peer Nodes
–  Peer nodes should meet the minimum reference server requirements
–  Replication and search factors affect the peer nodes
ê  CPU and network load will be affected by larger factors
ê  Disk sizing is directly affected by replication and search factors
• Cluster Master
–  No published minimums
–  Must have sufficient CPU and network resources to service requests

Storage Requirements for Clustering
• With index clustering, you must consider the replication factor (RF) and
search factor (SF) to arrive at total storage requirements
–  Total rawdata disk usage = rawdata total * RF
–  Total index disk usage = index data total * SF

• NOTE: This does not include disk space needed to rebuild search factor if
required
–  docs.splunk.com/Documentation/Splunk/latest/Indexer/Systemrequirements#Storage_considerations

Estimating Disk Usage
• For this example, assume
–  rawdata on disk = ~ 15% of daily index data
–  index files on disk = ~35% of daily index data

• Note that the per day storage must still be multiplied by the # days retention!
Daily index data = 100 GB      RF=3 & SF=2        RF=3 & SF=2      RF=3 & SF=3
                               on 3 peer nodes    on 6 peer nodes  on 6 peer nodes
rawdata (15 GB daily)          15 * 3 = 45 GB     15 * 3 = 45 GB   15 * 3 = 45 GB
index files (35 GB daily)      35 * 2 = 70 GB     35 * 2 = 70 GB   35 * 3 = 105 GB
Total size across cluster      115 GB             115 GB           150 GB
Per peer storage per day       115 / 3 = 38.3 GB  115 / 6 = 19 GB  150 / 6 = 25 GB

Peer Failure Impact on Disk Sizing
• When a peer is lost, the cluster master will initiate additional replication across the surviving peers to return to the specified RF and SF
• This will consume additional disk space
–  Plan for additional disk space for a sustained outage
(Diagram: 4 peers, replication factor = 3, search factor = 2. Buckets 1-5 are spread across Peers A-D as Primary (origin), Searchable backup, and Rawdata-only copies; when a peer is lost, the surviving peers take on its copies.)
Search Head Clusters
• A search head cluster is a group of Splunk Enterprise search heads that
serves as a resource for searching
• You can run the same searches, view the same dashboards, and access
the same search results from any search head in the cluster
• To achieve this, the search heads in the cluster share
–  Configurations and apps
–  Search artifacts
–  Job scheduling

docs.splunk.com/Documentation/Splunk/latest/DistSearch/AboutSHC
Search Scaling: Do you need a cluster?
• To mitigate performance issues as the search load increases
–  Distribute search load
ê  optimize scheduled searches to run on non-overlapping time slots
ê  use job servers to isolate scheduled searches, real-time searches, and ad-hoc
searches
–  Limit resource usage
ê  limit the time range of end-user searches
ê  configure user roles to limit the number of concurrent real-time searches
–  Increase the number of peer nodes (indexers)
–  Add more search heads
–  Use a search head cluster
Search Head Clustering
• Benefits
–  Simple horizontal scaling
–  Always-on search services
–  Reliable alerts
–  Consistent knowledge objects across cluster
–  Eliminates the need for a Job Server
–  No more NFS!
–  Configuration sharing and job artifacts use local storage via automatic replication
(Diagram: users reach the search head cluster through an external load balancer; a deployer pushes configurations to the cluster members.)

Search Head Clustering Requirements
• Requirements
–  At least 3 search heads
–  A deployer

• Sizing guidelines – Search Heads


–  Search heads should meet the minimum reference server requirements

• Sizing guidelines – Deployer


–  No published minimums
–  Must have sufficient CPU and network resources to service requests and to push
configurations

Notes for Search Head Clustering
• Summary indexes must be forwarded to the indexer tier
–  This is a general best practice, but required for search head clustering
–  This may increase the disk space required on the indexers

• In search head cluster, add more search heads to horizontally scale the
capacity
–  This assumes that the indexer layer has sufficient search capacity
–  If capacity is insufficient, this will create a bottleneck at the indexers/peer nodes
–  Remember that the rule of thumb for scaling is "add another indexer"
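A sketch of the forwarding setup on each search head cluster member (group name and indexer addresses are placeholders): outputs.conf turns off local indexing and sends all data, including summary index data, to the indexer tier:

# Turn off local indexing on the search head
[indexAndForward]
index = false

[tcpout]
defaultGroup = primary_indexers
forwardedindex.filter.disable = true
indexAndForward = false

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997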

Final Notes on Clustering
• Planning is essential
–  What are the requirements for high availability and disaster recovery?
–  How quickly must recovery occur?
–  How many servers can be lost before service degrades or data is lost?

• You can mix clustered and non-clustered servers


–  You can have a mix of clustered and standalone indexers
–  You can have a mix of clustered and standalone search heads
ê  this may be the best solution for some special-purpose or single app search heads

Module 4 Lab Exercise
Time: 10-15 minutes
• Task: Size the infrastructure
–  How many indexers, search heads and other components?

• Challenge: Indexer Clustering


–  What if data replication for ecommerce data is desired?

Module 5:
Data Collection

Module Objectives
• Review logging basics
• Understand the difference between agent and agentless data collection
• Compare data collection methods
(Diagram: the six deployment methodology phases – Infrastructure, Collection, Comprehension, Querying, Operation, Integration.)

What Splunk Can Index
Splunk is the platform for machine data
• Core sources: log files, configs, messages, alerts, metrics, scripts, changes, tickets
• Customer-facing data: click-stream data, shopping cart data, online transaction data
• Data outside the data center: manufacturing, CDRs & IPDRs, power consumption, RFID data, GPS data
• Windows: registry, event logs, file system, sysinternals
• Linux / Unix: configurations, syslog, file system, ps, iostat, top
• Virtualization and cloud: hypervisor, guest OS, apps, cloud
• Applications: web logs, Log4J, JMS, JMX, .NET events, code and scripts
• Databases: configurations, audit / query logs, tables, schemas
• Networking: syslog, SNMP, Netflow, IDS
Logging Components
(Diagram: logging components – originators such as firewalls, switches, and network devices send event messages over the network; machines without a forwarder send directly to the indexer(s); a Splunk forwarder on a Linux machine can act as a relay; Windows and Linux machines with a forwarder installed act as agents; a production application can send events directly over HTTPS, and a mainframe can send events via SNMP.)
Data collection: agentless vs. agent
Agentless
• Can use a variety of transports: TCP, UDP, SNMP, HTTP
• Splunk agentless options
–  HTTP data collection
–  Splunk App for Stream
–  Direct monitoring of TCP/UDP ports

Agent
• A software program (service or daemon) that collects information and pushes it over the network to a central location
• Splunk agents
–  Universal Forwarder
–  Heavy Forwarder

Data Collection Overview
•  File based (monitor) inputs
–  Using a shared file system as a log collector
–  Using a Splunk forwarder on the log collector
•  Network inputs
–  Directly connected to a single indexer
–  Collected on a syslog-ng / rsyslog server with a Splunk forwarder
•  Windows events and perfmon (WMI)
–  Local data collection with a forwarder
–  Remote data collection
•  Special data collection options
–  Splunk App for Stream
–  HTTP Event Collector
These alternatives are discussed in detail on the following slides.

Alternatives for File Inputs – Log Collector
• Share the log directory on production servers using a network file server of your choice (agentless)
• Have the Splunk indexer monitor these as local directories over the network
• Does not scale beyond a single indexer
(Diagram: production servers write logs to a network file server; a single indexer monitors the shared directories over the network.)

Alternatives for File Inputs – Mixed

• A mixed or transitional alternative
• A forwarder sends data from the "log repository" to the Splunk indexer(s)
(Diagram: production servers write logs to a network file server with a forwarder installed; the forwarder sends the data to the indexers.)

Alternatives for File Inputs – Forwarding
• Universal forwarders installed on the production servers, forwarding to the indexers
• Advantages
–  Load balancing and buffering
–  Supports scripted inputs
–  More scalable
–  Better performance
–  More secure
(Diagram: production servers with forwarders installed; automatic load balancing is configured on all forwarders.)
Best Practice
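A minimal outputs.conf sketch for the forwarders, assuming two indexers; server names and ports are placeholders, and automatic load balancing across the server list is the default behavior:

    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    # forwarder switches to another indexer in the list every 30 seconds
    server = idx1.example.com:9997, idx2.example.com:9997
    autoLBFrequency = 30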


Alternatives for Network Inputs – Direct to Indexer
• Use network input to send directly to Splunk indexer
–  What happens when Splunk needs to be stopped for maintenance?
–  What happens if you must clean or rebuild an index?
–  What happens if the indexer becomes very busy?
–  What if Splunk is not running as root (required to bind to ports below 1024)?
• Does not scale beyond a single indexer
• Not recommended
(Diagram: network devices send TCP / UDP directly to a single indexer.)

Alternatives for Network Inputs – syslog
• Input to syslog, then Splunk monitors syslog output file
–  Provides a buffering mechanism
• Can take advantage of syslog-ng / rsyslog filtering features
• A forwarder is installed on the syslog-ng / rsyslog server
–  Provides the advantage of load balancing
•  Offers greater resiliency and increases scalability
(Diagram: network devices send syslog to a syslog-ng / rsyslog server with a forwarder installed, which load-balances events to the indexers.)
Best Practice
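An illustrative inputs.conf for the forwarder on the syslog server, assuming rsyslog / syslog-ng writes one directory per sending host under /var/log/remote; the path, index, and segment position are assumptions to adapt:

    [monitor:///var/log/remote/.../*.log]
    sourcetype = syslog
    index = network
    # take the host name from the 4th path segment: /var/log/remote/<host>/file.log
    host_segment = 4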
Alternatives for Windows Inputs – WMI Collector
• A Splunk forwarder running on a Windows host performs remote Event Log / WMI monitoring
• Will not scale horizontally to many production servers
• Can only monitor a limited set of Windows inputs
(Diagram: a Windows heavy forwarder polls the production Windows servers over WMI and forwards to the indexers.)

Alternatives for Windows Inputs – Forwarder
• Each Windows production server has a Universal Forwarder installed
–  Collecting performance monitoring locally and forwarding to indexer
–  Scalable
–  More secure and does not require domain privileges
–  Can collect any input, not just Windows remote inputs

(Diagram: universal forwarders on the production Windows servers send data to the indexers.)

Best Practice
Forwarding from Windows Systems
• Universal forwarder (local agent)
–  Offers many types of data sources
–  Minimizes network overhead
–  Reduces operational risk
• Remote polling (agentless)
–  Uses the native WMI API to collect event logs and performance data

Source               Universal Forwarder    Remote Collector (WMI)
Windows Event Logs   Yes                    Yes (must know the name of the event log)
Performance          Yes                    Yes
Registry             Yes                    No

Windows Forwarding – Method Comparison
Local Forwarder
•  Performance: requires less CPU; performs basic data compression / encryption; more memory-intensive
•  Deployment: easier if you have control of the base OS build or many data sources; can use the deployment server
•  Possible concerns: the UF is limited to 256 KBps of network bandwidth by default

Remote Collector (WMI)
•  Performance: requires more CPU; more network intensive; not recommended for highly audited hosts (such as domain controllers)
•  Deployment: good for a limited set of data; may be the only option without build control or admin rights on the target host
•  Possible concerns: starts to poll less frequently over time if unable to contact a host for a period of time (to prevent a potential DoS attack); not advised for machines that are frequently offline

HTTP Event Collector
• Component of Splunk Enterprise since 6.3
–  Can be configured for high-volume inputs with multiple collection points
(indexers or heavy forwarders)
http://dev.splunk.com/view/event-collector/SP-CAAAE73

• A token-based HTTP input that is secure and scalable


• Sends events to Splunk without the use of forwarders
–  Can facilitate logging from distributed, multi-modal, and/or legacy environments
–  Log data from a web browser, automation scripts, or mobile apps

• Discussed in Splunk Administration
(Diagram: a production server application sends events over HTTPS to an indexer or heavy forwarder.)
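An illustrative HEC request; the host, port, token, and event fields are placeholders:

    curl -k https://splunk.example.com:8088/services/collector/event \
         -H "Authorization: Splunk 01234567-89ab-cdef-0123-456789abcdef" \
         -d '{"event": "user login", "sourcetype": "myapp:events", "host": "web01"}'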

Splunk App for Stream
• Captures real-time streaming wire
data from anywhere in your
datacenter
• Filter and aggregate wire data
easily and automatically into
Splunk
• https://apps.splunk.com/app/1809/

Comparison of Remote Collection Options
Best
•  Universal Forwarders (UF): reliable delivery, encryption, compression; distribution and failover over multiple Splunk indexers; can execute scripts and collect program output remotely; defaults to 256 KBps of network bandwidth (configurable)
•  HTTP Event Collector: secure and scalable; requires developers to code the HTTP post that will be indexed
Good
•  Heavy Forwarders: same as UF except parses data before forwarding and has no bandwidth limit; more load on the source system
Not Optimal
•  NFS / SMB: not as scalable; possible latency with many files
•  syslog: source info not as fine-grained; use rsyslog / syslog-ng instead for capture buffering and hostname extraction (makes TZ processing in Splunk better – no need to transform host); use a forwarder to read the syslog files and distribute over the indexers
•  Other agentless (including remote WMI): not as scalable; may be less secure
Not Recommended
•  FTP, batch scripts: not real-time; can be hard to maintain and manage
Managing Initial Data Collection
•  Production servers often have large volumes of historical data
–  Can cause license violations when initially indexing
–  May not be worth collecting
•  Before starting to index
1.  Clean the directory – archive, then delete old data that should not be indexed
2.  Move historical data to a separate location (for example, from /var/log to /old_log) or "blacklist" it
3.  Start monitoring the original directory – index recent and current data
4.  Gradually batch or upload historical files to the index as needed
•  Set up a regular schedule for rotating, archiving, or removing older log files after they are indexed
Best Practice
Logging Best Practices
• Human-readable
–  Key-value pairs
–  Text format
• Traceable
–  Timestamps for every event
–  Use unique identifiers
–  Identify the source
• Collect events at multiple levels
–  Infrastructure status
–  Application status
–  Semantic events
ê  for example: web clicks, financial trades, cell phone connections and dropped calls, audit trails
• For more information
–  Splunk Logging Best Practices: http://dev.splunk.com/view/logging-best-practices/SP-CAAADP6
–  NIST – Guide to Computer Security Log Management: csrc.nist.gov/publications/nistpubs/800-92/SP800-92.pdf
Best Practice
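A hypothetical event that follows these practices – human-readable key-value pairs, a full timestamp, unique identifiers, and a clear source (all field names and values are illustrative):

    2016-05-13T09:21:07.345-0400 level=INFO event=order_submitted order_id=8843
        user=jdoe cart_total=149.95 currency=USD session_id=f3a9c2 app=webstore host=web01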
Data Collection Best Practices
• Splunk works better with smaller log files
–  Use the size of the log instead of time to roll logs more frequently

• Log rotation
–  Keep the latest two logs in text format before you compress, otherwise you may
be missing delayed events
• Use blacklist to eliminate compressed files from monitor input
–  This prevents duplication of events

• Log to a local file system

Best Practice
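A minimal inputs.conf sketch applying these practices; the path, index, and thresholds are illustrative:

    [monitor:///var/log/app]
    sourcetype = app:log
    index = app
    # do not index the compressed, rotated copies (prevents duplicate events)
    blacklist = \.(gz|bz2|zip)$
    # optionally skip files not modified recently during initial onboarding
    ignoreOlderThan = 14d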
Module 5 Lab Exercise
Time: 30 minutes
• Create a topology diagram for the environment

Module 6:
Forwarders and Deployment
Management

Module Objectives
• Review forwarder types
• Understand how to manage forwarder installation in an enterprise environment
• Understand configuration of all Splunk components using
–  Deployment Server
–  Cluster Master
–  Deployer
(Diagram: the six deployment methodology phases – Infrastructure, Collection, Comprehension, Querying, Operation, Integration.)

Types of Forwarders
• Universal forwarder
–  A streamlined binary package that contains only the components
needed to forward data
–  Use the universal forwarder whenever possible
ê  Smaller, more efficient
ê  Network throughput defaults to 256 KBps (configurable via maxKBps in limits.conf)
• Heavy Forwarder
–  Uses the Splunk Enterprise binary, with all capabilities
–  Parses data before forwarding, so use a heavy forwarder when
ê  routing events
ê  filtering more than 80% of incoming events
ê  anonymizing or masking data before forwarding to indexer
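The throughput ceiling is set in limits.conf on the forwarder; a sketch raising it (the value is illustrative, and 0 means unlimited):

    [thruput]
    # raise the default 256 KBps ceiling to 1 MBps
    maxKBps = 1024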
Forwarder – Scripted Installation
• Use an automated method to install Splunk forwarders
–  Scripted installation
–  Another server management / configuration management tool

• Set the forwarder as a deployment client as part of the installation


• Sample scripts
–  answers.splunk.com/answers/34896/simple-installation-script-for-universal-forwarder
–  answers.splunk.com/answers/100989/forwarder-installation-script
–  docs.splunk.com/Documentation/Splunk/latest/Forwarding/
Remotelydeployanixdfwithastaticconfiguration
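A minimal sketch of such a script for Linux; the package version, install path, credentials, and deployment server address are placeholders:

    #!/bin/sh
    # install the universal forwarder and register it with the deployment server
    tar xzf splunkforwarder-6.4.0-Linux-x86_64.tgz -C /opt
    /opt/splunkforwarder/bin/splunk start --accept-license --answer-yes --no-prompt
    /opt/splunkforwarder/bin/splunk set deploy-poll ds.example.com:8089 -auth admin:changeme
    /opt/splunkforwarder/bin/splunk enable boot-start
    /opt/splunkforwarder/bin/splunk restart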

Splunk Deployment Tools
(Diagram: the Deployer distributes search-time configurations to the search heads; the Cluster Master distributes indexing and parsing app configurations to the indexers; the Deployment Server distributes input-time configurations to the forwarders.)

Deployment Management
Type of Instance                 Manage Configurations with
Forwarder                        Deployment Server or another configuration management tool
Stand-alone Search Head          Deployment Server or another configuration management tool
Stand-alone Indexer              Deployment Server or another configuration management tool
Search Head Cluster Member       Deployer only
Peer Node in Indexer Cluster     Cluster Master only

Deployment Server
• A centralized configuration manager that delivers updated content to deployment clients
–  Units of content are known as deployment apps
–  Do NOT store configuration in $SPLUNK_HOME/etc/system/local on clients
ê  system-level configurations on clients cannot be overridden by the Deployment Server
(Diagram: the deployment server delivers deployment apps to forwarders and to indexers that are not in a cluster.)
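An illustrative serverclass.conf on the deployment server, mapping a class of clients to a deployment app; the class, whitelist pattern, and app name are placeholders:

    [serverClass:linux_web]
    whitelist.0 = web-*.example.com

    [serverClass:linux_web:app:web_inputs]
    # restart splunkd on the client after the app is delivered
    restartSplunkd = true
    stateOnClient = enabled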

Deployment Server – Rules of Thumb
• One reference server safely handles approximately 2,000+ polls per minute
• Be sensitive to the phoneHomeIntervalInSecs attribute in deploymentclient.conf
–  Default client poll interval is 60 seconds
–  Prioritize servers that need more frequent updates
–  In general, adjust the client polling interval to scale

• For more than 50 deployment clients


–  The deployment server should be on its own server
ê  otherwise, it can share a server with another component
docs.splunk.com/Documentation/Splunk/latest/Updating/Calculatedeploymentserverperformance
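An illustrative deploymentclient.conf on a client; the deployment server address is a placeholder:

    [deployment-client]
    # default is 60 seconds; lengthen it for large fleets
    phoneHomeIntervalInSecs = 600

    [target-broker:deploymentServer]
    targetUri = ds.example.com:8089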

Config Management – Deployment Server
• When using Forwarder Management or the Deployment Server (DS)
–  Build an install script/package for clients with only the files needed to contact the
DS (basic installation + deploymentclient.conf)
–  Clients will get the rest of the configuration information from the DS

• Use in combination with another configuration management (CM) tool for


–  Authentication certificates, passwords, log-local.cfg,
deploymentclient.conf, upgrades of forwarder software, remote boot-start
of production system
–  Use other CM tools for rare tasks, deployment server for routine tasks

Deploy Apps to Search Head Cluster
• Use the Deployer to distribute apps to SHC members
–  DO NOT stage any out-of-the-box apps in deployer (search, launcher, etc.)
–  All members of the cluster will receive the same apps

• You cannot use the Deployment Server to distribute apps to SHC members
• Deployer supports both push and pull mechanisms
–  Can push apps to search head cluster members
–  Can be polled by new or restarted search head cluster members for updates

• Knowledge Objects that are created by users in Splunk Web


–  are automatically replicated
–  are not sent to the Deployer
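After staging apps in $SPLUNK_HOME/etc/shcluster/apps/ on the deployer, push them to the cluster; the target member URI and credentials are placeholders:

    splunk apply shcluster-bundle -target https://sh1.example.com:8089 -auth admin:changeme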
Deploy Apps to Indexer Cluster
• Use the Cluster Master to distribute apps to the peer nodes
–  DO NOT stage any out-of-the-box apps in Cluster Master (search, launcher, etc.)
–  All members of the cluster will receive the same apps
–  Cluster Master pushes apps and rolling-restarts peer nodes if needed
–  Cluster master repository: $SPLUNK_HOME/etc/master-apps
–  Destination directory on the indexers: $SPLUNK_HOME/etc/slave-apps

• You cannot use the Deployment Server to distribute apps to the peer nodes
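After staging apps in $SPLUNK_HOME/etc/master-apps/ on the cluster master, validate and push them; credentials are placeholders:

    splunk apply cluster-bundle --answer-yes -auth admin:changeme
    splunk show cluster-bundle-status -auth admin:changeme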

Deployment Apps – Best Practices
• Design your deployment app
–  An app is a set of deployment content (a configuration bundle)
–  An app is deployed as a unit and should be small
–  Take advantage of Splunk’s configuration layering
–  Use a naming convention for the apps
–  Create classes of apps, for example
ê  input apps
ê  index apps
ê  web control apps
• Carefully design apps regardless of your configuration Best Practice
management tool (DS, cluster master, Puppet, etc.)
Module 6 Lab Exercise
Time: 15 minutes
• Task: Analyze an app
–  Remember that some Splunkbase apps may need to be repackaged into
deployment apps for your environment
• Task: Design a test environment
–  How many servers are needed for testing?

• How will you test the deployment, including the apps?


• Challenge
–  Update the topology to reflect Phase 2 of the deployment

Module 7:
Data Comprehension

Module Objectives
• Identify the six things that you must get right at index time
• Review knowledge objects
• Review the Common Information Model (CIM) and data normalization
• Discuss data model design
(Diagram: the six deployment methodology phases – Infrastructure, Collection, Comprehension, Querying, Operation, Integration.)

Laying the Foundation During Index Time
Six things you must get correct at index time:

Setting                                    Input phase                        Parsing phase (index time)
Date / timestamp extraction                No                                 Yes
Event boundary                             No                                 Yes
host, sourcetype, source, index            Yes, optimally set during input    Yes, can override the input setting

This metadata cannot be changed after the events are written to the index.
docs.splunk.com/Documentation/Splunk/latest/Admin/Configurationparametersandthedatapipeline
Data Comprehension Tasks
• Oversight of knowledge object creation and usage across teams,
departments, and deployments
• Normalization of event data
• Creation of data models
• Management of shared knowledge objects
• Development of naming conventions

Data Enrichment – Splunk Knowledge
Data interpretation – Fields and field extractions
•  The first order of Splunk Enterprise knowledge
•  Automatic field extraction brings meaning to your raw data
•  Fields that are extracted manually expand and improve upon this layer of meaning
Data classification – Event types and transactions
•  Event types group sets of events that have common characteristics
•  Transactions are collections of conceptually-related events
Data enrichment – Lookups and workflow actions
•  Enrich your data with external resources
•  Add fields to data from external data sources
•  Workflow actions enable interactions with other applications or web resources

Data Enrichment – Splunk Knowledge (cont.)
Data normalization – Tags and aliases
•  Normalize your data
•  Group a set of related field values together
•  Apply aliases to fields with the same meaning but different names
•  Give extracted fields tags that reflect different aspects of their identity
Data models – Data models
•  Representations of one or more datasets
•  Drive the Pivot tool
•  Designed by knowledge managers who fully understand the format and semantics of their indexed data
•  Can be used with other knowledge objects, including lookups, transactions, search-time field extractions, and calculated fields

Lookups
• Four built-in types
–  File-based uses a CSV file stored in the lookups directory
ê  preferred for small tables that change infrequently
–  KV Store requires a collections.conf that defines the fields
ê  uses MongoDB to store the data
ê  can be used to store state information, and individual entries can be updated
ê  preferred for large tables or data that needs to be updated regularly
–  External uses a Python script or an executable in the bin directory
–  Geospatial uses a KMZ file saved in the lookups directory to support the choropleth visualization
• Splunk DB Connect
–  Can look up data in a relational database
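A minimal file-based lookup sketch; the lookup, file, field, and sourcetype names are illustrative:

    # transforms.conf – define the lookup table
    [http_status]
    filename = http_status.csv

    # props.conf – apply it automatically at search time for a sourcetype
    [access_combined]
    LOOKUP-http_status = http_status status OUTPUT status_description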
Data Enrichment – Splunk DB Connect
• Provides a way to enrich and combine event data with database data
–  Easily configure database queries using the Splunk user interface
–  Create lookups based on database query results
–  Import and index data already stored in your database
(Diagram: Splunk DB Connect uses a Java bridge server and JDBC to reach Oracle, MS SQL Server, and other databases for database connections, lookups, connection pooling, and queries.)
docs.splunk.com/Documentation/DBX/latest/DeployDBX/AboutSplunkDBConnect

Normalizing Data
• Events from different products and vendors have unique formats and fields
–  Even when the events are semantically equivalent

• Most traditional approaches to providing a unified view are based on


normalizing the data into a common schema at input time
• Splunk maps data to a common set of field names and tags at search time
–  These can be defined at any time after the data has been captured, indexed, and
available for ad hoc search
–  This improves flexibility by normalizing with late-binding searches instead of index-
time parsing
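For example, search-time field aliases in props.conf can map two hypothetical vendor formats onto common field names; the sourcetypes and field names below are illustrative:

    [vendor_a:firewall]
    FIELDALIAS-normalize = source_address AS src destination_address AS dest

    [vendor_b:firewall]
    FIELDALIAS-normalize = s_ip AS src d_ip AS dest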

Data Normalization – Example
This example illustrates normalized field names across two separate data sources.
(Diagram: events from Data Source 1 and Data Source 2 mapped to a common set of normalized field names.)

Common Information Model
• The Splunk Common Information Model (CIM)
–  Defines relationships in the underlying data using tags and fields
–  Provides a standard method of parsing, categorizing, and normalizing event data
–  Leaves the raw machine data intact
–  Defines the least common denominator for a domain of interest

• Applications that use the CIM will integrate more easily


• Splunk CIM is implemented as a free add-on
–  Includes data models
–  Has domain coverage for security, inventory, performance, networking and more
–  docs.splunk.com/Documentation/CIM/latest/User/Overview

CIM Version For Apps

• Apps that use the CIM will indicate


it on the sidebar
–  Check the version to ensure that
apps will integrate easily

Data Models
A data model provides a hierarchical mapping of semantic knowledge about one or more datasets.
(Diagram: an end user works in the Pivot editor against a data model with objects such as customer, order, shipment, and transaction; the model maps those objects to the indexes and to knowledge such as sourcetype, host, source, fields, tags, and eventtypes.)

docs.splunk.com/Documentation/Splunk/latest/Knowledge/Aboutdatamodels
Data Model Design
• Good data model design requires
–  An understanding of the data indexed in Splunk
ê  indexes, fields, sourcetypes, etc.
ê  understanding of the data content
–  Appropriate fields and knowledge objects to enrich the data
–  An understanding of the user needs
ê  what are the questions that the user wants to answer?
ê  what are the parameters of the model – from the user or business point of view?
• Simply exposing the existing sourcetypes and fields to the user is generally
not sufficient

Module 7 Lab Exercise
Time: 15 minutes
• Task: Identify sourcetypes
–  Which inputs have pre-trained sourcetypes?
–  Which inputs are defined by apps?

• Challenge
–  Examine fields in the CIM for network traffic

Module 8:
Search Considerations

Module Objectives
• Understand MapReduce
• Discuss search performance
• Understand how scheduled reports are dispatched
• Discuss differences between data summary methods
(Diagram: the six deployment methodology phases – Infrastructure, Collection, Comprehension, Querying, Operation, Integration.)

Components – MapReduce
• A framework for processing parallelizable problems across huge datasets using a large number of computers (nodes) in a cluster
en.wikipedia.org/wiki/MapReduce
• Splunk search uses this technology component
(Diagram: work is mapped across nodes 1..n and the partial results are reduced; source: Wikipedia.)
Performance Considerations for MapReduce
• Which searches MapReduce well? Which do not?
–  This is determined by how much of the work can be done on the indexers vs. how much manipulation of the search results must be done by the search head
• Event searches and non-distributable commands
–  Indexers retrieve events in parallel and return them to the search head
–  Efficient for event searches
–  For other searches, additional steps, if needed, must be done on the search head
–  CPU and memory bottlenecks can occur on the search head
• Reporting searches
–  Indexers can perform event retrieval and additional steps in parallel
–  Intermediate results are returned to the search head
–  The search head combines and presents the results
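Illustrative searches (index, sourcetype, and field names are placeholders): the first distributes well because stats pre-aggregates on the indexers; the second pushes most of the work to the search head because transaction is not distributable:

    index=web sourcetype=access_combined | stats count by status

    index=web sourcetype=access_combined | transaction JSESSIONID | stats avg(duration)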
Questions for Search Types and Volume
What types of searches do you expect?
•  Free-text search
•  Search by fields
•  Complex correlations and transactional searches
What types of alerts do you expect?
•  Alerts and notifications of specific events
•  Alerts and notifications of thresholds reached or changed
•  Alerts and notifications of complex thresholds
Reporting
•  How frequently will the reports run?
•  How current must the report data be? Near real time, to an hour, to the previous day?
•  Will historical or retroactive reports be required, or only current reports?
•  How will the reports be delivered to users? Via email, web interface, or other method?
•  Are there users who do not have access to the raw data who must see the reports?
Raw Event Searches
• These search queries contain only the search command
–  The results are usually a list of raw events
–  Usually categorized as super-sparse or rare type searches
• A typical use case would be the deep analysis of a specific problem or incident
• Examples include:
–  Checking error codes
–  Correlating events
–  Investigating security issues
–  Analyzing failures
(Diagram: the indexers return raw events to the search head.)

Report Generating Searches
• These search queries consist of an initial search followed by some statistical calculation
–  The result is usually a table, a graph, or a single value
–  Often categorized as dense or sparse type searches
• They always require fields and at least one statistical command
• Examples include:
–  Compiling a daily count of error events
–  Counting the number of times a specific user has logged in
–  Calculating the 95th percentile of field values
(Diagram: the indexers return intermediate results to the search head.)

Search Performance – Types of Searches
Dense: a large percentage of the data matches the search; up to 50K matching events per second; CPU bound
Sparse: a small percentage of the data matches the search; up to 5K matching events per second; CPU bound
Super-sparse: a "needle in a haystack" search; the indexer must check all buckets to find results, which is time consuming for large numbers of buckets; up to 2 seconds per bucket; primarily I/O bound
Rare: similar to super-sparse searches, but bloom filters eliminate buckets that don't include search results; 10 – 15 buckets per second; primarily I/O bound

Improving Search Performance
• Best practice
–  Make sure disk I/O is as good as you can get
–  Increase CPU hardware only if needed

• Most search performance issues can be addressed by adding additional


search peers (indexers)
• Look at resource consumption on both the indexer tier and search head tier
to diagnose slow searches

Improving Performance with limits.conf
• If you have unused CPU/memory resources, you can set multiple search pipelines
• Indexed real-time significantly improves performance if there are many real-time searches
–  Normally, real-time searches read the indexing queue
–  Indexed real-time causes the indexer to read the indexes on disk and collect events as they arrive in the hot buckets

limits.conf on the indexer:
[search]
batch_search_max_pipeline = 2

[realtime]
indexed_realtime_use_by_default = true
indexed_realtime_disk_sync_delay = 60
indexed_realtime_default_span = 1
indexed_realtime_maximum_span = 0
Knowledge Bundles
• When initiating a distributed search, the search head replicates its knowledge objects (KOs) to its search peers in the form of a knowledge bundle
–  Therefore, indexers may receive nearly the entire contents of all the search head's apps
• If an app contains large files that do not need to be shared with the indexers, use the [replicationSettings:refineConf] stanza in distsearch.conf to reduce the size of the bundle
• But be careful not to eliminate needed KOs!
(Diagram: the search head distributes knowledge bundles to its indexers.)
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Limittheknowledgebundlesize
Splunk Report Scheduler Behavior
• Splunk runs as many concurrent scheduled jobs as it can, based on limits.conf
–  On a 16-CPU search head, the default is 11 concurrent scheduled searches
• In this example, there are 4 concurrent scheduled search slots
–  Jobs S1, S2, S3, and S4 take the first 4 slots
–  Job S5 must wait until another job finishes (S4)
(Diagram: a timeline with 4 search slots; S1 through S4 run immediately, while S5 is deferred until a slot frees up.)

How a scheduled job gets skipped
(Diagram: two scheduling windows with 4 search slots; the legend marks runs, deferrals, and skips. S5 is deferred repeatedly but eventually runs in Window 1; in Window 2 it never gets a slot and is skipped.)
• Every scheduled job has a window – a time period in which it should run
• A job is deferred if it can't start when scheduled
–  A deferred job is retried repeatedly for the duration of its window
–  Example: Job S5 is deferred 4 times, but then runs during Window 1
• A job is skipped if it cannot start within its window
–  Example: Job S5 never runs during Window 2, and so is skipped
Report Scheduler Priorities
• The scheduler calculates priorities so that a skipped job will have a higher
priority in its next scheduled window
• To identify if Splunk has skipped searches, run the following
index=_internal source=*scheduler.log status!=success
| stats count by user app savedsearch_name status

https://wiki.splunk.com/Community:TroubleshootingSearchQuotas

Priority = Next_runtime
+ Avg_runtime x priority_runtime_factor
- skipped_count x period x priority_skipped_factor
+ window_adjustment

Summarization and Time Value of Data
• The value of data can be broken into two curves
–  The value of an individual data item
–  The aggregate data value
• In most cases, the value of an individual data item will decline over time, while the aggregate data value will increase
(Chart: data value versus age of data – the individual data item curve falls over time while the aggregate data value curve rises.)
Acceleration
Report Acceleration
•  Accelerates individual reports
•  Uses automatically-created summaries to speed completion times for qualified reports
•  Easier to create than summary indexes and backfills automatically
•  Data is stored on the indexers, alongside the related buckets
•  Depending on the defined time span, periodically ages out data
•  Always ages data when the corresponding bucket ages out
•  Cannot create a data cube and report on smaller subsets

Summary Indexing
•  Accelerates reports that don't qualify for report acceleration
•  Uses manually created summary indexes that exist separate from the main indexes
•  Data is stored on the search head, but can be forwarded to the indexer tier
•  Can persist after events have been frozen by controlling retention period or index size
•  Backfill is a manual (scripted) process

Data Model Acceleration
•  Accelerates all of the fields defined in a data model
•  Uses automatically-created summaries to speed completion times for pivots
•  Takes the form of time-series index (.tsidx) files, which are created on the indexers

How Acceleration Works
• Report acceleration and summary indexing speed up individual searches,
on a report-by-report basis
–  This is accomplished by building collections of pre-computed search result
aggregates
• Data model acceleration speeds up reporting for the specific set of attributes
(fields) that you define in a data model
–  Creates summaries for the specific set of fields, accelerating the dataset
represented by that collection of fields rather than a particular search

Data Model Acceleration
• Used to speed up retrieval of the events that underlie a data model
• Speeds up reporting for the entire set of attributes (fields)
• Updated every five minutes
• Affects only event object hierarchies
–  Hierarchies based on search or transaction objects cannot be accelerated

• Most efficient if the root event objects include the index(es) in their
initial constraint search
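Acceleration is normally enabled per data model in Splunk Web; a roughly equivalent datamodels.conf sketch, where the stanza name and summary range are illustrative:

    [My_Data_Model]
    acceleration = true
    # summary range: keep the last month of summarized data
    acceleration.earliest_time = -1mon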

Data Model Acceleration
• The High Performance Analytics Store
–  Consists of time-series index files, with the .tsidx file extension
–  Exists on the indexer tier, parallel to the buckets that contain the events
referenced in the .tsidx files
–  Spans a "summary range," which is the time range selected for acceleration

• Location and size can be set in indexes.conf


–  default location is
$SPLUNK_HOME/var/lib/splunk/indexName/data_model_summary

Disk Usage: Data Model Acceleration
• Estimated storage for data model acceleration per year = data volume per day × 3.4
–  Example: if indexing 250 GB/day, the expected storage needed for acceleration is 250 × 3.4 = 850 GB across all indexers
• To see the size on disk of a particular data model acceleration,
go to the Data Models page and click Acceleration

Report Acceleration
• Report Acceleration
–  Report results are "pre-computed" at regular intervals and stored on the
indexer tier
–  All reports that use a similar set of data and computations automatically use
the same report acceleration summaries if they can
• Report Acceleration Summaries
–  Span a time range selected for acceleration
–  Update every 10 minutes by default
–  Are automatically rebuilt if the data in the underlying bucket changes
docs.splunk.com/Documentation/Splunk/latest/Knowledge/
Manageacceleratedsearchsummaries
Report Acceleration
• Requirements for acceleration
–  The report was not created via Pivot
–  The underlying search qualifies for acceleration

• Splunk typically won't generate a summary if:


–  There are fewer than 100K events in the summary range
ê  Faster to execute the search without a summary
–  Summary size is projected to be too large
ê  Faster to execute the search using the normal index because it is smaller
• If a summary is defined but not created for the above reasons
–  Splunk continues to check periodically
–  automatically creates a summary when/if the search meets the requirements
Disk Sizing: Report Acceleration
• Report Acceleration Summaries
–  Exist on the indexer tier, in parallel to the buckets that contain the events

• Location and maximum size can be set in indexes.conf


–  Default location is
$SPLUNK_HOME/var/lib/splunk/indexName/summary
–  Location is specified with summaryHomePath

• Can be examined, managed and deleted from Settings > Report


Acceleration Summaries

Summary indexing overview
• Efficiently report on large volumes of data
• Spread the cost of a computationally expensive report over time
–  Run reports over long time ranges for large datasets more efficiently
ê  Example: Show the number of page views for each of your web sites over the past
quarter, broken out by site
• Retain summarized data after original data sources have been frozen
–  Example: If you can only maintain original firewall logs for 30 days, but want
reports over a year

How summary indexing works
• Data is collected and summarized
–  A populating search is scheduled to run periodically
–  The populating search stores its results in a summary index

• Reports are run against the summary index


–  Searches run faster over the smaller summary index
–  The summary index may be used to report when data ages out of the original
index

Summary indexing flow
Populating search "purchasedProducts": sourcetype=access_combined | sistats count by product_name
Run every hour, over a 1-hour time range.
(Diagram: the populating search results are formatted and stored as statistics in the summary index; reports then run against the summary index.)

Summary Indexing
• Requires manual management and knowledgeable monitoring
–  Manual rebuild of summary statistics needed
ê  If populating search fails to run on schedule
ê  If data arrives late
ê  If knowledge objects change
• Summary index configuration
–  Must be defined in indexes.conf, just like a regular index
–  Retention and size are controlled in indexes.conf
–  Do not consume license (summary indexing is free)
–  Summary indexes are usually small, but allocate space on the indexer tier
ê  monitor the space used like any other index
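A summary index definition in indexes.conf looks like any other index; the name, paths, and retention below are illustrative:

    [summary_web]
    homePath   = $SPLUNK_DB/summary_web/db
    coldPath   = $SPLUNK_DB/summary_web/colddb
    thawedPath = $SPLUNK_DB/summary_web/thaweddb
    # keep summarized data for roughly two years
    frozenTimePeriodInSecs = 63072000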
Forwarding Summary Indexes
• Summary indexes are created by default on the search head
• If you use summary indexes, forward your summary indexes to the
indexing layer
–  This allows all members to access them
ê  otherwise, they're only available on the search head that generates them
–  This is required for search head clustering

• Configure the search head as a forwarder


http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Forwardsearchheaddata
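A minimal outputs.conf sketch on the search head, following the documentation linked above; indexer names and ports are placeholders:

    [indexAndForward]
    index = false

    [tcpout]
    defaultGroup = primary_indexers
    forwardedindex.filter.disable = true
    indexAndForward = false

    [tcpout:primary_indexers]
    server = idx1.example.com:9997, idx2.example.com:9997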

Best Practice
Module 8 Lab Exercise
Time: 10 minutes
• Review the list of searches. Could any searches benefit from acceleration?
• What additional disk space considerations have you uncovered?

Module 9:
Integration

Module Objectives
• Describe integration methods
• Identify common integration points
(Diagram: the six deployment methodology phases – Infrastructure, Collection, Comprehension, Querying, Operation, Integration.)

Types of Integration
• Accessing other data or applications from within Splunk
• Accessing Splunk from another application
• Re-forwarding data to other applications after it is indexed in Splunk
• Integration with Hadoop – Hunk

Note that apps may also provide convenient integration points

Search Extensibility Review
• Custom search commands
–  Write new search commands in Python 2.7
–  http://dev.splunk.com/view/python-sdk/SP-CAAAEU2
• Scripted lookups
–  Programmatically script lookups in Python
• Workflow actions and custom navigation
–  Access additional resources outside Splunk with a single click
(Diagram: the Splunk developer platform – the REST API and SDKs for Java, JavaScript, Python, Ruby, C#, and PHP to extend and integrate Splunk; data models, search extensibility, modular inputs, Simple XML, and the Web Framework to build Splunk apps.)

Integration Using Alerts
• Custom alert actions
–  allow developers to extend Splunk to build, package
and publish new alert actions for Splunk
–  Some modular alerts have been published on
Splunkbase (for example, Twilio SMS Alerting)
–  http://docs.splunk.com/Documentation/Splunk/latest/
Alert/CreateCustomAlerts
–  docs.splunk.com/Documentation/Splunk/latest/
AdvancedDev/ModAlertsIntro

Splunk REST API
• REST stands for REpresentational State Transfer
• The most basic way to programmatically access Splunk is to use the REST API via HTTP requests
• With over 200 endpoints, the REST API lets you interact directly with a Splunk instance
• For more information: http://dev.splunk.com/restapi
(Diagram: the REST API underpins the SDKs and the rest of the Splunk developer platform.)
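An illustrative REST call that runs a one-shot search; the host, credentials, and search string are placeholders:

    curl -k -u admin:changeme https://splunk.example.com:8089/services/search/jobs \
         -d search="search index=_internal | head 5" -d exec_mode=oneshot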

Software Development Kits (SDKs)
• SDKs provide language-specific libraries for accessing the Splunk REST API
–  Include documentation, code samples, resources, and tools
–  Simplify code development for
ê  Python
ê  JavaScript
ê  Java
ê  PHP
ê  Ruby
ê  C#
http://dev.splunk.com/sdks
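A minimal sketch using the Python SDK (splunklib); the host and credentials are placeholders:

    import splunklib.client as client
    import splunklib.results as results

    # connect to the management port and run a blocking one-shot search
    service = client.connect(host="splunk.example.com", port=8089,
                             username="admin", password="changeme")
    oneshot = service.jobs.oneshot("search index=_internal | head 5")
    for row in results.ResultsReader(oneshot):
        print(row)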
Sending Data to Other Systems
• Forward all or a subset of data via TCP to other systems (see the outputs.conf sketch below)
–  Forward either raw text or syslog
–  Can be done centrally via the indexer
–  Does not increase license usage

• Use scheduled searches to insert events into other systems
–  Leverages alert functionality
–  For example, use correlation searches and configure the alerts to open tickets or insert events into a SIEM

[Diagram: forwarders send data to the indexer, which re-forwards copies to a SIEM and an event console; alerts from the search head open tickets in a service desk]
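For illustration, re-forwarding is configured in outputs.conf on the indexers; the group names, hosts, and ports below are placeholders, and selecting which events go to each destination typically also involves props.conf/transforms.conf routing (_TCP_ROUTING / _SYSLOG_ROUTING):

# outputs.conf on the indexers (sketch)

# Send a copy of selected events as syslog to a SIEM
[syslog:siem_feed]
server = siem.example.com:514
type = tcp

# Send raw (uncooked) TCP to a third-party event console
[tcpout:event_console]
server = console.example.com:9997
sendCookedData = false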
Hunk – Splunk for Hadoop
• Hunk provides access to data stored in a
Hadoop Distributed File System (HDFS)
• Provides the familiar Splunk user interface
• Manages the MapReduce access to
HDFS
• Accesses both Splunk indexes and HDFS
from a single Splunk search head
–  Requires both a Hunk and a Splunk
Enterprise license
–  Available since Splunk 6.3

Module 10:
Operations and Management

Module Objectives
• Identify ongoing tasks in a Splunk deployment
• Define monitoring tools
• Identify backup and archiving methods
• Discuss onboarding processes

Operations and Management
• Operations
–  Monitoring
–  Backups and archiving

• Deployment and Distributed Management


• Change Control
–  Onboarding new data sources
–  Onboarding new users
–  Operations review

Distributed Management Console (DMC)
• Splunk environment monitor and
diagnostic tool
–  Only admins can access
–  Investigate server performance
and resource usage
–  Uses the _introspection index
populated by
introspection_generator_addon
–  Provides configurable alerts
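For example, the resource usage data behind the DMC can also be queried directly from the _introspection index; a search along these lines (field names per the introspection add-on; the host value is a placeholder) trends host-wide CPU usage:

index=_introspection sourcetype=splunk_resource_usage component=Hostwide host=idx01
| timechart span=5m avg(data.cpu_system_pct) AS system_cpu avg(data.cpu_user_pct) AS user_cpu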

Monitoring Splunk: S.o.S app
• The Splunk on Splunk (S.o.S) app
is another tool for monitoring or
troubleshooting
–  View, search and compare configs
–  Detect errors and inspect crash logs
–  Measure indexing performance & event
processing
–  View details of scheduler and user activity
–  Gradually being replaced by DMC

• Note that S.o.S does consume Splunk license volume (its scripted inputs index data)


• Prefer DMC when possible
Operations – Backups and Archiving
• Backups and archiving are discussed in Splunk Administration and in
docs.splunk.com/Documentation/Splunk/latest/Indexer/Backupindexeddata
• Choose a data backup strategy based on your tolerance for data loss &
recovery effort
–  Incremental backup
–  Full backup
–  Indexer replication

• Remember to back up configuration files across the environment, including


–  Search heads
–  Indexers and cluster master
–  Deployer, deployment server, license master and any other Splunk instances
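As a simple illustration of the configuration backup (not a built-in Splunk feature; the paths and destination are placeholders), a scheduled script on each instance can archive $SPLUNK_HOME/etc:

# Sketch: nightly archive of $SPLUNK_HOME/etc (apps, users, system config)
import os
import socket
import tarfile
import time

splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
dest_dir = "/backups/splunk"     # e.g. an NFS share picked up by enterprise backup
stamp = time.strftime("%Y%m%d-%H%M%S")
archive = os.path.join(dest_dir, "%s-etc-%s.tar.gz" % (socket.gethostname(), stamp))

with tarfile.open(archive, "w:gz") as tar:
    tar.add(os.path.join(splunk_home, "etc"), arcname="etc")
print("wrote %s" % archive)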
Archiving Data to Hadoop
• If you have both a Hunk license and a Splunk Enterprise license, you can
use Hadoop to archive buckets
–  When buckets move from cold to frozen, the raw data will be moved into Hadoop

http://docs.splunk.com/Documentation/Hunk/latest/Hunk/Setanarchivescript

• Once in Hadoop, the data can still be searched using Hunk
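Hunk's native bucket archiving is configured as described in the documentation above. As an alternative sketch of the same idea using the standard Splunk Enterprise hook, a coldToFrozenScript set per index in indexes.conf could copy frozen buckets into HDFS (the HDFS path and the use of the hadoop CLI on the indexer are assumptions):

# Sketch of a generic coldToFrozenScript; indexes.conf would reference it as
# coldToFrozenScript = "$SPLUNK_HOME/bin/python" "$SPLUNK_HOME/bin/copy_to_hdfs.py"
# Splunk passes the bucket directory as the first argument and removes the
# bucket only after the script succeeds.
import os
import subprocess
import sys

HDFS_ARCHIVE = "/splunk/frozen"      # hypothetical archive root in HDFS

def main():
    bucket_dir = sys.argv[1]                                           # .../<index>/colddb/db_...
    index_name = os.path.basename(os.path.dirname(os.path.dirname(bucket_dir)))
    target = "%s/%s/%s" % (HDFS_ARCHIVE, index_name, os.path.basename(bucket_dir))
    subprocess.check_call(["hadoop", "fs", "-mkdir", "-p", os.path.dirname(target)])
    subprocess.check_call(["hadoop", "fs", "-put", bucket_dir, target])  # non-zero exit leaves the bucket in place

if __name__ == "__main__":
    main()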

Onboarding New Data Inputs
• Use the Deployment Methodology as a guide
• What process does a user follow to index new data into Splunk?
• Once data is indexed, what search time changes need to be made?
Are there new knowledge objects that should be created?
• Who determines the ownership of data?
• How are reports created to support the new data?

Onboarding New Users
• How is a user or user group given access to Splunk?
• What are the default apps?
• How is access to data controlled? What data do new users see?
• What training do new users need or receive?
• Are there limits on creating objects (especially scheduled searches)? Any quotas?
• Progressive privileges
–  As users are trained or gain advanced skills, give them additional privileges
ê  real time search, create dashboards, etc.
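One way to implement progressive privileges is with custom roles in authorize.conf; the role names, index names, and quotas below are examples only:

# authorize.conf (sketch): a restricted starter role plus a follow-on role
# that adds privileges as users are trained

[role_splunk_basic]
search = enabled
srchIndexesAllowed = main;web
srchIndexesDefault = main
srchJobsQuota = 3
srchDiskQuota = 100

[role_splunk_power]
importRoles = splunk_basic
rtsearch = enabled
schedule_search = enabled
srchJobsQuota = 6
srchDiskQuota = 500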

Operations Review
• Periodically review scheduled searches
–  Monitor jobs & review search logs to identify the expensive searches (see the example search below)
–  Check for inefficiencies and redundancies

• Periodically review acceleration summaries


–  Consider removing summaries with low usage and large size or heavy load

• Review disk consumption


• Perform a security audit on a regular basis
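A starting point for the scheduled-search review is the scheduler log in _internal; for example (field names per scheduler.log), list the searches with the longest average run time:

index=_internal sourcetype=scheduler status=success
| stats count AS executions avg(run_time) AS avg_runtime max(run_time) AS max_runtime BY app savedsearch_name
| sort - avg_runtime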

Operations – Security
• Check Splunk Security Portal frequently
–  www.splunk.com/page/securityportal

• An important resource is the Securing Splunk manual
–  docs.splunk.com/Documentation/Splunk/latest/Security/WhatyoucansecurewithSplunk

Module 10 Lab Exercise
Time: 5 minutes
• Email your lab document(s) to your instructor

Resources and Additional Study
• Splunk documentation for architects
–  Installation Manual (for hardware and capacity planning information)
–  Distributed Deployment Manual
–  Capacity Planning Manual
–  Other manuals for specific topics
• Splunk Answers answers.splunk.com
• Splunk's public wiki (watch dates for outdated topics)
www.splunk.com/wiki/Deploy:Deployment_topics
–  Deployment scenarios, performance info and other useful topics

Support Options
• Community Support - Documentation, the Community Wiki,
Splunk Answers, Splunk Blogs, and IRC are available to anyone for
answers to basic technical questions. These resources are always available
to provide you with an immediate answer
• Enterprise Support - Direct access to our Customer Support team by
phone and the ability to manage your cases online
• Global Support - 24x7x365 support for critical issues, a dedicated
resource to manage your account and quarterly review of your
deployments

.conf2016: The 7th Annual
Splunk Worldwide Users’ Conference

•  September 26-29, 2016
•  The Disney Swan and Dolphin, Orlando
•  4400+ IT & Business Professionals
•  3 days of technical content
•  175+ sessions
•  3 days of Splunk University
–  Sept 24-26, 2016
–  Get Splunk Certified!
–  Get CPE credits for CISSP, CAP, SSCP
•  80+ Customer Speakers
•  40+ Apps in Splunk Apps Showcase
•  70+ Technology Partners
•  1:1 networking: Ask The Experts and Security Experts, Birds of a Feather and Chalk Talks
•  NEW hands-on labs!
•  Expanded show floor, Dashboards Control Room & Clinic, and MORE!

Visit conf.splunk.com for more information
Thank You
Complete the Class Evaluation to be in this month's drawing for a $100 Splunk
Store voucher
1.  Look for the invitation email, What did you think of your Splunk Education class, in your inbox
2.  Click the link or go to the specified URL in the email

Appendix:
Sample Topologies

Deployment – Centralized Topology
[Diagram: forwarders auto-load balance across the indexers (configured in outputs.conf on the forwarders); a search head cluster searches the indexers; syslog devices send to an intermediate forwarder, which forwards on to the indexers]
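The auto-load balancing shown here is configured in outputs.conf on each forwarder; a minimal sketch with placeholder host names (listing multiple indexers in one tcpout group enables automatic load balancing):

# outputs.conf on every forwarder (sketch)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997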

Deployment – Decentralized Topology
[Diagram: several independent deployments, each with its own forwarders, indexers, and a search head or search head cluster; network devices send to an intermediate forwarder feeding one of the deployments]

Deployment – Hybrid Topology
[Diagram: a combination of the centralized and decentralized topologies; individual sites keep local search heads and indexers, while a central search head cluster also searches across the site indexers; network devices send via an intermediate forwarder]

