
Tips for optimizing queries in Splunk for better performance:

- Optimizing queries in Splunk’s Search Processing Language is similar to optimizing queries in SQL. The two core tenets
are the same:
1. Change the physics (do something different)
2. Reduce the amount of work done (optimize the pipeline)
In a distributed environment such as Splunk, Hadoop, Elasticsearch, etc. you can add two more:
3. Distribute the work as much as possible
4. Reduce the amount of data being moved

- Time Range
In Splunk, data is organized by time into buckets. Reducing the time span being searched directly reduces the number of
buckets that need to be processed. So the first task when optimizing a server is to look for searches that are not limited by
time. This alone can result in an improvement of 30x to 365x.
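As a sketch, the time window can be pinned directly in the search with the earliest and latest modifiers (the index name here is illustrative):

```spl
index=web_logs earliest=-24h latest=now error
```

This restricts the search to the last 24 hours, so Splunk only opens buckets whose time range overlaps that window.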

- Indexes
Indexed fields in Splunk control where the data will be physically stored on the disk. So just like searches without a time
range, searches without an “index=” clause will require physically reading far more files than you may actually need.
Correcting this typically gives a 2x to 10x improvement.
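A minimal example of a scoped search, assuming a hypothetical index named web:

```spl
index=web error sourcetype=access_combined
```

Compared with the unscoped `error sourcetype=access_combined`, this reads only the buckets belonging to the web index rather than every index the user's role can search.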

- Search Modes
There are three search modes in Splunk: smart, fast, and verbose. Verbose mode pulls back far more data than the other
modes, usually resulting in a 2x to 5x penalty. So only use it when you need to diagnose a query.

- Inclusionary Search Terms
Indexes in Splunk are designed to work best with inclusive filters. Say, for example, you have a field that can only be A, B,
C, or D. You can see a significant improvement if you convert the exclusive filter “not (field = D)” into an inclusive filter such
as “(field = A) OR (field = B) OR (field = C)”.
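In SPL, that rewrite might look like the following, where the first search uses the exclusive filter and the second the inclusive one (the index and field names are illustrative):

```spl
index=app NOT status=D
index=app (status=A OR status=B OR status=C)
```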

- Indexed Extractions
Because it works on unstructured data, Splunk does a lot of work with regular expressions. Known as “extractions”, this can
be done during the search but it is expensive. So consider indexed extracted fields just as you would index a computed
column in a relational database.
Whether extraction happens at index time or search time, there are ways to reduce the cost of regular expression processing:
- Backtracking is very expensive
- Prefer + to * because the zero part of “zero or more” can lead to backtracking
- When fields appear together and are used together, extract multiple fields with one expression
- Keep your regular expressions as simple as possible
Since those last two recommendations are often in conflict, you should test both ways.
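For instance, a single rex call can extract several fields in one pass over the event (the field names and log layout here are assumptions for illustration):

```spl
... | rex field=_raw "user=(?<user>\w+)\s+action=(?<action>\w+)\s+status=(?<status>\d+)"
```

This avoids running three separate extractions, each of which would scan _raw again.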

- Avoid Joins
Joins in Splunk are incredibly expensive. They often involve a subsearch that brings back all of the data from the
indexers to the search head before filtering, which multiplies both network traffic and memory use.
Usually you can replace the join with a “stats values(…)” clause that eagerly filters the data, but those techniques are beyond
the scope of this article.
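As a sketch of the pattern (the index, sourcetypes, and field names are all illustrative), a join such as:

```spl
index=web sourcetype=orders
| join user [ search index=web sourcetype=logins | fields user, last_login ]
```

can often be rewritten so the indexers do the heavy lifting:

```spl
index=web (sourcetype=orders OR sourcetype=logins)
| stats values(last_login) as last_login, values(order_id) as orders by user
```

The stats version searches both sourcetypes in one pass and correlates them by user without a subsearch.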