Splunk Notes
Definitions
Host
It's the name of the physical or virtual device
where an event originates. It can be used to find all
data originating from a specific device.
Source
It's the name of the file, directory, data stream, or
other input from which a particular event
originates.
Sources are classified into source types, which
can be either well known formats or formats
defined by the user. Some common source types
are HTTP web server logs and Windows event
logs.
Tags
A tag is a knowledge object that enables you
to search for events that contain particular field
values. You can assign one or more tags to any
field/value combination, including event types,
hosts, sources, and source types. Use tags to
group related field values together, or to track
abstract field values such as IP addresses or ID
numbers by giving them more descriptive names.
Indexes
When data is added, Splunk software parses
the data into individual events, extracts the
timestamp, applies line-breaking rules, and
stores the events in an index. You can create new
indexes for different inputs. By default, data is
stored in the “main” index. Events are retrieved
from one or more indexes during a search.
Splunk Forwarder
A Splunk instance that forwards data to another
Splunk instance is referred to as a forwarder.
Splunk Indexer
An indexer is the Splunk instance that indexes
data.
The indexer transforms the raw data into events
and stores the events into an index. The indexer
also searches the indexed data in response to
search requests. The search peers are indexers
that
fulfill search requests from the search head.
Log Monitoring
Continuous Log Monitoring From a Log File
We can set up Splunk to import logs from a specific
log file and then keep monitoring that file so new
entries are imported continuously. An example
would be continuously importing web server logs
into Splunk from the web server's log file.
9- Click on New.
Operational Notes
It's very important to determine the source of
your data so that you can correctly pull the
logs/events. Specify this source as
index=name-of-the-datasource. If you don't
know it, you can search every index with
index=*, but this will return huge numbers of
events that you don't necessarily need to
parse.
When we are dealing with Windows hosts and
looking at processes starting, we can view
both Windows event logs and Sysmon for
more information.
Remember that when searching for
backslashes to add a second backslash,
because a single backslash is an escape
character and will cause your search to not run
correctly.
When we analyze a ransomware attack, one of
the questions we should ask is what
connections are the impacted systems making
to other systems? File shares can be leveraged
to inflict greater damage by encrypting shared
data thus increasing the impact of the
ransomware beyond the individual host.
Registry data from Windows systems as well
as Microsoft Sysmon will provide insight into
file shares.
Oftentimes we will need to determine which
destinations an infected system
communicated with. The first time and last
time it occurred are important pieces of
information in your investigation as well.
In a host-centric environment, we use
hostnames more frequently than IP addresses.
As we start looking at our events from a
network-centric approach, we need to be
aware that we may need to search by IP as
well.
IDS/IPS or malware signatures may already
exist for threats that we need to deal with.
Understanding which signatures fired tells us
when the threat was seen, where in the network
it was seen, what technology identified it, and
the nature of the threat.
Stream is a free app for Splunk that collects
wire data and can focus on a number of
different protocols, including but not limited to
SMTP, TCP, IP, and HTTP.
index=dataset domain.com
src=40.80.148.42 sourcetype=suricata
index=botsv1 sourcetype=suricata
(dest="ip1" OR dest_ip="ip2") .exe
index=botsv1 sourcetype=suricata
(dest=domain.com OR dest="ip1")
http.http_method=POST .exe
#TIP
Using the stats command, we can get a count for
each combination of signature and signature_id.
Because the signature and the id are likely unique,
that may not be necessary in this case, but it
provides a good way to see the description and the
associated ID of the signature accompanied by the
count.
Example
index=dataset sourcetype=suricata
alert.signature=*keyword* | stats
count by alert.signature
alert.signature_id | sort count
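To picture what stats count by alert.signature alert.signature_id followed by sort count does, here is a minimal Python sketch. It is not Splunk code; the helper name count_by and the sample events are made up for illustration.

```python
from collections import Counter

def count_by(events, *fields):
    """Count events per unique combination of fields, like `stats count by`."""
    counts = Counter(
        tuple(event.get(f) for f in fields)
        for event in events
        # stats leaves out events that lack any of the by-fields
        if all(f in event for f in fields)
    )
    # `sort count` orders the rows by ascending count
    return sorted(counts.items(), key=lambda item: item[1])

# Hypothetical sample events (not taken from a real dataset)
events = [
    {"alert.signature": "ET MALWARE Example A", "alert.signature_id": 1001},
    {"alert.signature": "ET MALWARE Example A", "alert.signature_id": 1001},
    {"alert.signature": "ET POLICY Example B", "alert.signature_id": 2002},
    {"alert.signature": "event without an id"},
]

rows = count_by(events, "alert.signature", "alert.signature_id")
for (signature, signature_id), count in rows:
    print(count, signature_id, signature)
```

Each output row pairs one signature/ID combination with its count, which is why the query is a convenient way to read the description and ID side by side.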
Scenario 2
This scenario focuses on the point we mentioned
earlier about extracting IDS alerts.
We can display all IDS alerts
index=dataset sourcetype=suricata
| stats values(alert.signature)
index=dataset sourcetype=suricata "alert.signature"="ET MALWARE Win32/Trickbot Data Exfiltration"
| table src, dest
index=dataset domain.com
sourcetype=stream:http
index=* dest=192.168.250.70
sourcetype=stream:http status=200 |
stats count by uri | sort - count
index=* sourcetype=iis
index=* sourcetype=stream:http
http_method=POST
form_data=*username*passwd*
| table form_data
Using regular expressions to display usernames
and passwords in http requests
index=* sourcetype=stream:http
form_data=*username*passwd*
| rex field=form_data "passwd=(?<userpassword>\w+)"
| head 10
| table userpassword
index=* sourcetype=stream:http
form_data=*username*passwd*
| rex field=form_data "passwd=(?<userpassword>\w+)"
| eval lenpword=len(userpassword)
| table userpassword lenpword
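The rex extraction and the len() call above can be tried outside Splunk with a small Python sketch. One difference to keep in mind: Python writes named groups as (?P<name>...), while rex uses (?<name>...). The sample form_data string is hypothetical.

```python
import re

# Python spells named groups (?P<name>...); SPL's rex uses (?<name>...)
PASSWD_RE = re.compile(r"passwd=(?P<userpassword>\w+)")

def extract_password(form_data):
    """Return (userpassword, length), or None when the pattern does not match."""
    match = PASSWD_RE.search(form_data)
    if match is None:
        return None
    userpassword = match.group("userpassword")
    return userpassword, len(userpassword)

# Hypothetical form_data value, shaped like a login POST body
sample = "username=admin&task=login&passwd=batman&return=index"
print(extract_password(sample))
```

Because \w+ stops at the first non-word character, the capture ends at the next & in the form data.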
Sorting by time
index=* sourcetype=stream:http
form_data=*username*passwd*
dest_ip=192.168.250.70
src=40.80.148.42
| rex field=form_data "passwd=(?<userpassword>\w+)"
| search userpassword=*
| table _time uri userpassword
[2]
index=* sourcetype=stream:http
| rex field=form_data "passwd=(?<userpassword>\w+)"
| search userpassword=batman
| transaction userpassword
| table duration
[2]
index=* sourcetype=stream:http
uri=*.exe | stats values(uri)
index="dataset" domain.com
sourcetype="stream:http"
http_method=GET
index="dataset" ip
sourcetype="stream:HTTP" NOT
(site=*.microsoft.com OR
site=*.bing.com OR site=*.windows.com
OR site=*.atwola.com OR site=*.msn.com
OR site=*.adnxs.com OR
site=*.symcd.com OR
site=*.folha.com.br)
| dedup site
| table site
Displaying URL values between a client and a
server. This is useful if you are looking for patterns
of web application attacks or OWASP attacks such
as SQL injection, IDOR, File inclusion or directory
traversal. All these attacks happen over URLs
index=dataset domain.com
sourcetype=stream* | stats
count(src_ip) as Requests by src_ip |
sort - Requests
index="dataset" host="MACLORY-AIR13"
(*.ppt OR *.pptx)
index=dataset name.exe sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
index=dataset name.exe CommandLine=name.exe
| stats values(MD5)
index=botsv1 hostname sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
index=dataset sourcetype=fgt_utm "192.168.250.70" category="Malicious Websites"
USB attacks
USB records can be found in the Windows registry.
You can also use some keywords if you have info
about the USB.
index=dataset sourcetype=winregistry
keyword
index=dataset sourcetype=winregistry
keyword | table host object data
index=dataset sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational host=targetpcname "d:\\"
| reverse
index=dataset sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational host=targetpcname (CommandLine="*d:\\*" OR ParentCommandLine="*d:\\*")
| table _time CommandLine ParentCommandLine
| sort _time
Finding the vendor name of a USB inserted into a
host
File sharing
File sharing events can be found in Sysmon
index=dataset sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational host=hostname
index=dataset sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational host=hostname src="filserver.com"
By adding the stats command we can quickly see
the number of network connections the host has
made.
index=dataset sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational host=hostname src="filserver.com"
| stats count by dest_ip
| sort - count
index=dataset sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" ip
Parsing DNS
Scenario[1]
It's useful to find command-and-control (C2)
domains when investigating a compromised
network or machine.
By using operators like AND, OR, and NOT to
narrow down search criteria, we can quickly
exclude domains that we don't have an interest in.
Keep in mind that AND is implied between conditions.
index=dataset sourcetype=stream:DNS src=192.168.250.100 record_type=A NOT (query{}=*.microsoft.com OR query{}=*.bing.com OR query{}=*.windows.com OR query{}=*.msftncsi.com)
| stats count by query{}
| sort 10 -count
By tabling the result set and then using the reverse
command we get the earliest time and query at the
top of the list.
index=dataset sourcetype=stream:DNS
src=192.168.250.100 record_type=A NOT
(query{}=*.microsoft.com OR
query{}=*.waynecorpinc.local OR
query{}=*.bing.com OR query{}=isatap
OR query{}=wpad OR
query{}=*.windows.com OR
query{}=*.msftncsi.com) | table _time
query{} src dest | reverse
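The NOT (... OR ...) exclusion pattern used in the DNS searches above can be sketched in Python with shell-style wildcards. This is only an illustration of the filtering logic, not Splunk code; the sample query names are invented, and lowercasing is used here to approximate case-insensitive matching.

```python
from fnmatch import fnmatchcase

# Patterns taken from the exclusion list in the search above
EXCLUDED = [
    "*.microsoft.com",
    "*.waynecorpinc.local",
    "*.bing.com",
    "isatap",
    "wpad",
    "*.windows.com",
    "*.msftncsi.com",
]

def is_interesting(query):
    """True when the DNS query matches none of the excluded patterns."""
    name = query.lower()
    return not any(fnmatchcase(name, pattern) for pattern in EXCLUDED)

# Hypothetical DNS query names
queries = ["www.bing.com", "wpad", "suspicious-example.test", "Update.Microsoft.com"]
remaining = [q for q in queries if is_interesting(q)]
print(remaining)
```

Note that a pattern such as *.microsoft.com does not match the bare name microsoft.com, so add the bare domain as well if it should be excluded.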
#TIP
To understand the relationship between processes
started and their parent/children, we can use the
table command to see the time along with the
process that was executed, its associated parent
process command, the process ID and parent
process ID. Using the reverse command moves the
first occurrence to the top. The IDs provide the
linkage between these different processes. While
Sysmon can show the immediate parent process
that spawned another process in a single event, it
cannot show an entire chain of processes being
created.
Scenario [2]
Sometimes we want statistical information about
DNS traffic. One such metric is the round
trip time, or RTT, which measures the time
from when the DNS query was sent until the
answer is received.
Say we want to calculate the average RTT of
queries sent to a specific destination IP
index=* sourcetype=dns
dest=10.10.10.10 | stats avg(rtt) as
avgrtt | eval avgrtt=round(avgrtt,5)
index=* sourcetype=dns
dest=10.10.10.10 | bucket _time
span=1m | stats avg(rtt) by _time
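As a rough picture of what bucket _time span=1m followed by stats avg(rtt) by _time computes, the Python sketch below groups events into one-minute buckets and averages the RTT per bucket. The event values are invented, and the RTT unit (microseconds here) is an assumption.

```python
from collections import defaultdict

def avg_rtt_per_minute(events, span=60):
    """Average rtt per time bucket; each _time is floored to the bucket start."""
    buckets = defaultdict(list)
    for event in events:
        bucket_start = event["_time"] - (event["_time"] % span)
        buckets[bucket_start].append(event["rtt"])
    return {
        start: round(sum(values) / len(values), 5)
        for start, values in sorted(buckets.items())
    }

# Hypothetical events: _time in epoch seconds, rtt assumed to be microseconds
events = [
    {"_time": 1000, "rtt": 100},
    {"_time": 1015, "rtt": 300},
    {"_time": 1025, "rtt": 500},
]

per_minute = avg_rtt_per_minute(events)
print(per_minute)
```

The first two events fall into the bucket starting at 960 and the third into the bucket starting at 1020, so a spike in one minute is not hidden by a quiet average over the whole search window.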
Email Activity
Examining smtp events with keywords such as
email address and domain name
index="dataset" sourcetype=stream:smtp
email@email.com domain.com
index="dataset"
sourcetype="stream:smtp" | table
mailfrom, subject
index="dataset"
sourcetype="stream:smtp"
mailfrom=test@test.com
FTP events
Looking through downloaded files
index="botsv2" sourcetype="stream:ftp"
method=RETR
| reverse
AWS Events
Listing out the IAM users that accessed an AWS
service (successfully or unsuccessfully)
index="dataset"
sourcetype="aws:cloudtrail" IAMUser
| dedup user
| table user
index="dataset" sourcetype="aws:cloudtrail" NOT tag::eventtype="authentication" "userIdentity.sessionContext.attributes.mfaAuthenticated"=false
Looking through events related to an S3 bucket
being made publicly accessible.
index="dataset"
sourcetype="aws:cloudtrail"
eventType=AwsApiCall
eventName=PutBucketAcl
| reverse
index="dataset" sourcetype="aws:s3:accesslogs" *PUT*
| reverse
index="dataset"
sourcetype="aws:cloudtrail"
user_type="IAMUser"
errorCode!="success"
eventSource="iam.amazonaws.com"
| stats dc(errorMessage) by
userIdentity.accessKeyId
sourcetype="aws:cloudtrail" userIdentity.accessKeyId="AKIAJOGCDXJ5NW5PXUPA" eventName="DescribeAccountAttributes"
[2]
index="dataset"
sourcetype="symantec:*" *coin*
O365 Events
Looking through file upload events to OneDrive
index="dataset"
sourcetype="ms:o365:management"
Workload=OneDrive
Operation=FileUploaded
| table _time src_ip user object
UserAgent
WIN event logs
Finding antivirus alerts. The example below uses
Symantec AntiVirus. Useful for finding malicious
executables.
index="dataset"
source="WinEventLog:Application"
SourceName="Symantec AntiVirus"
*Frothly*
index="dataset"
source="wineventlog:security"
EventCode=4720
index="dataset"
source="wineventlog:security" svcvnc
"EventCode=4732"
Listing URLs accessed through PowerShell
index="dataset" source="WinEventLog:Microsoft-Windows-PowerShell/Operational" Message="*/*"
| rex field=Message "\$t\=[\'\"](?<url>[^\'\"]+)"
| table url
Linux events
Osquery
Looking through users added along with their
passwords [Linux]
index="botsv3"
sourcetype="osquery:results" useradd
index="dataset" 1337
sourcetype="osquery:results"
"columns.port"=1337
SSH Events
When parsing SSH events, we mainly aim to detect
failed SSH logins which indicate possible brute
force attacks or successful SSH logins for the
purpose of auditing.
Detecting the PrintNightmare vulnerability
Identifies Print Spooler adding a new
Printer Driver
source="WinEventLog:Microsoft-Windows-PrintService/Operational" EventCode=316 category="Adding a printer driver" Message="*kernelbase.dll,*" Message="*UNIDRV.DLL,*" Message="*.DLL.*"
| stats count min(_time) as firstTime max(_time) as lastTime by OpCode EventCode ComputerName Message
Detects the Print Spooler failing to load
a plug-in module
source="WinEventLog:Microsoft-Windows-PrintService/Admin" ((ErrorCode="0x45A" (EventCode="808" OR EventCode="4909")) OR ("The print spooler failed to load a plug-in module" OR "\\drivers\\x64\\"))
| stats count min(_time) as firstTime max(_time) as lastTime by OpCode EventCode ComputerName Message
Case Function
Takes pairs of arguments X and Y, where the X
arguments are Boolean expressions. Returns the
Y argument that corresponds to the first X that
evaluates to TRUE, or NULL if none does
case(X,"Y",…)
#Example
case(id == 0, "Amy", id == 1,"Brad", id ==
2, "Chris")
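The first-match behaviour of case() can be mimicked in Python as a quick mental model. This is a sketch rather than Splunk's implementation: the helper names are illustrative, None stands in for NULL, and unlike a real case() Python evaluates every condition before the call.

```python
def case(*pairs):
    """Return the value paired with the first true condition, else None (NULL)."""
    if len(pairs) % 2:
        raise ValueError("case() expects condition/value pairs")
    for condition, value in zip(pairs[0::2], pairs[1::2]):
        if condition:
            return value
    return None

def name_for(id):
    # Same pairs as the example above
    return case(id == 0, "Amy", id == 1, "Brad", id == 2, "Chris")

print(name_for(1))
print(name_for(5))
```

An id with no matching condition yields None, which is why a final catch-all pair such as a condition that is always true is often added to supply a default.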
Ceiling of a number
ceil(X)
IP Address Identification
#Syntax
cidrmatch("X",Y)
Identifies IP addresses that belong to a particular
subnet
#example
cidrmatch("133.155.22.0/25",ip)
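To check by hand which addresses a subnet such as 133.155.22.0/25 covers, Python's standard ipaddress module offers an equivalent membership test. The function below only mirrors the idea of cidrmatch; it is not the Splunk function itself.

```python
import ipaddress

def cidrmatch(cidr, ip):
    """True when the address ip falls inside the subnet given in CIDR notation."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

# A /25 covers the lower half of the last octet: .0 through .127
print(cidrmatch("133.155.22.0/25", "133.155.22.10"))
print(cidrmatch("133.155.22.0/25", "133.155.22.200"))
```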
Coalesce Function
#Syntax
coalesce(X,…)
The first value that is not NULL
#example
coalesce(null(), "Returned val", null())
Cosine Function
#Syntax
cos(X)
Exact Function
#Syntax
exact(X)
Evaluates an expression X using double precision
floating point arithmetic
exact(3.14+num)
Exp Function
#Syntax
exp(X)
e (Euler's number) raised to the power X (e^X)
exp(3)
IF Function
#Syntax
if(X,Y,Z)
If X evaluates to TRUE, the result is the second
argument Y. If X evaluates to FALSE, the result
evaluates to the third argument Z
In Function
#Syntax
in(field,valuelist)
TRUE if a value in valuelist matches a value in
field. In an eval expression, embed in() inside
another function such as if()
if(in(status, "404","500","503"),"true","false")
Boolean Functions
isbool(X)
TRUE if X is Boolean
isbool(field)
isint(X)
TRUE if X is an integer
isint(field)
isnull(X)
TRUE if X is NULL
isnull(field)
isstr(X)
TRUE if X is a string
isstr(field)
Len Function
#Syntax
len(X)
Character length of string X
len(field)
Like Function
#Syntax
like(X,"Y")
TRUE if and only if X is like the SQLite pattern in Y
like(field, "addr%")
Logarithm Function
#Syntax
log(X,Y)
Logarithm of the first argument X where the
second argument Y is the base. Y defaults to 10
(base-10 logarithm)
log(number,2)
Lower Function
#Syntax
lower(X)
Lowercase of string X
lower(username)
Ltrim Function
#Syntax
ltrim(X,Y)
X with the characters in Y trimmed from the left
side. Y defaults to spaces and tabs
Match Function
#Syntax
match(X,Y)
TRUE if X matches the regular expression pattern Y
match(field, "^\d{1,3}\.\d$")
Max Function
#Syntax
max(X,…)
The maximum value in a series of data X,…
max(delay, mydelay)
Hashing
#Syntax
md5(X)
MD5 hash of a string value X
md5(field)
Min Function
#Syntax
min(X,…)
The minimum value in a series of data X,…
min(delay, mydelay)
Mv Count Function
#Syntax
mvcount(X)
Number of values of X
mvcount(multifield)
Mv Filter Function
#Syntax
mvfilter(X)
Filters a multi-valued field based on the Boolean
expression X
mvfilter(match(email, "net$"))
mvindex Function
#Syntax
mvindex(X,Y,Z)
Returns a subset of the multi-valued field X from
start position (zero-based) Y to Z (optional)
mvindex(multifield, 2)
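The zero-based, inclusive indexing of mvindex is easy to get wrong, so here is a small Python sketch of it. The handling of negative indexes (counting back from the last value) and of out-of-range positions (returning None for NULL) reflects my understanding of the function and should be treated as an assumption.

```python
def mvindex(values, start, end=None):
    """Sketch of mvindex: zero-based start, inclusive end, None for no result."""
    n = len(values)
    # Negative indexes count back from the last value (-1 is the last one)
    if start < 0:
        start += n
    if end is None:
        return values[start] if 0 <= start < n else None
    if end < 0:
        end += n
    if start < 0 or start >= n or end < start:
        return None
    # Python slices exclude the end position, so add 1 to make it inclusive
    return values[start:end + 1]

multifield = ["a", "b", "c", "d"]
print(mvindex(multifield, 2))
print(mvindex(multifield, 1, 2))
print(mvindex(multifield, -1))
```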
mvjoin Function
#Syntax
mvjoin(X,Y)
Joins the individual values of a multi-valued field X
using string delimiter Y
mvjoin(address, ";")
now() Function
#Syntax
now()
Current time as Unix timestamp
now()
null() Function
#Syntax
null()
NULL value. This function takes no arguments.
null()
nullif Function
#Syntax
nullif(X,Y)
X if the two arguments, fields X and Y, are
different. Otherwise returns NULL.
nullif(fieldX, fieldY)
random() Function
#Syntax
random()
Pseudo-random number ranging from 0 to
2147483647
random()
relative_time Function
#Syntax
relative_time(X,Y)
Unix timestamp value of relative time specifier Y
applied to Unix timestamp X
relative_time(now(),"-1d@d")
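The specifier "-1d@d" reads as "go back one day, then snap to the start of that day". The Python sketch below reproduces that arithmetic for this one specifier only. It assumes UTC for the snap; Splunk applies the time zone configured for the search, so real results may differ by the offset.

```python
from datetime import datetime, timedelta, timezone

def minus_one_day_snap_to_day(epoch):
    """Sketch of relative_time(epoch, "-1d@d"), evaluated in UTC (assumption)."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc) - timedelta(days=1)
    # "@d" snaps down to the beginning of the day
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp())

# 1970-01-11 01:00:00 UTC, chosen so the arithmetic is easy to follow
sample_epoch = 10 * 86400 + 3600
print(minus_one_day_snap_to_day(sample_epoch))
```

For the sample value the result is the start of 1970-01-10, that is 9 * 86400 seconds.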
replace Function
#Syntax
replace(X,Y,Z)
A string formed by substituting string Z for every
occurrence of regex string Y in string X
The example swaps the month and day numbers of
a date.
#Example
replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")
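The same month/day swap can be tested locally with Python's re.sub, which uses the same \1 and \2 backreference syntax in the replacement string. The sample date is made up.

```python
import re

def swap_month_day(date):
    """Swap the first two numeric parts of a slash-separated date string."""
    return re.sub(r"^(\d{1,2})/(\d{1,2})/", r"\2/\1/", date)

print(swap_month_day("1/14/2017"))
```

Strings that do not start with two slash-separated numbers are returned unchanged, since the pattern is anchored with ^.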
round Function
#Syntax
round(X,Y)
X rounded to the number of decimal places
specified by Y, or to an integer for omitted Y
round(3.5)
rtrim Function
#Syntax
rtrim(X,Y)
X with the characters in (optional) Y trimmed from
the right side. Trim spaces and tabs for unspecified
Y.
split Function
#Syntax
split(X,"Y")
X as a multi-valued field, split by delimiter Y.
split(address, ";")
sqrt Function
#Syntax
sqrt(X)
Square root of X
sqrt(9) # 3
strftime Function
#Syntax
strftime(X,Y)
Unix timestamp value X rendered using the format
specified by Y
strftime(_time, "%H:%M")
strptime Function
#Syntax
strptime(X,Y)
Unix timestamp obtained by parsing the time
string X using format Y
strptime(timeStr, "%H:%M")
substr Function
#Syntax
substr(X,Y,Z)
Substring of X from start position (1-based) Y for
(optional) Z characters
substr("string", 1, 3)
time Function
#Syntax
time()
Current time to the microsecond.
time()
tonumber Function
#Syntax
tonumber(X,Y)
Converts input string X to a number of numerical
base Y (optional, defaults to 10)
tonumber("FF",16)
tostring Function
#Syntax
tostring(X,Y)
Field value of X as a string. If X is a number, it
reformats it as a string. If X is a Boolean value, it
reformats to "True" or "False" strings. If X is a
number, the optional second argument Y is one
of: "hex", which converts X to hexadecimal;
"commas", which formats X with commas and two
decimal places; or "duration", which converts
seconds X to the readable time format HH:MM:SS.
tostring(500, "duration")
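The three formatting options can be approximated in Python to see what kind of string each one yields. This is a sketch under assumptions: the uppercase hex digits, the treatment of whole numbers under "commas", and the handling of durations of a day or longer are not confirmed by these notes.

```python
def tostring(value, fmt=None):
    """Approximate tostring(X, Y) for the "hex", "commas" and "duration" options."""
    if isinstance(value, bool):
        return "True" if value else "False"
    if fmt == "hex":
        # Assumption: hex digits are shown in uppercase after the 0x prefix
        return f"0x{int(value):X}"
    if fmt == "commas":
        # Assumption: whole numbers get no decimals, others two decimal places
        if float(value).is_integer():
            return f"{int(value):,}"
        return f"{value:,.2f}"
    if fmt == "duration":
        # Durations of a day or more may be rendered differently by Splunk
        hours, remainder = divmod(int(value), 3600)
        minutes, seconds = divmod(remainder, 60)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    return str(value)

print(tostring(500, "duration"))
print(tostring(255, "hex"))
print(tostring(12345.6789, "commas"))
```

500 seconds is 8 minutes and 20 seconds, so the duration form is 00:08:20.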
typeof Function
#Syntax
typeof(X)
String representation of the field type
urldecode Function
#Syntax
urldecode(X)
URL X, decoded.
urldecode("http%3A%2F%2Fwww.site.com%2Fview%3Fr%3Dabout")
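To see what the decoded value looks like, the same percent-decoding can be done with Python's standard library. This only illustrates URL decoding in general and is not the Splunk function.

```python
from urllib.parse import unquote

# The encoded URL from the example above, written on one line
encoded = "http%3A%2F%2Fwww.site.com%2Fview%3Fr%3Dabout"
decoded = unquote(encoded)
print(decoded)
```

%3A, %2F, %3F and %3D decode to the characters : / ? and = respectively.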
validate Function
#Syntax
validate(X,Y,…)
For pairs of Boolean expressions X and strings Y,
returns the string Y corresponding to the first
expression X which evaluates to False, and
defaults to NULL if all X are True.
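validate() is effectively the mirror image of case(): it reports the message for the first check that fails. A minimal Python sketch of that logic follows. The port check and its messages are hypothetical, None stands in for NULL, and Python evaluates all conditions up front, which a real validate() need not do.

```python
def validate(*pairs):
    """Return the message paired with the first false condition, else None."""
    if len(pairs) % 2:
        raise ValueError("validate() expects condition/message pairs")
    for condition, message in zip(pairs[0::2], pairs[1::2]):
        if not condition:
            return message
    return None

def check_port(port):
    # Hypothetical checks, used only to illustrate the pairing
    return validate(
        port >= 1, "ERROR: port is below 1",
        port <= 65535, "ERROR: port is above 65535",
    )

print(check_port(443))
print(check_port(70000))
```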
splunk stop
splunk enable webserver
splunk start
License Management
Show current licenses
User Management
Search for all users who are admins
splunk _internal call /services/authentication/users -get:search admin