
Splunk Knowledge Manager Manual

Version: 4.1.7
Generated: 2/16/2011 03:57 pm Copyright Splunk, Inc. All Rights Reserved

Table of Contents
Welcome to knowledge management
    What is Splunk knowledge?
    Why manage Splunk knowledge?
    Prerequisites for knowledge management

Organize and administrate knowledge objects
    Curate Splunk knowledge with Manager
    Develop naming conventions for knowledge objects
    Understand and use the Common Information Model

Data interpretation: Fields and field extractions
    About fields
    Overview of search-time field extraction
    Use the Field extractions page in Manager
    Use the Field transformations page in Manager
    Create and maintain search-time field extractions through configuration files
    Configure multivalue fields

Data classification: Event types and transactions
    About event types
    Define and maintain event types in Splunk Web
    Configure event types directly in eventtypes.conf
    Configure event type templates
    About transactions
    Search for transactions
    Define transactions

Data enrichment: Lookups and workflow actions
    About lookups and workflow actions
    Look up fields from external data sources
    Create workflow actions in Splunk Web
    Configure workflow actions through workflow_actions.conf

Data normalization: Tags and aliases
    About tags and aliases
    Define and manage tags
    Create aliases for fields
    Tag the host field
    Tag event types

Manage your search knowledge
    Manage saved searches
    Configure the priority of scheduled searches
    Design macro searches
    Design form searches
    Define navigation to saved searches and reports

Set up and use summary indexes
    Use summary indexing for increased reporting efficiency
    Manage summary index gaps and overlaps
    Configure summary indexes


Welcome to knowledge management

What is Splunk knowledge?

Splunk is a powerful search and analysis engine that helps you see both the details and the larger patterns in your IT data. When you use Splunk you do more than just look at individual entries in your log files; you leverage the information they hold collectively to find out more about your IT environment.

Splunk automatically extracts different kinds of knowledge from your IT data--events, fields, timestamps, and so on--to help you harness that information in a better, smarter, more focused way. Some of this information is extracted at index time, as Splunk indexes your IT data. But the bulk of this information is created at "search time," both by Splunk and its users. Unlike databases or schema-based analytical tools that decide what information to pull out or analyze beforehand, Splunk enables you to dynamically extract knowledge from raw data as you need it.

As your organization uses Splunk, additional categories of Splunk knowledge objects are created, including event types, field extractions, workflow actions, tags, lookups, and saved searches. You can think of Splunk knowledge as a multitool that you use to discover and analyze various aspects of your IT data.

If you've read the User manual, you know that it covers Splunk knowledge basics in its "Capture knowledge" chapter. The Knowledge Manager manual goes into more depth. It shows you how to maintain sets of knowledge objects for your organization (through Manager and configuration files) and demonstrates ways that Splunk knowledge can be used to solve your organization's real-world problems.

Splunk knowledge is grouped into five categories:

• Data interpretation: Fields and field extractions - Fields and field extractions make up the first order of Splunk knowledge. The fields that Splunk automatically extracts from your IT data help bring meaning to your raw data, clarifying what can at first glance seem incomprehensible. The fields that you extract manually expand and improve upon this layer of meaning.

• Data classification: Event types and transactions - You use event types and transactions to group together interesting sets of similar events. Event types group together sets of events discovered through searches, while transactions are collections of conceptually-related events that span time. Event types enable you to quickly and easily classify and group together similar events; you can then use them to perform analytical searches on precisely-defined subgroups of events.

• Data enrichment: Lookups and workflow actions - Lookups and workflow actions are categories of knowledge objects that extend the usefulness of your data in various ways. Field lookups enable you to add fields to your data from external data sources such as static tables (CSV files) or Python-based commands. Workflow actions enable interactions between fields in your data and other applications or web resources, such as a WHOIS lookup on a field containing an IP address.

• Data normalization: Tags and aliases - Tags and aliases are used to manage and normalize sets of field information. You can use tags and aliases to group sets of related field values together, and to give extracted fields tags that reflect different aspects of their identity. For example, you can group events from a set of hosts in a particular location (such as a building or city) together--just give each host the same tag. Or maybe you have two different sources using different field names to refer to the same data--you can normalize your data by using aliases (by aliasing clientip to ipaddress, for example).

• Saved searches - Saved searches are another category of Splunk knowledge. Vast numbers of saved searches can be created by Splunk users within an organization, and thoughtful saved search organization ensures that they are discoverable by those that need them. There are also advanced uses for saved searches: they are often used in dashboards, can be turned into reusable search macros, and more.

The Knowledge Manager manual also includes a chapter on summary indexing. Summary index setup and oversight is an advanced practice that can benefit from being handled by users in a knowledge management role.

At this point you may be asking the question "Why does Splunk knowledge need to be 'managed' anyway?" For answers, see "Why manage Splunk knowledge?", the next topic in this chapter. Knowledge managers should also have at least a basic understanding of data input setup, event processing, and indexing concepts. For more information, see "Prerequisites for knowledge management", the third topic in this chapter.

Make a PDF

If you'd like a PDF of any version of this manual, click the pdf version link above the table of contents bar on the left side of this page. A PDF version of the manual is generated on the fly for you, and you can save it or print it out to read later.

Why manage Splunk knowledge?

If you have to maintain a fairly large number of knowledge objects across your Splunk deployment, you know that management of that knowledge is important. This is especially true of organizations that have a large number of Splunk users, and even more so if you have several teams of users working with Splunk. This is simply because a greater proliferation of users leads to a greater proliferation of additional Splunk knowledge.

When you leave a situation like this unchecked, your users may find themselves sorting through large sets of objects with misleading or conflicting names, struggling to find and use objects that have unevenly applied app assignments and permissions, and wasting precious time creating objects such as saved searches and field extractions that already exist elsewhere in the system.

Splunk knowledge managers provide centralized oversight of Splunk knowledge. The benefits that knowledge managers can provide include:

• Oversight of knowledge object creation and usage across teams, departments, and deployments. If you have a large Splunk deployment spread across several teams of users, you'll eventually find teams "reinventing the wheel" by designing objects that were already developed by other teams. To put it plainly: knowledge objects proliferate. It's easy for any robust, well-used Splunk implementation to end up with a dozen tags that have all been applied to the same field, but as these redundant knowledge objects stack up, the end result is confusion and inefficiency on the part of its users. Knowledge managers can mitigate these situations by monitoring object creation and ensuring that useful "general purpose" objects are shared on a global basis across deployments. For more information, see "Curate Splunk knowledge with Manager" in this manual.

• Normalization of event data. Although Splunk is based on data indexes, not databases, the basic principles of normalization still apply. We'll provide you with some tips about normalizing your knowledge object libraries by applying uniform naming standards and using Splunk's Common Information Model. For more information, see "Develop naming conventions for knowledge objects" in this manual.

• Management of knowledge objects through configuration files. There are certain aspects of knowledge object setup that are best handled through configuration files. True knowledge management experts know how and when to leverage the power of Splunk's configuration files when it comes to the administration of Splunk knowledge. This manual will show you how to work with knowledge objects this way. See "Create search time field extractions" in this manual as an example of how you can manage Splunk knowledge through configuration files.

• Setup and organization of app-level navigation for saved searches and reports, as well as views and dashboards. Left unmoderated, the navigation for saved searches, reports, views, and dashboards can become very confusing as more and more of these kinds of objects are added to Splunk applications. You don't have to be a Splunk app designer to ensure that users can quickly and easily navigate to the searches, reports, views, and dashboards they need to do their job efficiently, and that these objects are shared as appropriate with users throughout your Splunk deployment. For more information, see "Define navigation for saved searches and reports" in this manual.

• Review of summary index setup and usage. Summary indexes may be used by many teams across your deployment to run efficient searches on large volumes of data, but their usage also counts against your overall license volume. Note: As of Release 4.1, summary index usage does not count against your overall license volume. The knowledge manager can provide centralized oversight of summary index usage across your organization, ensuring that summary indexes are built correctly and used responsibly. For more information, see "Use summary indexing for increased reporting efficiency" in this manual.

Prerequisites for knowledge management

Most knowledge management tasks are centered around "search time" event manipulation. In other words, a typical knowledge manager usually doesn't focus their attention on work that takes place before events are indexed, such as setting up data inputs, adjusting event processing activities, setting up forwarding and receiving, creating and maintaining indexes, correcting default field extraction issues, and so on.

However, we do recommend that all knowledge managers have a good understanding of these "Splunk admin" concepts. A solid grounding in these subjects enables knowledge managers to better plan out their approach towards management of knowledge objects for their deployment, and it helps them troubleshoot issues that will inevitably come up over time.

Here are some of the "admin" topics that knowledge managers should be familiar with, with Admin manual links to get you started:

• Working with Splunk apps: If your deployment uses more than one Splunk app, you should get some background on how they're organized and how app object management works within multi-app deployments. See "What's an app?", "App architecture and object ownership", and "Manage app objects".

• Getting event data into Splunk: It's important to have at least a baseline understanding of Splunk data inputs. Check out "What Splunk can monitor" and read the other topics in this chapter as necessary.

• Indexing with Splunk: What is an index and how does it work? What is the difference between "index time" and "search time" and why is this distinction significant? Start with "What's a Splunk index?" and read the rest of the chapter. Pay special attention to "Index time vs search time", as this can affect your knowledge management strategy.

• Understand event processing: It's a good idea to get a good grounding in the steps that Splunk goes through to "parse" data before it indexes it. This knowledge can help you troubleshoot problems with your event data and recognize "index time" event processing issues. Start with "Overview of event processing" and read the entire chapter.

• Understand your forwarding and receiving setup: If your Splunk deployment utilizes forwarders and receivers, it's a good idea to get a handle on how they've been implemented. Get an overview of the subject at "About forwarding and receiving".

• Configuration file management: Where are Splunk's configuration files? How are they organized? How do configuration files take precedence over each other? See "About configuration files" and "Configuration file precedence".

• Default field extraction: Most field extraction takes place at search time, with the exception of certain default fields, which get extracted at index time. As a knowledge manager, most of the time you'll concern yourself with search-time field extraction (see the sketch after this list), but it's a good idea to know how default field extraction can be managed when it's absolutely necessary to do so. This can help you troubleshoot issues with the host, source, and sourcetype fields that Splunk applies to each event. Start with "About default fields".

• Managing users and roles: Knowledge managers typically do not directly set up users and roles. However, it's a good idea to understand how they're set up within your deployment, as this directly affects your efforts to share and promote knowledge objects between groups of users. For more information, start with "About users and roles" and read the rest of the chapter as necessary.
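To make the search-time emphasis concrete, here is a minimal sketch of what a custom search-time field extraction can look like in props.conf. The sourcetype, field names, and regular expression are hypothetical and only illustrate the general form; see "Create and maintain search-time field extractions through configuration files" in this manual for the full syntax.

    [syslog]
    EXTRACT-login_fields = user_id=(?<user_id>\w+)\s+client_ip=(?<client_ip>\S+)

Because the extraction is defined with an EXTRACT class, it is applied when searches run rather than when events are indexed, so you can add, adjust, or remove it without re-indexing any data.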

6 . roles. and that "bad" objects are removed before they develop lots of downstream dependencies. This is especially true for the Field extractions and Field transformations pages in Manager. and so on. Tags are added to fields. knowledge is added to the base set of event data indexed within it. with an eye towards reducing redundancy. Lookups and workflow actions are engineered. We do recommend having some familiarity with configuration files. or delete knowledge objects. especially as they accumulate over time. • Delete knowledge objects that do not have significant "downstream" dependencies. Now they can use Manager. Note: This topic assumes that as a knowledge manager you have an admin role or a role with an equivalent permission set. These things may not be a big issue if your user base is small. • Ensure that knowledge objects with relevancy beyond a particular working team." creating searches that already exist. • Review knowledge objects as they are created. Using configuration files instead of Manager In previous releases Splunk users edited Splunk's configuration files directly to add. and (to some degree) how they are being used. but they can cause unnecessary confusion and repetition of effort. you can easily: • Create knowledge objects as necessary. Searches are saved and scheduled. or app are made available to other teams. Event types and transactions that group together sets of events are defined. The process of knowledge object creation starts out slow. designing redundant event types. but can get complicated over time. This topic discusses how knowledge managers can use Splunk Manager to take charge of the knowledge objects in their Splunk system and show them who's boss. which provides a user-friendly interface with those very same configuration files. Splunk Manager can give a savvy and attentive knowledge manager a view into what knowledge objects are being created. and users of other apps. role. It's easy to reach a point where users are "reinventing the wheel.Organize and administrate knowledge objects Curate Splunk knowledge with Manager Curate Splunk knowledge with Manager As your organization uses Splunk. With Manager. The reasons for this include: • Some Manager functionality makes more sense if you understand how things work at the configuration file level. either "from scratch" or through object cloning. • Functionality exists for certain knowledge object types that isn't (or isn't yet) expressed in the Manager UI. update. ensuring that naming standards are followed. who they're being created by.

For more information, see the documentation of those types.
• Bulk deletion of obsolete, redundant, or improperly defined knowledge objects is only possible with configuration files.
• You may find that you prefer to work directly with configuration files. For example, if you're a long-time Splunk user, brought up on our configuration file system, it may be the medium in which you've grown accustomed to dealing with knowledge objects. Other users just prefer the level of granularity and control that configuration files can provide.

Wherever you stand with Splunk's configuration files, we want to make sure you can use them when you find it necessary to do so. To that end, you'll find that the Knowledge Manager manual includes instructions for handling various knowledge object types via configuration files.

For general information about configuration files in Splunk, see the following topics in the Admin manual:
• About configuration files
• Configuration file precedence

You can find examples of the current configuration .spec and .example files in the "Configuration file reference" chapter of the Admin manual.

Monitor and organize knowledge objects

As a knowledge manager, you should periodically check up on the knowledge object collections in your Splunk implementation. You should be on the lookout for knowledge objects that:
• Fail to adhere to naming standards
• Are duplicates/redundant
• Are worthy of being shared with wider audiences
• Should be disabled or deleted due to obsolescence or poor design

Regular inspection of the knowledge objects in your system will help you detect anomalies that could become problems later on.

Example - Keeping tags straight

Most healthy Splunk implementations end up with a lot of tags, which are used to perform searches on clusters of field/value pairings. Over time, however, it's easy to end up with tags that have similar names but which produce surprisingly dissimilar results. This can lead to considerable confusion and frustration.

Here's a procedure you can follow for curating tags. It can easily be adapted for other types of knowledge objects handled through Manager.

1. Go to Manager > Tags > List by tag name.
2. Look for tags with similar or duplicate names that belong to the same app (or which have been promoted to global availability for all users). For example, you might find a set of tags like authentication and authentications in the same app, where one tag is linked to an entirely different set of field/value pairs than the other.

Alternatively, you may encounter tags with identical names except for the use of capital letters, as in crash and Crash. Tags are case-sensitive, so Splunk sees them as two separate knowledge objects.

Keep in mind that you may find legitimate tag duplications if you have the App context set to All, where tags belonging to different apps have the same name. This is often permissible--after all, an authentication tag for the Windows app will have to be associated with an entirely different set of field/value pairs than an authentication tag for the UNIX app. This can also happen if the object belongs to one app context and you attempt to move it to another app context.

3. Try to disable or delete the duplicate or obsolete tags you find. If the tag is used in saved searches, dashboard searches, other event types, or transactions, be aware that there may be objects dependent on it that will be affected: those objects will cease to function once the tag is removed or disabled. For more information, see "Disable or delete knowledge objects," below.

4. If you create a replacement tag with a new, more unique name, ensure that it is connected to the same field/value pairs as the tag that you are replacing.

Using naming conventions to head off object nomenclature issues

If you set up naming conventions for your knowledge objects early in your implementation of Splunk you can avoid some of the thornier object naming issues. For more information, see "Develop naming conventions for knowledge objects" in this manual.

Share and promote knowledge objects

As a knowledge manager, you can set knowledge object permissions to restrict or expand access to the variety of knowledge objects within your Splunk implementation. In some cases you'll determine that certain specialized knowledge objects should only be used by people in a particular role, within a specific app. And in others you'll move to the other side of the scale and make universally useful knowledge objects globally available to all users in all apps. As with all aspects of knowledge management you'll want to carefully consider the implications of these access restrictions and expansions.

When a Splunk user first creates a new saved search, event type, transaction, or similar knowledge object, it is only available to that user. To make that object available to more people, Manager provides the following options, which you can take advantage of if your permissions enable you to do so. You can:

• Make the knowledge object available globally to users of all apps (also referred to as "promoting" an object).
• Make the knowledge object available to all users of an app.

• Restrict (or expand) access to global or app-specific objects by user or role.
• Set read/write permissions at the app level for roles, to enable users to share or delete objects they do not own.

Note: You may want to set your Splunk implementation up so that only people with Admin-level roles can share and promote knowledge objects. This would make you (and your fellow knowledge managers) gatekeepers with approval capability over the sharing of new knowledge objects.

How do permissions affect knowledge object usage?

To illustrate how these choices can affect usage of a knowledge object, imagine that Bob, a user of the (fictional) Network Security app with a "Firewall Manager" role, creates a new event type named firewallbreach, which finds events that indicate firewall breaches. Here's a series of permissions-related issues that could come up, and the actions and results that would follow:

Issue: When Bob first creates firewallbreach, it is only available to him. Users of other Splunk apps in the same Splunk implementation have no idea it exists. Other users cannot see it or work with it.
Action: Bob decides he wants to share it with his fellow Network Security app users. He updates the permissions of the firewallbreach event type so that it is available to all users of the Network Security app, regardless of role. He also sets up the new event type so that all Network Security users can edit its definition.
Result: Anyone using Splunk in the Network Security app context can see, work with, and edit the firewallbreach event type. People using Splunk in other app contexts remain blissfully ignorant of the event type.

Issue: A bit later on, Mary, the knowledge manager, realizes that only users in the Firewall Manager role should have the ability to edit or update the firewallbreach event type.
Action: Mary restricts the ability to edit the event type to the Firewall Manager role.
Result: Users of the Network Security app can use the firewallbreach event type in searches, transactions, dashboards, and so on, but now the only people that can edit the knowledge object are those with the Firewall Manager role and people with admin level permissions (such as the knowledge manager).

Issue: At some point a few people who have grown used to using the firewallbreach event type in the Network Security app decide they'd like to use it in the context of the Windows app as well.
Action: They make their case to the knowledge manager, who, using the very handy permissions options, promptly promotes the firewallbreach event type to global availability.
Result: Now, everyone that uses this implementation of Splunk can use the firewallbreach event type, no matter what app context they happen to be in. But the ability to update the event type definition is still confined to admin-level users and users with the Firewall Manager role.

Permissions - Getting started

To change the permissions for a knowledge object, follow these steps:

1. In Manager, navigate to the page for the type of knowledge object that you want to update permissions for (such as Searches and reports or Event types).
2. Find the knowledge object that you created (use the filtering fields at the top of the page if necessary) and click its Permissions link.
3. On the Permissions page for the knowledge object in question, perform the actions in the following subsections, depending on how you'd like to change the object's permissions.
4. Save the permission change.

Make an object available to users of all apps

To make an object globally available to users of all apps in your Splunk implementation:

1. Navigate to the Permissions page for the knowledge object (following the instructions above).
2. Under [Knowledge object type] should appear in:, select All apps.
3. In the Permissions section, for Everyone, select a permission of either Read or Write:
   • Read enables users to see and use the object, but not update its definition. In other words, when users only have Read permission for a particular saved search, they can see it in the top level navigation (the "Searches & Reports" dropdown, for example) and they can run it. But they can't update the search string, change its time range, and save their changes.
   • Write enables users to view, use, and update the defining details of an object as necessary.
   • If neither Read nor Write is selected, then users cannot see or use the knowledge object.

Make an object available to users of a particular app

To restrict the usage of a knowledge object to a specific app, you first have to be in the context of that app. To do this, click the App dropdown in the upper right-hand corner of the screen and select the app to which you'd like to restrict the knowledge object.

• If the knowledge object is private, or shared globally, then all you have to do is navigate to the Permissions page for that object and select This app under [Knowledge object type] should appear in:. Then select a permission of either Read or Write for Everyone as appropriate.
• If usage of a knowledge object is already restricted to an app and you want to switch its context to another app, click the Move link (it will only appear if you have sufficient permissions to move it). This will enable you to quickly and easily choose another app context for the knowledge object. Keep in mind, however, that switching the app context of a knowledge object can have downstream consequences for objects that have been associated with it. For more information see "Disable or delete knowledge objects", below.
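Behind the scenes, the permissions you set in Manager are recorded in metadata files within each app. As a rough sketch only (the object name, role names, and values here are hypothetical, and you rarely need to touch this file directly), a globally shared event type that everyone can read but only admins and a "firewallmanager" role can edit might appear in $SPLUNK_HOME/etc/apps/<app_name>/metadata/local.meta along these lines:

    [eventtypes/firewallbreach]
    access = read : [ * ], write : [ admin, firewallmanager ]
    export = system

This is simply where Manager stores the Read/Write and app-scoping choices described above; Manager remains the recommended way to change them.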

Restrict knowledge object access by role

You can use this method to lock down various knowledge objects from alteration by specific roles. You can arrange things so users in a particular role can use the knowledge object but not update it--or you can set it up so those users cannot see the object at all.

If you want to restrict the ability to see or update a knowledge object by role, simply navigate to the Permissions page for the object. If you want members of a role to:

• Be able to use the object and update its definition, give that role Read and Write access.
• Be able to use the object but be unable to update it, give that role Read access only (and make sure that Write is unchecked for the Everyone role).
• Be unable to see or use the knowledge object at all, leave Read and Write unchecked for that role (and unchecked for the Everyone role as well). In the latter case, the object will not show up for them in Manager, and they will not find any results when they search on it.

For more information about role-based permissions in Splunk see "About users and roles" in the Admin manual.

A note about deleting users and roles with unshared objects

If a Splunk user leaves your team and you need to delete that user or role from the Splunk system, be aware that you will lose any knowledge objects belonging to them that have a sharing status of private. If you want to keep those knowledge objects, share them at the app or global level before deleting the user or role.

Disable or delete knowledge objects

Let's start off by saying that Manager makes it fairly easy to disable or delete knowledge objects, as long as your permissions enable you to do so. In Splunk, the ability to delete knowledge objects in Manager really depends on a set of factors:

• You cannot delete default knowledge objects that were delivered with Splunk (or with the App) via Manager. If the knowledge object definition resides in the app's default directory, it can't be removed via Manager. It can only be disabled (by clicking Disable). Only objects that exist in an app's "local" directory are eligible for deletion.

• You can delete knowledge objects that you have created, and which haven't been shared. Once a knowledge object you've created is shared with other users, your ability to delete it is revoked, unless you have write permissions for the app to which it belongs (see the next point).
• To delete all other knowledge objects, you need to have write permissions for the application to which they belong. This applies to knowledge objects that are shared globally as well as those that are only shared within an app--all knowledge objects belong to a specific app, no matter how they are shared. App-level write permissions are usually only granted to users with admin-equivalent roles.

To sum up: the ability to edit a knowledge object has nothing to do with the ability to delete it. If you can't delete a particular knowledge object you may still be able to disable it, which essentially has the same function as knowledge object deletion without removing it from the system.

Deleting knowledge objects with downstream dependencies

You have to be careful about deleting knowledge objects with downstream dependencies, as this can have negative impacts. For example, you could have a tag that looks like the duplicate of another, far more common tag. On the surface it would seem to be harmless to delete the dup tag. But what you may not realize is that this duplicate tag also happens to be part of a search that a very popular event type is based upon. So if you delete that tag, the event type breaks, and everything downstream of that event type breaks. And that popular event type is used in two important saved searches--the first is the basis for a well-used dashboard panel, and the other is used to populate a summary index that is used by searches that run several other dashboard panels.

This is why it is important to nip poorly named or defined knowledge objects in the bud, before they become inadvertently hard-wired into the workings of your deployment.

There is no "one click" way to bring up a list of knowledge object downstream dependencies at this point. The only way to identify the downstream dependencies of a particular knowledge object is to search on it, find out where it is used, and then search on those things to see where they are used--it can take a bit of detective work.

If you really feel that you have to delete a knowledge object, and you're not sure if you've tracked down and fixed all of its downstream dependencies, you could try disabling it first to see what impact that has. If nothing seems to go seriously awry after a day or so, delete it.

Deleting knowledge objects in configuration files

Note that when you use Manager, you can only disable or delete one knowledge object at a time. If you need to remove large numbers of objects, the most efficient way to do it is by removing the knowledge object stanzas directly through the configuration files.

Keep in mind that several versions of a particular configuration file can exist within your system. In most cases you should only edit the configuration files in $SPLUNK_HOME/etc/system/local/, to make local changes on a site-wide basis, or $SPLUNK_HOME/etc/apps/<App_name>/local/, if you need to make changes that apply only to a specific app.
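For example, each tag created through Manager corresponds to a stanza in a tags.conf file, so a batch of obsolete or duplicate tags can be removed by deleting their stanzas. A hypothetical sketch (the field/value pair and tag names are invented for illustration):

    # $SPLUNK_HOME/etc/apps/<App_name>/local/tags.conf
    [host=corp-mail-01]
    email = enabled
    Email = enabled

Deleting the duplicate "Email = enabled" line (or the whole stanza, if the tags are no longer needed) removes those tags for everyone who shares the app. Before doing so, check for downstream dependencies as described in "Deleting knowledge objects with downstream dependencies," above.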

Do not try to edit configuration files until you have read and understood the following topics in the Admin manual:
• About configuration files
• Configuration file precedence

Develop naming conventions for knowledge objects

We suggest you develop naming conventions for your knowledge objects when it makes sense to do so. Early development of naming conventions for your Splunk implementation will help you avoid confusion and chaos later on down the road. You can develop naming conventions for just about every kind of knowledge object in Splunk. If the naming conventions you develop are followed consistently by all of the Splunk users in your organization, you'll find that they become easier to use and that their purpose is much easier to discern at a glance.

Naming conventions can help with object organization, but they can also help users differentiate between groups of saved searches, event types, and tags that have similar uses. And they can help identify a variety of things about the object that may not even be in the object definition, such as what teams or locations use the object, what technology it involves, and what it's designed to do.

Use the Common Information Model

Splunk's Common Information Model provides strategies for normalizing your approach to extracted field names, event type tagging, and host tagging. It includes:
• A list of standard custom fields
• An event type tagging system
• Lists of standard host tags

For more information, see "Understand and use the Common Information Model" in this manual.

Example - Set up a naming convention for saved searches

You work in the systems engineering group of your company, and as the knowledge manager for your Splunk implementation, it's up to you to come up with a naming convention for the saved searches produced by your team. In the end you develop a naming convention that pulls together:

• Group: Corresponds to the working group(s) of the user saving the search.
• Search type: Indicates the type of search (alert, report, summary-index-populating).
• Platform: Corresponds to the platform subjected to the search.
• Category: Corresponds to the concern areas for the prevailing platforms.
• Time interval: The interval over which the search runs (or on which the search runs, if it is a scheduled search).

• Description: A meaningful description of the context and intent of the search, limited to one or two words if possible. Ensures the search name is unique.

The values for each component:

Group: SEG, NEG, OPS, NOC
Search type: Alert, Report, Summary
Platform: Windows, iSeries, Network
Category: Disk, Exchange, SQL, Event log, CPU, Jobs, Subsystems, Services, Security
Time interval: <arbitrary>
Description: <arbitrary>

Possible saved searches using this naming convention:
• SEG_Alert_Windows_Eventlog_15m_Failures
• SEG_Report_iSeries_Jobs_12hr_Failed_Batch
• NOC_Summary_Network_Security_24hr_Top_src_ip

Understand and use the Common Information Model

The Common Information Model is based on the idea that you can break down most log files into three components:
• fields
• event type tags
• host tags

With these three components a savvy knowledge manager should be able to set up their log files in a way that makes them easily processable by Splunk and which normalizes noncompliant log files and forces them to follow a similar schema. The Common Information Model details the standard fields, event type tags, and host tags that Splunk uses when it processes most IT data.

Normalizing the standard event format

This is the recommended format that should be used when events are generated or written to a system:

    <timestamp> name="<name>" event_id=<event_id> <key>=<value>

Any number of field key-value pairs are allowed. For example:

    2008-11-06 22:29:04 name="Failed Login" event_id=sshd:failure src_ip=10.2.3.4 src_port=12355 dest_ip=192.168.1.35 dest_port=22

The keys are ones that are listed in the "Standard fields" table below; name and event_id are mandatory.

When events coming from a CISCO PIX log are compliant with the Common Information Model format, the following PIX event:

    Sep 2 15:14:11 10.50.224.193 local4:warn|warning fw07 %PIX-4-106023: Deny icmp src internet:213.208.235.70 dst eservices-test-ses-public:193.19.8.33 (type 8, code 0) by access-group "internet_access_in"

looks as follows:

    2009-09-02 15:14:11 name="Deny icmp" event_id=106023 vendor=CISCO product=PIX log_level=4 dvc_ip=10.50.224.193 dvc_host=fw07 syslog_facility=local4 syslog_priority=warn src_ip=213.208.235.70 src_network=internet dest_ip=193.19.8.33 dest_network=eservices-test-ses-public icmp_type=8 icmp_code=0 proto=icmp rule_number="internet_access_in"

Standard fields

This table presents a list of standard fields that can be extracted from event data as custom search-time field extractions, listed below as field name, data type, and explanation.

Please note that we strongly recommend that all of these field extractions be performed at search time. There is no need to add these fields to the set of default fields that Splunk extracts at index time. For more information about performing field extractions at search time, see "Create search-time field extractions" in this manual. For more information about the index time/search time distinction, see "Index time versus search time" in the Admin manual.

action (string) - The action specified by the event.
affected_user (string) - The user that was affected by a change. For example, user fflanda changed the name of user rhallen, so rhallen is the affected_user.
affected_user_group (string) - The user group that is affected by a change.
affected_user_group_id (number) - The identifier of the group affected by a change.
affected_user_id (number) - The identifier of the user affected by a change.

affected_user_privileges (enumeration) - The privileges of the user affected by a change.
app (string) - ISO layer 7 (application layer) protocol--for example HTTP, HTTPS, SSH, IMAP.
bytes_in (number) - How many bytes this device/interface received.
bytes_out (number) - How many bytes this device/interface transmitted.
channel (number) - 802.11 channel number used by a wireless network.
category (string) - A device-specific classification provided as part of the event.
count (number) - The number of times the record has been seen.
cve (string) - The Common Vulnerabilities and Exposures (CVE) reference value.
desc (string) - The free-form description of a particular event.
dest_app (string) - The name of the application being targeted. For HTTP sessions, this is the host header.
dest_cnc_channel (string) - The destination command and control service channel.
dest_cnc_name (string) - The destination command and control service name.
dest_cnc_port (number) - The destination command and control service port.
dest_country (string) - The country associated with a packet's recipient.
dest_domain (string) - The DNS domain that is being queried.
dest_host (string) - The fully qualified host name of a packet's recipient.
dest_int (string) - The interface that is listening remotely or receiving packets locally.
dest_ip (ipv4 address) - The IPv4 address of a packet's recipient.
dest_ipv6 (ipv6 address) - The IPv6 address of a packet's recipient.
dest_lat (number) - The (physical) latitude of a packet's destination.
dest_long (number) - The (physical) longitude of a packet's destination.
dest_mac (mac address) - The destination TCP/IP layer 2 Media Access Control (MAC) address of a packet's destination.
dest_nt_domain (string) - The Windows NT domain containing a packet's destination.
dest_nt_host (string) - The Windows NT host name of a packet's destination.
dest_port (port) - The TCP/IP port to which a packet is being sent.
dest_record (string) - The remote DNS resource record being acted upon.
dest_translated_ip (ipv4 address) - The NATed IP address to which a packet is being sent.
dest_translated_port (number) - The NATed port to which a packet is being sent.
dest_zone (string) - The DNS zone that is being received by a slave as part of a zone transfer.
dhcp_pool (string) - The name of a given DHCP pool on a DHCP server.

direction (string) - The direction the packet is traveling, such as inbound or outbound.
duration (number) - The amount of time the event lasted.
dvc_host (string) - The fully qualified domain name of the device transmitting or recording the log record.
dvc_ip (ipv4 address) - The IPv4 address of the device reporting the event.
dvc_ip6 (ipv6 address) - The IPv6 address of the device reporting the event.
dvc_location (string) - The free-form description of the device's physical location.
dvc_mac (mac address) - The MAC (layer 2) address of the device reporting the event.
dvc_nt_domain (string) - The Windows NT domain of the device recording or transmitting the event.
dvc_nt_host (string) - The Windows NT host name of the device recording or transmitting the event.
dvc_time (timestamp) - Time at which the device recorded the event.
end_time (timestamp) - The event's specified end time.
event_id (number) - A unique identifier that identifies the event. This is unique to the reporting device.
file_access_time (timestamp) - The time the file (the object of the event) was accessed.
file_create_time (timestamp) - The time the file (the object of the event) was created.
file_hash (string) - A cryptographic identifier assigned to the file object affected by the event.
file_modify_time (timestamp) - The time the file (the object of the event) was altered.
file_name (string) - The name of the file that is the object of the event, with no information related to local file or directory structure.
file_path (string) - The location of the file that is the object of the event, in terms of local file and directory structure.
file_permission (string) - Access controls associated with the file affected by the event.
file_size (number) - The size of the file that is the object of the event. Indicate whether Bytes, KB, MB, GB.
http_content_type (string) - The HTTP content type.
http_method (string) - The HTTP method used in the event.
http_referrer (string) - The HTTP referrer listed in the event.
http_response (number) - The HTTP response code.
http_user_agent (string) - The HTTP user agent.
ip_version (number) - The numbered Internet Protocol version - 4 or 6.
length (number) - The length of the datagram, event, message, or packet.
log_level (string) - The log-level that was set on the device and recorded in the event.
name (string) - The name of the event as reported by the device. The name should not contain information that's already being parsed into other fields from the event, such as IP addresses.
object_name (string) - The object name (associated mainly with Windows).
object_type (string) - The object type (associated mainly with Windows).
object_handle (string) - The object handle (associated mainly with Windows).
outbound_interface (string) - The network interface through which a packet was transmitted.
packets_in (number) - How many packets this device/interface received.
packets_out (number) - How many packets this device/interface transmitted.
pid (number) - An integer assigned by the device operating system to the process creating the record.
priority (number) - An environment-specific assessment of the importance of the event, based on elements such as event severity, business function of the affected system, or other locally defined variables.
process (string) - The program that generated this record (such as a process name mentioned in the syslog header).
product (string) - The product that generated the event.
product_version (number) - The version of the product that generated the event.
proto (string) - The OSI layer 3 (network layer) protocol--for example IP, ICMP, IPsec, ARP.
reason (string) - The root cause of the result - "connection refused", "timeout", "crash", etc.
recipient (string) - The person to whom an email message is sent.
record_class (string) - The DNS resource record class - IN (internet - default), HS (Hesiod - historic), or CH (Chaos - historic).
record_type (string) - The DNS resource record type - see the Wikipedia article on DNS record types.
result (string) - The result of the action - succeeded/failed, allowed/denied.
rule_number (string) - The firewall rule-number or ACL number.
sender (string) - The person responsible for sending an email message.
severity (string) - The severity (or priority) of an event as reported by the originating device.
signature (string) - The SID, the Event Identifiers assigned by Windows-based operating systems to event records, the signature identifiers used by other Intrusion Detection Systems, and Cisco's message IDs.
src_country (string) - The country from which the packet was sent.
src_domain (string) - The DNS domain that is being remotely queried.
src_host (string) - The fully qualified host name of the system that transmitted the packet.
src_int (string) - The interface that is listening locally or sending packets remotely.
src_ip (ipv4 address) - The IPv4 address of the packet's source. For Web logs, this is the http client.
src_ipv6 (ipv6 address) - The IPv6 address of the packet's source. For Web logs, this is the http client.
src_lat (number) - The (physical) latitude of the packet's source.
src_long (number) - The (physical) longitude of the packet's source.
src_mac (mac address) - The Media Access Control (MAC) address from which a packet was transmitted.
src_nt_domain (string) - The Windows NT domain containing the machines that generated the event.
src_nt_host (string) - The Windows NT hostname of the system that generated the event.
src_port (port) - The network port from which a packet originated.
src_record (string) - The local DNS resource record being acted upon.
src_translated_ip (ip address) - The translated/NAT'ed IP address from which a packet is being sent.
src_translated_port (number) - The translated/NAT'ed network port from which a packet is being sent.
src_zone (string) - The DNS zone that is being transferred by the master as part of a zone transfer.
session_id (string) - The session identifier. Multiple transactions build a session.
ssid (string) - The 802.11 service set identifier (ssid) assigned to a wireless network.
start_time (timestamp) - The event's specified start time.
subject (string) - The email subject line.
syslog_facility (syslog facility) - The syslog facility, as recorded by UNIX syslog.
syslog_priority (syslog priority) - The criticality of an event.
tcp_flags (enumeration) - The TCP flag specified in the event. One or more of SYN, ACK, FIN, RST, URG, or PSH.

tos (hex) - The hex bit that specifies TCP 'type of service' (see http://en.wikipedia.org/wiki/Type_of_Service).
transaction_id (string) - The transaction identifier.
transport (string) - The transport protocol, such as TCP, UDP.
ttl (number) - The "Time To Live" of a packet or datagram.
url (string) - A Web address (Uniform Record Locator, or URL) included in a record.
user (string) - The login ID affected by the recorded event.
user_group (string) - A user group that is the object of an event.
user_group_id (number) - The numeric identifier assigned to the user group event object.
user_id (number) - System-assigned numeric identifier for the user affected by an event.
user_privilege (enumeration) - The security context associated with the object of an event: one of administrator, user, or guest/anonymous.
user_subject (string) - User that is the subject of an event. The one executing the action.
user_subject_id (number) - ID number of the user that is the subject of an event. The one executing the action.
user_subject_privilege (enumeration) - The security context associated with a recorded event: one of administrator, user, or guest/anonymous.
vendor (string) - The vendor who made the product that generated the event.
vlan_id (number) - The numeric identifier assigned to the virtual local area network specified in the record.
vlan_name (string) - The name assigned to the VLAN in the event.

Standardize your event type tags

The Common Information Model suggests that you use a specific convention when tagging your event types. This convention requires that you set up three categories of tags--object, action, and status--and that you give each event type in your system a single tag from each of these categories. This arrangement enables precise event type classification in human-readable terms.

The object tag denotes what the event is about. What object has been targeted? Is the event talking about a host, a resource, a file, or what? The action tag explains what has been done to the object (create, delete, modify, and so on). And the status tag provides the status of the action. Was it successful? Failed? Or was it simply an attempt? In addition to these three standard tags, you can add other tags as well.

The three tags in discussion here are:

    <objecttag> <actiontag> <statustag>

Some examples of using the standard tags are:
• For a firewall deny event type: host communicate_firewall failure
• For a firewall accept event: host communicate_firewall success
• For a successful database login: database authentication_verify success

Object event type tags

Use one of these object tags in the first position as defined above.

application - An application-level event.
application av - An anti virus event.
application backdoor - An event using an application backdoor.
application database - A database event.
application database data - An event related to database data.
application dosclient - An event involving a DOS client.
application firewall - An event involving an application firewall.
application im - An instant message-related event.
application peertopeer - A peer to peer-related event.
host - A host-level event.
group - A group-level event.
resource - An event involving system resources.
resource cpu - An event involving the CPU.
resource file - An event involving a file.
resource interface - An event involving network interfaces.
resource memory - An event involving memory.
resource registry - An event involving the system registry.
os - An OS-level event.
os process - An event involving an OS-related process.
os service - An event involving an OS service.
user - A user-level event.
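To see how this three-part convention can be expressed in configuration files, here is a hedged sketch for the firewall deny example above. The event type name and search string are hypothetical; see "Configure event types directly in eventtypes.conf" and "Define and manage tags" in this manual for the authoritative syntax.

    # eventtypes.conf
    [firewall_deny]
    search = sourcetype=cisco_pix "Deny"

    # tags.conf
    [eventtype=firewall_deny]
    host = enabled
    communicate_firewall = enabled
    failure = enabled

Every event matching the event type then carries one object, one action, and one status tag, so a search such as tag=communicate_firewall tag=failure returns denied traffic regardless of which device reported it.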

Action event type tags

Use one of these action tags in the second position as defined above.

access - An event that accesses something.
access read - An event that reads something.
access read copy - An event that copies something.
access read copy archive - An event that archives something.
access read decrypt - An event that decrypts something.
access read download - An event that downloads something.
access write - An event that writes something.
authentication - An event involving authentication.
authentication add - An event adding authentication rules.
authentication delete - An event deleting authentication rules.
authentication lock - An event indicating an account lockout.
authentication modify - An event modifying authentication rules.
authentication verify - An event verifying identity.
authorization - An event involving authorization.
authorization add - Adding new privileges.
authorization delete - Deleting privileges.
authorization modify - Changing privileges.
authorization verify - Checking privileges for an operation, e.g. chmod.
check - An event checking something.
check status - An event checking something's status.
create - An event that creates something.
communicate - An event involving communication.
communicate connect - An event involving making a connection.
communicate disconnect - An event involving disconnecting.
communicate firewall - An event passing through a firewall.
delete - An event that deletes something.
execute - An event that runs something.
execute restart - An event that restarts something.
execute start - An event that starts something.
execute stop - An event that stops something.

modify - An event that changes something.
modify attribute - An event that changes an attribute.
modify attribute rename - An event that renames something.
modify configuration - An event that changes a configuration.
modify content - A content-related event.
modify content append - An event that appends new content onto existing content.
modify content clear - An event that clears out content.
modify content insert - An event that inserts content into existing content.
modify content merge - An event that merges content.
substitute - An event that replaces something.

Status event type tags

Use one of these status tags in the third position as defined above.

attempt - An event marking an attempt at something.
deferred - A deferred event.
failure - A failed event.
inprogress - An event marking something in progress.
report - A report of a status.
success - A successful event.

Optional tags

For those who want to use standard additional tags when they apply, some suggestions are below.

attack - An event marking an attack.
attack exploit - An event marking the use of an exploit.
attack bruteforce - An event marking a brute force attack.
attack dos - An event marking a denial of service attack.
attack escalation - An event indicating a privilege escalation attack.
infoleak - An event indicating an information leak.
malware - An event marking malware action.
malware dosclient - An event marking malware utilizing a DOS client.
malware spyware - An event marking spyware.
malware trojan - An event marking a trojan.
malware virus - An event marking a virus.

You can use standardized tags to describe specific hosts and what they do. • What data the host contains. • What OS the host is running. There are a variety of approaches to host tagging. all of which can be used where appropriate. Tag db Explanation This host is a database. • What cluster/round robin the host belongs to. You can also develop lists of host tags that are appropriate for specific apps.malware worm recon suspicious Standardize your host tags An event marking a worm. This host is a DNS server. web This host is a Web server. An event indicating suspicious activity. General host tags These host tags are useful across the board. This host is a firewall. changes to host names are not applied to data that has already been indexed. It's far easier to use tags to group together events from particular hosts. As you may know. Some of these methods include: • What service(s) the host is running. highly_critical This host is highly critical for business purposes. This host is an email server. This host contains financial information. Because hosts are identified before event data is indexed. dmz dns email finance firewall This host is in the DMZ. it can be problematic to rename hosts directly. development This host is a development box. • The department the host belongs to. 24 . An event marking recon probes.

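As a rough sketch of how the tagging scheme above might be wired up through configuration files, here is a hypothetical firewall deny event type and a hypothetical web host. The event type name, search string, sourcetype, and host name are all assumptions made for illustration; you can also apply the same tags through the Event types and Tags pages in Manager rather than editing the files directly.

# eventtypes.conf -- a hypothetical firewall deny event type
[firewall_deny]
search = sourcetype=cisco_pix action=denied

# tags.conf -- apply the standard object/action/status tags to that event type
[eventtype=firewall_deny]
host = enabled
communicate_firewall = enabled
failure = enabled

# tags.conf -- tag a hypothetical host with general host tags
[host=webserver01]
web = enabled
dmz = enabled

With stanzas like these in place, a search such as tag=communicate_firewall tag=failure would return the firewall deny events regardless of which sourcetype they came from.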
Data interpretation: Fields and field extractions

About fields

Fields are searchable name/value pairings in event data. All fields have names and can be searched with those names. ("Name/value pairings" are sometimes referred to as "key/value pairings.")

For example, look at the following search:

host=foo

In this search, host=foo is a way of indicating that you are searching for events with host fields that have values of foo. host is a default field that Splunk automatically extracts for every event at index time. When you run this search, Splunk won't seek out events with different host field values. It also won't look for events containing other fields that share foo as a value. This means that this search gives you a more focused set of search results than you might get if you just put foo in the search bar.

As Splunk processes event data, it extracts and defines fields from that data, first at index time, and again at search time. For an explanation of "search time" and "index time" see "Index time versus search time" in the Admin manual.

At index time Splunk extracts a small set of default fields for each event, including host, source, and sourcetype. Default fields are common to all events. Splunk can also extract custom indexed fields at index time; these are fields that you have configured for index-time extraction.

At search time Splunk automatically extracts certain fields. It:

• automatically identifies and extracts the first 50 fields that it finds in the event data that match obvious name/value pairs, such as user_id=jdoe or client_ip=192.168.1.1, which it extracts as examples of user_id and client_ip fields. (This 50 field limit is a default that can be modified by editing the [kv] stanza in limits.conf; see the sketch below.)
• extracts any field explicitly mentioned in the search that it might otherwise have found through automatic extraction (but isn't among the first 50 fields identified).
• performs custom search field extractions that you have defined, either through the Interactive Field Extractor, the Extracted fields page in Manager, configuration file edits, or search commands such as rex.

These fields show up in the Field Picker after you run a search.
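If you want automatic extraction to discover more than 50 fields, you can raise that cap. The following is only a sketch: it assumes the controlling attribute in this version's [kv] stanza is named limit, so confirm the attribute name against the limits.conf reference for your release before editing.

# limits.conf -- raise the automatic field discovery cap (attribute name assumed)
[kv]
limit = 100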

An example of automatic field extraction

This is an example of how Splunk automatically extracts fields without user help (as opposed to custom field extractions, which follow event-extraction rules that you define): Say you search on sourcetype. If your search is sourcetype=veeblefetzer for the past 24 hours, Splunk returns every event with a sourcetype of veeblefetzer in that time range. From this set of events, Splunk automatically extracts the first 50 fields that it can identify on its own. And it performs extractions of custom fields, based on configuration files. All of these fields will appear in the Field Picker when the search is complete.

Now, if a name/value combination like userlogin=fail appears for the first time 25,000 events into the search, it likely won't be among the first 50 fields that Splunk finds on its own, and userlogin isn't among the set of custom fields that you've preconfigured. However, if you change your search to

sourcetype=veeblefetzer userlogin=*

then Splunk will be smart enough to find and return all events including both the userlogin field and a sourcetype value of veeblefetzer, and it will be available in the Fields Picker along with the other fields that Splunk has extracted for this search.

Add and maintain custom search fields

To fully utilize the power of Splunk IT search, you need to know how to create and maintain custom search field extractions. As you use Splunk, you will encounter situations that require the creation of new fields that will be additions to the set of fields that Splunk automatically extracts for you at index time and search time. Custom fields enable you to capture and track information that is important to your needs, but which isn't being discovered and extracted by Splunk automatically.

As a knowledge manager, you'll oversee the set of custom search field extractions created by users of your Splunk implementation, and you may define specialized groups of custom search fields yourself.

You'll learn how to:

• create and administrate search-time field extractions through Splunk Manager.
• design and manage search-time field transforms through Splunk Manager.
• use the props.conf and transforms.conf configuration files to add and maintain search-time extractions.
• configure Splunk to parse multivalue fields.

This section of the Knowledge Manager manual discusses the various methods of field creation and maintenance (see the "Overview of search-time field extraction" topic) and provides examples showing how this functionality can be used.

Overview of search-time field extraction

This topic provides a brief overview of Splunk Web field extraction methods.

As a knowledge manager, you'll be managing field extractions for the rest of your team. In many cases you'll be defining fields that Splunk has not identified on its own, in an effort to make your event data more useful for searches, reports, and dashboards. However, you may also want to define field extractions as part of an event data normalization strategy, where you redefine existing fields and create new ones in an effort to reduce redundancies and increase the overall usability of the fields available to other Splunk users on their team. (For more information, see "Understand and use the Common Information Model" in this manual.)

If you find that you need to create additional search-time field extractions, you have a number of ways to go about it. Splunk Web provides a variety of search-time field extraction methods. The search language also enables you to create temporary field extractions. And you can always add and maintain field extractions by way of configuration file edits. We'll just summarize the methods in this subtopic and provide links to topics with in-depth discussions and examples.

Use interactive field extraction to create new fields

You can create custom fields dynamically using the interactive field extractor (IFX) in Splunk Web. IFX enables you to quickly turn any search into a field extracting regular expression. To access IFX, run a search and then select "Extract fields" from the dropdown that appears beneath timestamps in the field results. You use IFX on the local indexer.

Note: IFX is especially useful if you are not familiar with regular expression syntax and usage, because it will generate field extraction regexes for you (and enable you to test them). However, IFX enables you to extract only one field at a time (although you can edit the regex it generates later to extract multiple fields).

For more information about using IFX, see "Extract fields interactively in Splunk Web" in the User manual. For a detailed discussion of search-time field addition using methods based in Splunk Web, see "Extract and add new fields" in the User manual.

Use Splunk Manager to add and maintain field extractions

You can use the Field extractions and Field transformations pages in Splunk Manager to review, edit, update, and create field extractions.

The Field extractions page

The Field extractions page shows you the search-time field extractions in props.conf. You can use it to create and manage both basic "inline" search-time extractions (extractions that are defined entirely within props.conf) and more advanced search-time extractions that reference a field transformation component in transforms.conf. The Field extractions page allows you to review, update, and create extracted fields. You can edit existing extractions and create new ones. You can define field transformations in Manager through the Field transformations page (see below).
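As a rough illustration of the difference between the two kinds of extraction, here is how an inline extraction and a transform-referencing extraction might look in props.conf. The source type, field, and transform names are hypothetical:

# props.conf -- both styles can coexist under the same source type stanza
[purchase_log]
# Inline extraction: the regex lives entirely in props.conf
EXTRACT-order_id = order_id=(?<order_id>\d+)
# Advanced extraction: references a field transform stanza defined in transforms.conf
REPORT-order_fields = order-extractions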

In Splunk Web, you navigate to the Field extractions page by selecting Manager > Fields > Field extractions. For more information, see "Use the Field extractions page in Manager".

The Field transformations page

You can also use Manager to create more complex search-time field extractions that involve a transform component in transforms.conf. To do this, you couple an extraction from the Field extractions page with a field transform on the Field transformations page. Field transforms work with extractions set up in props.conf and transforms.conf to enable advanced field extractions.

With transforms, you can define field extractions that:

• Reuse the same field-extracting regular expression across multiple sources, source types, or hosts (in other words, configure one field transform for multiple field extractions).
• Apply more than one field-extracting regular expression to the same source, source type, or host (in other words, apply multiple field transforms to the same field extraction).
• Use a regular expression to extract fields from the values of another field (also referred to as a "source key").

The Field transformations page displays search-time field transforms that have been defined in transforms.conf. In Splunk Web, you navigate to the Field transformations page by selecting Manager > Fields > Field transformations. For more information, see "Use the Field transformations page in Manager".

Configure field extractions in props.conf and transforms.conf

You can also create and maintain field extractions by making edits directly to props.conf and transforms.conf. It's important to note that the configuration files do enable you to do more things with search-time field extractions than Manager currently does. For example, with the config files you can set up:

• Delimiter-based field extractions (see the sketch below).
• Extractions for multivalue fields.
• Extractions of fields with names that begin with numbers or underscores (normally not allowed unless key cleaning is disabled).
• Formatting of extracted fields.

If this sounds like your kind of thing--and it may be, especially if you are an old-timey Splunk user, or just prefer working at the configuration file level of things--you can find all the details in "Create and maintain search-time extractions through configuration files" in this manual.
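For instance, a delimiter-based extraction needs no regular expression at all. A minimal sketch, assuming a hypothetical comma-separated source type and made-up field names:

# transforms.conf -- split each event on commas and name the resulting values
[csv_fields]
DELIMS = ","
FIELDS = "user", "action", "status"

# props.conf -- tie the transform to the source type
[myapp_csv]
REPORT-csv = csv_fields

Both attributes are described in detail in "Create and maintain search-time field extractions through configuration files" later in this manual.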

Field extractions created through the Interactive Field Extractor and the Field extractions page are initially only available to their creators until they are shared with others.conf. 29 . For more information about deleting knowledge objects. it looks for a sales_order. For details about how these commands are used. • The extract (or kv.form file. • kvform extracts field/value pairs from events based on predefined form templates that describe how the values should be extracted. • Create new search-time field extractions. through direct props. These templates are stored in $SPLUNK_HOME/etc/system/form/.. if form=sales_order. You can use extract to test any field extractions that you plan to add manually through conf files.. if your app-level permissions enable you to do so. The Field extractions page enables you to: • Review the overall set of search-time extractions that you have created or which your permissions enable you to see. Splunk extracts fields using field extraction stanzas that have been added to props. along with examples. It creates a new event for each table row and derives field names from the table title.Use search commands to create field extractions Splunk provides a variety of search commands that facilitate the extraction of fields in different ways. Default knowledge objects cannot be deleted. tabular-formatted events. see either the Search Reference or the "Extract and add new fields" topic in the User manual. Field extractions can be added to props. Use the Field extractions page in Manager Use the Field extractions page in Manager Use the Field extractions page in Manager to manage search-time field extractions that have been added to props. or your own custom application directory in $SPLUNK_HOME/etc/apps/. For example./form.conf. and if they are not default extractions that were delivered with the product. such as transactions from webpages. • Use multikv to extract field/value pairs from multiline.conf edits.conf when you use the Interactive Field Extractor. When Splunk encounters an event with error_code=404. • Update permissions for field extractions. • Delete field extractions. see "Curate Splunk knowledge with Manager" in this manual. • xmlkv enables you to extract field/value pairs from xml-formatted event data. If you use extract without specifying any arguments. for all Apps in your instance of Splunk. Here's a list of these commands: • The rex search command performs field extractions using a Perl regular expression with named groups named groups that you include in the search string. and when you create field extractions through the Field extractions page. for "key/value") search command extracts field/value pairs from search results. to see if they extract field/value information as expected. Splunk matches all of the events it processes against that form in an effort to extract values.
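To make the commands listed above a little more concrete, here are a few illustrative searches. The source types, field names, and event formats are hypothetical; see the Search Reference for the full syntax and options of each command.

• Extract a field for the duration of a single search with rex:
  sourcetype=syslog | rex "user=(?<user_id>\w+)" | top user_id
• Re-run automatic field/value extraction over the results of a search (handy for testing extractions you plan to add through the conf files):
  sourcetype=syslog | extract
• Pull field/value pairs out of XML-formatted event data:
  sourcetype=xml_feed | xmlkv
• Break a multiline tabular event into one event per table row:
  sourcetype=vmstat_output | multikv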

This kind of extraction is always associated with a 30 . They are created automatically by field extractions made through IFX and certain search commands. see "Manage field transforms" in this manual. Name column The Name column in the Field extractions page displays the overall name of the field extraction.conf component called a field transform.conf. if it uses transactions. For more information about transforms and the Field transforms page. EXTRACT-<name> field extractions are extractions that are wholly defined in props. where <host> is the host for an event.conf files see "Add fields at search time" in this manual. it helps to understand how field extractions are set up in your props.conf and transforms.conf files.conf and transforms. the Field extractions page enables you to: • Update its regular expression. For more information about index-time field extraction configuration. The format is: <spec> : [EXTRACT-<name> | REPORT-<name>] • <spec> can be: ♦ <sourcetype>.conf (in other words.conf and transforms.conf configuration files manually. But some field extractions include a transforms. ♦ host::<host>. but if you find that you must do so. Note: You cannot manage index-time field extractions via Manager. you have to modify your props.If you have "write" permissions for a particular search-time field extraction. where <source> is the source for an event. if it is an inline transaction. they do not reference a transform in transforms. We don't recommend that you change your set of index-time field extractions. see "Configure index-time field extractions" in the Getting Data In manual. Field extractions can be set up entirely in props. To create/edit that component of the field extraction via Splunk Web. You can also add them by making direct updates to the props. ♦ source::<source>.conf file. the source type of an event.conf or the Field transactions page in Manager. as it appears in props.conf. Navigate to the Field extractions page by selecting Manager > Fields > Field extractions. For more information about field extraction setup directly in the props.conf. in which case they are identified on the Field extractions page as inline field extractions. • Add or delete named extractions that have been defined in transforms. you use the Field transactions page in Manager. Review search-time field extractions in Manager To better understand how the Field extractions page in Manager displays your field extraction.

These may appear in props. • In the case of Uses transform extraction types. You can define field transforms directly in transforms. Type column There are two field extraction types: inline and transforms.conf as: [access_combined] REPORT-access = access-extractions. access-extractions and ip-extractions are both names of field transform stanzas in transforms. they do not reference external field transforms.conf. the referenced field transform stanza is indicated in the "Extraction/Transform" stanza. For more information see "Use the Field Transformations page in Manager" in this manual.conf. REPORT-<value> field extractions reference field transform stanzas in transforms. For a primer on regular expression syntax and usage. this regex appears in the Extraction/Transform column. Manager displays different things depending on the field extraction Type. go to the Field transforms page. They are identified as such because they are entirely defined within </code>props. see Regular-Expressions. For example. On the Field extractions page. Extraction Transform column In the Extraction/Transform column.conf or via Manager using the Field transformations page.conf</code>.field-extracting regular expression. • Uses transform extractions always have REPORT-<value> name configurations. You can work with transforms in Manager through the Field transforms page. Manager displays the name of the transforms. • For inline extraction types. • Inline extractions always have EXTRACT-<name> configurations.conf. To work with those field transforms through Manager. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.conf</code>. You can test your regex by using it in a search with the rex search command. On the Field extractions page. ip-extractions In this example. This can be necessary in cases where the field or fields that you want to extract appear in two or more very different event patterns. Manager displays the regular expression that Splunk uses to extract the field. The named group (or groups) within the regex show you what field(s) it extracts. As such they reference field transforms in </code>transforms. source type. A field extraction can reference multiple field transforms if you want to apply more than one field-extracting regex to the same source. the Expression column could display two values for a Uses transform extraction: access-extractions and ip-extractions. This is where their field-extracting regular expressions are located.info.conf. or host. 31 .conf field transform stanza (or stanzas) that the field extraction is linked to through props.

By default it will be the app context you are currently in.conf. All of the fields described below are required. • Valid characters for field names are a-z. see "Create and maintain search-time field extractions through configuration files" in this manual. Select sourcetype. Define the extraction type. If you select Inline enter the regular expression used to extract the field (or fields) in the Extraction/Transform field. Leading underscores are reserved for Splunk's internal variables. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions. or host to which the extraction applies. see Regular-Expressions. If you know how field extractions are set up in props. 32 .conf this is the <name> value for an EXTRACT or REPORT field extraction type. In props. Note: You cannot turn off key cleaning for inline field extractions (field extractions that do not require a field transform component). • All leading underscores and 0-9 characters are removed from extracted field names. You can test your regex by using it in a search with the rex search command. Give the field extraction a Name. either by default or through a custom configuration: • All characters that are not in a-z.conf. 0-9. using underscores for spaces between words. For a primer on regular expression syntax and usage. 3. Splunk applies the following "key cleaning" rules to all extracted fields. Important: The capturing groups in your regex must identify field names that only contain alpha-numeric characters or underscores. A-Z. For more information. If you select Uses transform enter the transform(s) involved in the Extraction/Transform field. 2. you have to manually modify both props. 4. or _ . To disable this behavior for a specific field extraction. source. This maps to the <spec> value in props.conf. The Add New page appears. Define a Destination app context for the field extraction. you should find this to be pretty simple. source. and 0-9 ranges are replaced with an underscore (_). Define the sourcetype.Add new field extractions Click the New button at the top of the Field extractions page to add a new field extraction.conf and transforms. 1. • Field names cannot begin with 0-9 or _ . A-Z. • International characters are not allowed. or host and enter the value.info.

• If the field extraction is an inline extraction. In props. click locate the field extraction and click its name in the Name column. The field should be extracted from events related to the testlog source type.Example . 33 . Update existing field extractions To edit an existing field extraction. you can edit the regular expression it uses to extract fields.conf file. This takes you to a details page for that field extraction. which shows you how to set up field extractions using the props.Add a new error code field This shows how you would define an extraction for a new err_code field. The field can be identified by the occurrence of device_id= followed by a word within brackets and a text string terminating with a colon.conf this extraction would look like: [testlog] EXTRACT-<errors> = device_id=\[w+\](?<err_code>[^:]+) Here's how you would set that up through the Add new field extractions page: Note: You can find a version of this example in "Create and maintain search-time field extractions" topic in this manual. In the Extraction/Transform field what you can do depends on the type of extraction that you are working with.

Note: Uses transform field extractions must include at least one valid transforms. For more information about managing permissions with Manager. This opens the standard permission management page used in manager for knowledge objects. 34 . Field transforms can be created either through direct edits to transforms. all of those other knowledge objects will be negatively impacted by the removal of that extraction from the system. Update field extraction permissions When a field extraction is created through an inline method (such as IFX or a search command) it is initially only available to its creator. For example.) The transforms can then be created or updated via the Field transforms page. which reside in transforms. Click Delete for the field extraction that you want to remove.conf or by addition through the Field transformations page. But "inline" field extractions do not need to have a field transform component. or globally to users of all Apps.• If the field extraction uses one or more transforms. and determine whether it is available to users of one specific App. you need to update its permissions. You won't be able to delete default field extractions (extractions that were delivered with the product and which are stored in the "default" directory of an app). you can delete field extractions if your permissions enable you to do so. Note: Take care when deleting objects that have downstream dependencies. you can specify the transform or transforms involved (put them in a comma-separated list if there is more than one. To make it so that other users can use the field extraction." in this manual. To do this. see "Curate Splunk knowledge with Manager. see "Curate Splunk knowledge with Manager" in this manual. if your field extraction is used in a search that in turn is the basis for an event type that is used by five other saved searches (two of which are the foundation of dashboard panels). locate the field extraction on the Field extractions page and select its Permissions link. Note: Every field transform has at least one field extraction component. Delete field extractions On the Field extractions page in Manager. On this page you can set up role-based permissions for the field extraction.conf field extraction stanza name.conf. For more information about deleting knowledge objects. Use the Field transformations page in Manager Use the Field transformations page in Manager The Field transformations page in Manager enables you to manage the "transform" components of search-time field extractions.

• Delete field transforms.conf and 35 .conf component called a field transform. Default knowledge objects cannot be deleted. you only have to do so once. configure one field transform for multiple field extractions). source types. When to use the Field transformations page While you can define most search-time field extractions entirely within props. You can only update field transform permissions if you own the transform. This component can be defined and managed through the Field transforms page. Field transforms created through the Field transformations page are initially only available to their creators until they are shared with others. apply multiple field transforms to the same field extraction). You cannot manage index-time field extractions via Manager. If you find yourself using the same regex to extract fields for different sources. if you find that you need to update the regex. source type. and have that be a value of a new field. or host (in other words. some advanced search-time field extractions require a transforms. • Use a regular expression to extract fields from the values of another field (also referred to as a "source key"). If you have "write" permissions for a particular field transform. see "Curate Splunk knowledge with Manager" in this manual." below. if your app-level permissions enable you to do so. even though it is used more than one field extraction. Then. For example. • Create new search-time field transforms. or hosts (in other words. and hosts. however--you have to use the props.The Field transformations page enables you to: • Review the overall set of field transforms that you have created or which your permissions enable you to see. you might pull a string out of a url field value. source types. You set up search-time field extractions with a field transform component when you need to: • Reuse the same field-extracting regular expression across multiple sources. Navigate to the Field transformations page by selecting Manager > Fields > Field transforms. and if they are not default field transforms that were delivered with the product. for all Apps in your instance of Splunk. For more information about situations that call for the use of field transforms. For more information about deleting knowledge objects. or if your role's permissions enable you to do so.conf (or the Field extractions page in Manager). • Define or update the field transform format. • Apply more than one field-extracting regular expression to the same source. the Field transformations page enables you to: • Update its regular expression and change the key the regular expression applies to. you may want to set it up as a transform. Note: All index-time field extractions are coupled with one or more field transforms. This is sometimes necessary in cases where the field or fields that you want to extract from a particular source/source type/host appear in two or more very different event patterns. see "When to use the Field transformations page. • Update permissions for field transforms.

key. not just fields within that event.conf files.. see "Configure index-time field extractions" in the Admin manual.. Note: By default.conf./banner_access_log*] REPORT-banner = banner This means the regex is only matched to uri fields in events coming from the .transforms./banner_access_log* like so: [source::. and login. A typical field transform looks like this in transforms..conf. and extracts three fields as named groups: license_type.conf: [banner] REGEX = /js/(?<license_type>[^/]*)/(?<version>[^/]*)/login/(?<login>[^/]*) SOURCE_KEY = uri This transform matches its regex against uri field values.conf). These names are the stanza names in transforms. Review and update search-time field transforms in Manager To better understand how the Field transformation page in Manager displays your field transforms. It's also important to note that you can do more things with search-time field transforms (such as setting up delimeter based field extractions and configuring extractions for multivalued fields) if you configure them directly within transforms..conf and transforms.. in which case their regexes are applied to the entire event. sourcetypes. it helps to understand how search-time field extractions are set up in your props./banner_access_log source. and event format.conf. For more information about index-time field extraction configuration. Here's the details page for the banner transform that we described at the start of this 36 . transforms are matched to a SOURCE_KEY value of _raw. See the section on field transform setup in "Create and maintain search-time field extractions through configuration files" in this manual for more information.conf configuration files. and hosts if necessary. This is something you can't do with inline field extractions (field extractions set up entirely within props. The transform example presented above appears in the list of transforms as banner. that transform is matched to the source . Reviewing and editing transform details The details page for a field transform enables you to view and update its regular expression.. But you can match it to other sources. version. The Name column The Name column of the Field transformations page displays the names of the search-time transforms that your permissions enable you to see. In props. Click on a transform name to see the detail information for that particular transform.

key. you could have an event that contains strings for a field name and its corresponding field value. (This is a required field. When you save this transform this is the name that appears in the Name column on the Field transformations page. 3. (This is a required field. and then you use the FORMAT of $1::$2 to have the first string be the field name. Optionally define a Key for the transform.conf.conf.conf and the Field extractions page. see Regular-Expressions. you can edit the regex. Identify the Destination app for the field transform. This corresponds to the SOURCE_KEY option in transforms. which means the regex is applied to entire events. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.conf. Create a new field transform To create a new field transform: 1. By default it is set to _raw. replace _raw with the name of that field. For example. and event format. You can only use fields that are present when the field transform is executed. This equates to the stanza name for the transform on transforms. Give the field transform a Name. 6. 2. To have the regex be applied to values of a specific field.) 5. You can test your regex by using it in a search with the rex search command. sourcetype. 37 . Regular expression syntax and usage For a primer on regular expression syntax and usage. This corresponds to the FORMAT option in transforms. navigate to the Field transformations page and click the New button. if it is not the app you are currently in. or host. and the second string be the field value. You first design a regex that extracts those strings. Optionally specify the Event format. Keep in mind that these edits can affect multiple field extractions defined in props.) 4. if the transform has been applied to more than one source.info. First.subtopic: If you have the permissions to do so. Enter a Regular expression for the transform.
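Here is a minimal sketch of the kind of transform described above, assuming a hypothetical source type whose events contain pairs in the form name="value". The stanza names and the regular expression are illustrative assumptions, not a shipped configuration:

# transforms.conf -- extract both the field name and the field value
[kvquotes]
REGEX = (\w+)="([^"]+)"
FORMAT = $1::$2

# props.conf -- apply the transform to the hypothetical source type
[myapp_log]
REPORT-kvquotes = kvquotes

The first capturing group becomes the field name and the second becomes its value, so an event containing status="late" would yield a field named status with the value late.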

On the Field Extractions page in Manager. field transforms are always associated with a field extraction. When key cleaning is enabled (it is enabled by default). using a transform that is delivered with Splunk.Important: The capturing groups in your regex must identify field names that contain alpha-numeric characters or an underscore. A-Z. Example . you have to manually modify both props. either by default or through a custom configuration: 1. For more information. 0-9. • Valid characters for field names are a-z.conf. Note: You cannot turn off key cleaning for inline field extractions (field extractions that do not require a field transform component). 2. you can see that the bracket-space field transform is 38 . Here's an example. • Field names cannot begin with 0-9 or _ . The bracket-space field transform has a regular expression that finds field name/value pairs within brackets in event data. All characters that are not in a-z. A-Z. To disable this behavior for a specific field extraction. and 0-9 ranges are replaced with an underscore (_). Splunk removes all leading underscores and 0-9 characters from extracted fields. Leading underscores are reserved for Splunk's internal variables. As we stated earlier in this topic. Splunk applies the following "key cleaning" rules to all extracted fields when they are extracted at search-time.conf and transforms. or _ . It will reapply this regular expression until all of the matching field/value pairs in an event are extracted. see "Create and maintain search-time field extractions through configuration files" in this manual.Extract both field names and their corresponding field values from an event You can use the Event format attribute in conjunction with a properly designed regular expression to set up field transforms that extracts both a field name and its corresponding field value from each matching event. • International characters are not allowed.

especially those who have been using Splunk for some time. For example. For more information about deleting knowledge objects. Click Delete for the field extraction that you want to remove. because those are the configuration files that the Field extractions and Field transforms pages in Manager read from and write to." in this manual. maintain. Note: Take care when deleting knowledge objects that have downstream dependencies. or globally to users of all Apps. you can delete field transforms if your permissions enable you to do so. see "Curate Splunk knowledge with Manager. On this page you can set up role-based permissions for the field transform. This topic shows you how you can: • Set up basic "inline" search-time field extractions through edits to props. To make it so that other users can use the field transform. which can be used to add. you need to update its permissions. and determine whether it is available to users of one specific App. Delete field transforms On the Field transformations page in Manager. see "Curate Splunk knowledge with Manager" in this manual. Update field transform permissions When a field transform is first created. all of those other knowledge objects will be negatively impacted by the removal of that transform from the system. and review libraries of custom field additions for their teams.conf and transforms. Many knowledge managers. For more information about managing permissions with Manager. Create and maintain search-time field extractions through configuration files Create and maintain search-time field extractions through configuration files While you can now set up and manage search-time field extractions via Splunk Manager. it's important to understand how they are handled at the props.conf. locate the field transform on the Field transformations page and select its Permissions link. by default it is only available to its creator. This opens the standard permission management page used in Manager for knowledge objects. To do this. find it easier to manage their custom fields through configuration files.conf level. if the field extracted by your field transform is used in a search that in turn is the basis for an event type that is used by five other saved searches (two of which are the foundation of dashboard panels).associated with the osx-asl:REPORT-asl extraction. 39 .

when you set up field extractions manually through configuration files. If you set up your regular expressions manually.conf. • International characters are not allowed. A-Z.conf in $SPLUNK_HOME/etc/system/local/. either by default or through a custom configuration: 1. Splunk removes all leading underscores and 0-9 characters from extracted fields. Splunk applies the following "key cleaning" rules to all extracted fields when they are extracted at search-time." above. You can find props.conf configuration file. to extract fields from event data.conf edits You can create basic search-time field extractions by editing the props.info.conf stanza.• Design more complex search-time field extractions through a combination of edits to props. A-Z. Create basic search-time field extractions with props. Regular expressions and field name syntax Splunk uses regular expressions. you can design them so that they extract two or more fields from matching events if necessary. You can disable this by setting CLEAN_KEYS=false in the transforms. All characters that are not in a-z. • Field names cannot begin with 0-9 or _ . 0-9. For a primer on regular expression syntax and usage. 2. When key cleaning is enabled (it is enabled by default). see Regular-Expressions. Use proper field name syntax Splunk only accepts field names that contain alpha-numeric characters or an underscore: • Valid characters for field names are a-z. you have to provide the regex yourself--but you can set up regexes that extract two or more fields at once if necessary. but it can only create regular expressions that extract one field. Leading underscores are reserved for Splunk's internal variables. or regexes. or _ . You can test your regex by using it in a search with the rex search command. When you use the interactive field extractor (IFX). See "When Splunk creates field names. or your own custom application 40 . On the other hand. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.conf stanza for the extraction. Important: The capturing groups in your regex must identify field names that contain alpha-numeric characters or an underscore. and 0-9 ranges are replaced with an underscore (_). You can disable key cleaning for a particular field transform by setting CLEAN_KEYS=false in the transforms. Splunk attempts to generate regexes for you.conf and transforms.

source type. If your field value is a portion of a word.conf and link it to the source.conf Follow this format when adding a field extraction stanza to props. ♦ host::<host>. and sourcetypes. 5. Restart Splunk for your changes to take effect. the source type of an event. Start by identifying the sourcetype. or host.) Note: Do not edit files in $SPLUNK_HOME/etc/system/default/.conf: [<spec>] EXTRACT-<class> = <regular expression> • <spec> can be: ♦ <sourcetype>. source type. Write a regular expression to extract the field from the event.directory in $SPLUNK_HOME/etc/apps/. Steps for defining basic custom field extractions with props. the class for source wins out. you must also add an entry to fields. Note: Do not edit files in $SPLUNK_HOME/etc/system/default/. sources. where <source> is the source for an event. 4. or host that you identified in the first step. and more)" in the Admin manual. For more information on configuration files in general. source. 41 .conf. see "About configuration files" in the Admin manual.conf are restricted by a specific source. 5. where <host> is the host for an event. ♦ source::<source>. Splunk takes the configuration from the highest precedence configuration block. Determine a pattern to identify the field in the event. (We recommend using the latter directory if you want to make it easy to transfer your data customizations to other search servers.conf file in $SPLUNK_HOME/etc/system/local/. ♦ If a particular class is specified for a source and a sourcetype. Precedence rules for classes: ♦ For each class. 2. • <class> is the extraction class. see "About default fields (host. Add your regex to props. All extraction configurations in props.conf 1. Edit the props. or host that provide the events from which you would like your field to be extracted. Note: For information hosts. or your own custom application directory in $SPLUNK_HOME/etc/apps/. Add a regex stanza to props. See the example "create a field from a subtoken" below. source type. source. 3.

conf uses EXTRACT-<class>.conf only) search-time field extraction examples Here are a set of examples of search-time custom field extraction. Note: Unlike the procedure for configuring the default set of fields that Splunk extracts at index time. In props. The field should be extracted from events related to the testlog source type.♦ Similarly. changed state to down The stanza in props. Splunk follows precedence rules when it runs search-time field extractions. which are always constructed both in props.conf. add: [testlog] EXTRACT-<errors> = device_id=\[w+\](?<err_code>[^:]+) Extract multiple fields using one regex This is an example of a field extraction that pulls out five separate fields. it overrides that class in ./default/.conf only. Note: For "inline" search-time field extractions. The regex is required to have named capturing groups. this changes. use TRANSFORMS-<value>. if a particular class is specified in . When transforms are involved. You can then use these fields in concert with some event types to help you find port flapping events and report on them.conf and transforms. Inline (props. Here's a sample of the event data that the fields are being extracted from: #%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet9/16. • <regular_expression> = create a regex that recognizes your custom field value. It runs inline field extractions (EXTRACT-<class>) first.conf requires no DEST_KEY since nothing is being written to the index during search-time field extraction. And index-time field extractions. The field can be identified by the occurrence of device_id= followed by a word within brackets and a text string terminating with a colon. Fields extracted at search time are not persisted in the index as keys.conf... each group represents a different extracted field. and then runs field extractions that involve transforms (REPORT-<class>). transforms.conf for the extraction looks like this: [syslog] 42 .conf. Add a new error code field This example shows how to create a new "error code" field by configuring a field extraction in props. props. which are defined entirely within props. Search-time field extractions using transforms use REPORT-<value> (see the section on complex field extractions for more info).conf./local/ for a <spec>. set up using props.

add an entry to fields. port. Field transforms are always created in conjunction with field extraction stanzas in props. Use tags to define a couple of event types in eventtypes.conf--they cannot stand alone.conf as explained above.host. some advanced search-time field extractions require an additional component called a field transform. Configure props. your field's value is "123" but it occurs as "foo123" in your event.conf that ties much of the above together to find port flapping and report on the results: [port flapping] search = eventtype=cisco_ios_port_down OR eventtype=cisco_ios_port_up starthoursago=3 | stats c interface.conf: [cisco_ios_port_down] search = "changed state to down" [cisco_ios_port_up] search = "changed state to up" Finally. The following two steps aren't required for field extraction--they show you what you might do with the extracted fields to find port flapping events and then report on them. slot.conf. Then. For example. Create advanced search-time field extractions with field transforms While you can define most search-time field extractions entirely within props. \sstate\sto\s(?<port_status>up|down) Note that five separate fields are extracted as named groups: interface.EXTRACT-<port_flapping> = Interface\s(?<interface>(?<media>[^\d]+)(?<slot>\d+)\/(?<port>\d+))\. This section shows you how to configure field transforms in transforms.conf: [<fieldname>] INDEXED = False INDEXED_VALUE = False • Fill in <fieldname> with the name of your field. create a saved search in savedsearches.port_status | sort -count Create a field from a subtoken If your field value is a smaller part of a token.conf." • Set INDEXED and INDEXED_VALUE to false.conf. and port_status. [url] if you've configured a field named "url. media. 43 . Field transforms contain a field-extracting regular expression and other attributes that govern the way the transform extracts fields. ♦ For example. ♦ This tells Splunk that the value you're searching for is not a token in the index. you must add an entry to field.

• Apply special formatting to the information being extracted. and more)" in the Admin manual.2. source types. source types. For more information. see "Configure index-time field extractions" in the Admin Manual. apply multiple field transforms to the same field extraction). and tab spaces. Define a regular expression that uses this pattern to extract the field from the event. 2. Splunk appends additional field values to the field as it finds them in the event data. or hosts. or host (in other words. source type.1. If you find yourself using the same regex to extract fields for different sources. or hosts (in other words. however. (Note: If your event lists field/value pairs or just field values. 3. For example. you only have to do so once. and hosts. • Apply more than one field-extracting regular expression to the same source. configure one field transform for multiple field extractions). even though it is used more than one field extraction. or host. • Set up delimiter-based field extractions.conf are restricted by a specific source. but only if you set it up as an index-time extraction. • Configure extractions for multivalued fields. by using the FORMAT attribute. see the "Define a field transform" section below for more information about how to do this. you can extract it at index time as an ip address field value in the format 192. This is sometimes necessary in cases where the field or fields that you want to extract from a particular source/source type/host appear in two or more very different event patterns. Determine a pattern to identify the field in the event. line breaks. Start by identifying the source. source type. you may want to set it up as a transform.Your search-time field extractions require a field transform component if you need to: • Reuse the same field-extracting regular expression across multiple sources. you can create a delimiter-based field extraction that 44 . sourcetype. When you do this. Then. if you find that you need to update the regex. if you have a string like 192(x)0(y)2(z)1 in your event data. you can do this with the FORMAT attribute. colons. Steps for defining custom search-time field extractions with field transforms 1. source types. However we DO NOT RECOMMEND that you make extensive changes to your set of indexed fields--do so sparingly if at all. see "About default fields (host. • Extract fields with names that begin with numbers or underscores. NOTE: If you need to concatenate a set of regex extractions into a single field value. bars. source. but you can configure your transform to turn this functionality off if necessary. or host that provide the events from which you would like your field to be extracted. Note: For more information about sources. Delimiter-based extractions come in handy when your event data presents field-value pairs (or just field values) that are separated by delimiters such as commas.0. You can also configure transforms to: • Extract fields from the values of another field (other than _raw) by using the SOURCE_KEY attribute. source type. All extraction configurations in props. Both of these configurations can now be set up directly in the regex. Ordinarily key cleaning removes leading numeric characters and underscores from field names.

(Create additional field extraction stanzas for other hosts. or your own custom application directory in $SPLUNK_HOME/etc/apps/. The transform can also define a source key and/or event value formatting. see the information on the DELIMS attribute. and source types that refer to the same transform if necessary. which means that you don't have to specify FORMAT for simple field extraction cases. Edit the transforms. create a field extraction stanza that is linked to the host. Restart Splunk for your changes to take effect. ♦ Name-capturing groups in the REGEX are extracted directly to fields. source. for more information.conf. Note: Do not edit files in $SPLUNK_HOME/etc/system/default/. the following are equivalent: Using FORMAT: REGEX = ([a-z]+)=([a-z]+) 45 .conf.conf file in $SPLUNK_HOME/etc/system/local/. ♦ If the REGEX extracts both the field name and its corresponding value. define a field transform Follow this format when defining a search-time field transform in transforms. It is required for all search-time field transforms unless you are setting up a delimiter-based transaction. sources. you can use the following special capturing groups to skip specifying the mapping in FORMAT: <_KEY_><string>. In props. in which case you use DELIMS instead. or source type that you identified in step 1. 5. Note: Do not edit files in $SPLUNK_HOME/etc/system/default/. below.conf: [<unique_stanza_name>] REGEX = <regular expression> SOURCE_KEY = <string> FORMAT = <string> DELIMS = <quoted string list> FIELDS = <quoted string list> MV_ADD = <bool> CLEAN_KEYS = <bool> • The <unique_stanza_name> is required for all search-time transforms. 6.conf that utilizes this regex (or delimiter configuration).) 4.) Edit the props. <_VAL_><string>. • For example.conf file in $SPLUNK_HOME/etc/system/local/. • REGEX is a regular expression that operates on your data to extract fields. or your own custom application directory in $SPLUNK_HOME/etc/apps/. Create a field transform in transforms.won't require a regex. Add a reference to the transform you defined in transforms. First.

The first set of quoted delimiters separates the field/value pairs. tab spaces. ♦ For search-time transforms. ♦ For search-time transforms. ♦ If the event contains full delimiter-separated field/value pairs. to separate the values. Use it in place of REGEX when dealing with delimiter-based field extractions. and so on. Use it to specify the format of the field/value pair(s) that you are extracting. you enter two sets of quoted delimiters for DELIMS. line breaks. SOURCE_KEY is set to _raw. Then you use the FIELDS attribute to apply field names to the extracted values (see FIELDS below). ♦ If the events only contain delimiter-separated values (no field names). Alternately. Use it to identify a field whose values the transform regex should be applied to. including any field names or values you want to add. ♦ By default. You don't need to specify the FORMAT if you have a simple REGEX with name-capturing groups. FORMAT = $1::$2 $4::$3 • DELIMS is optional. FORMAT = first::$1 second::$2 third::other-value 2. ♦ This example of DELIMS usage applies to an event where field/value pairs are separated by '|' symbols. ♦ Defaults to an empty string. • FORMAT is optional.FORMAT = $1::$2 Not using FORMAT: REGEX = (?<_KEY_1>[a-z]+)=(?<_VAL_1>[a-z]+) • SOURCE_KEY is optional. The second set of quoted delimiters separates the field name from its corresponding value. ♦ Splunk consumes consecutive delimiter characters unless you specify a list of field names. Splunk reads even tokens as field names and odd tokens as field values. spaces. the key can be any field that is present at the time that the field transform is executed. "=" 46 . ♦ Delimiters must be quoted with " " (use \ to escape). you use one set of quoted delimiters. this is the pattern for the FORMAT field: FORMAT = <field-name>::<field-value>( <field-name>::<field-value>)* where: field-name = <string>|$<extracting-group-number> field-value = <string>|$<extracting-group-number> Examples of search-time FORMAT usage: 1. and the field names are separated from their corresponding values by '=' symbols: [pipe_eq] DELIMS = "|". colons. where field values--or field/value pairs--are separated by delimiters such as commas. which means it is applied to the entire event.

♦ If a particular class is specified for a source and a sourcetype. ♦ Here's an example of a delimiter-based extraction where three field values appear in an event. where <host> is the host for an event. They are separated by a comma and then a space. ♦ Similarly. Use FIELDS to provide field names for the extracted field values. 47 . the class for source wins out. field3 • MV_ADD is optional. see the example later in this topic. CLEAN_KEYS is always set to true for transforms. (For more information. Splunk takes the configuration from the highest precedence configuration block.conf. ♦ Add CLEAN_KEYS = false to your transform if you need to extract field names (keys) with leading underscores and/or 0-9 characters. ♦ source::<source>./default/. for more information). " FIELDS = field1.• FIELDS is used in conjunction with DELIMS when you are performing delimiter-based field extraction. ♦ host::<host>. [commalist] DELIMS = ". use \). but you only have field values to extract. Splunk makes any field that is used more than once in an event (but with different values) a multivalued field and appends each value it finds for that field. Splunk keeps the first value found for a field in an event and discards every subsequent value found for that same field in that same event. it overrides that class in . It controls whether or not the system strips leading underscores and 0-9 characters from the field names it extracts (see the subtopic "Use proper field name syntax. if a particular class is specified in . • CLEAN_KEYS is optional. separated by commas. in list format according to the order in which the values are extracted. Precedence rules for classes: ♦ For each class. You can associate multiple field transform stanzas to a single field extraction by listing them after the initial <unique_transform_stanza_name>. field2. <unique_transform_stanza_name> is the name of the field transform stanza that you are associating with the field extraction./local/ for a <spec>. ♦ Note: If field names contain spaces or commas they must be quoted with " " (to escape. Second. • <class> is the extraction class. When MV_ADD = true. the source type of an event.) [<spec>] REPORT-<value> = <unique_transform_stanza_name> • <spec> can be: ♦ <sourcetype>. Use it when you have events that repeat the same field but with different values. ♦ When set to false. where <source> is the source for an event. ♦ By default.." above.. configure a field extraction and associate it with the field transform Follow this format when you're associating a search-time field transform with a field extraction stanza in props.

control it by rearranging the list. Examples of custom search-time field extractions using field transforms These examples present custom field extraction use cases that require you to configure one or more field transform stanzas in transforms.• <unique_transform_stanza_name> is the name of your field transform stanza from transforms.1] [headerName=Host] [headerValue=www. The logs often come in this format: [fieldName1=fieldValue1] [fieldName2=fieldValue2] However.1. While the fields vary from event to event. in which case the format looks like: [headerName=fieldName1] [headerValue=fieldValue1]. • Transforms are applied in the specified order. Note: Index-time field transactions use TRANSFORM-<value> = <unique_transform_stanza_name>. In these secondary cases you still want to pull out the field names and values so that the search results are fieldName1=fieldValue1 fieldName2=fieldValue2 and so on. the pairs always appear in one of two formats. • <value> is any value you want to give to your stanza to identify its name-space. [headerName=User-Agent] 48 . Configuring a field extraction that utilizes multiple field transforms This example of search-time field transform setup demonstrates how: • you can create transforms that pull varying field name/value pairs from events. see "Configure index-time field extractions" in the Admin Manual. Let's say you have logs that contain multiple field name/field value pairs. logging multiple name/value pairs as a list. here's an example of an HTTP request event that combines both of the above formats. [method=GET] [IP=10.example.conf field extraction stanza. To make things more clear. • you can create a field extraction that references two or more field transforms. and that each fieldName is matched with a corresponding fieldValue.conf and then reference them in a props.conf. • If you need to change the order.1.com]. For more information. at times they are more complicated. [headerName=fieldName2] [headerValue=fieldValue2] Note that the list items are separated by commas.

You want to develop a single field extraction that would pull the following field/value pairs from that event:

method=GET
IP=10.1.1.1
Host=www.example.com
User-Agent=Mozilla
Connection=close
byteCount=255

Solution

To efficiently and reliably pull out both formats of field/value pairs, you'll want to design two different regexes that are optimized for each format. One regex will identify events with the first format and pull out all of the matching field/value pairs. The other regex will identify events with the other format and pull out those field/value pairs. You then create two unique transforms in transforms.conf--one for each regex--and then unite them in the corresponding field extraction stanza in props.conf.

The first transform you add to transforms.conf catches the fairly conventional [fieldName1=fieldValue1] [fieldName2=fieldValue2] case:

[myplaintransform]
REGEX=\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]
FORMAT=$1::$2

The second transform (also added to transforms.conf) catches the slightly more complex [headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2] case:

[mytransform]
REGEX= \[headerName\=(\w+)\].\s\[headerValue=([^\]]+)\]
FORMAT= $1::$2

Both transforms use the <fieldName>::<fieldValue> FORMAT to match each field name in the event with its corresponding value. This setting in FORMAT enables Splunk to keep matching the regex against a matching event until every matching field/value combination is extracted.

Finally, this field extraction stanza, which you create in props.conf, references both of the field transforms:

[mysourcetype]
KV_MODE=none
REPORT-a = mytransform, myplaintransform

Note that, besides using multiple field transforms, the field extraction stanza also sets KV_MODE=none. This disables automatic field/value extraction for the identified source type (while letting your manually defined extractions continue). It ensures that these new regexes aren't overridden by automatic field extraction, and it also helps increase your search performance. (See the following subsection for more on disabling key/value extraction.)

Configuring delimiter-based field extraction

You can use the DELIMS attribute in field transforms to configure field extractions for events where field values or field/value pairs are separated by delimiters such as commas, colons, tab spaces, and more.

For example, say you have a recurring multiline event where a different field/value pair sits on a separate line, and each pair is separated by a colon followed by a tab space. Here's a sample event:

ComponentId: Application Server
ProcessId: 5316
ThreadId: 00000000
ThreadName: P=901265:O=0:CT
SourceId: com.ibm.ws.runtime.WsServerImpl
ClassName:
MethodName:
Manufacturer: IBM
Product: WebSphere
Version: Platform 7.0.0.7 [BASE 7.0.0.7 cf070942.55]
ServerName: sfeserv36Node01Cell\sfeserv36Node01\server1
TimeStamp: 2010-04-27 09:15:57.671000000
UnitOfWork:
Severity: 3
Category: AUDIT
PrimaryMessage: WSVR0001I: Server server1 open for e-business
ExtendedMessage:

Now you could set up a bulky, wordy search-time field extraction stanza in props.conf that handles all of these fields:

[activityLog]
LINE_BREAKER = [-]{8,}([\r\n]+)
SHOULD_LINEMERGE = false
EXTRACT-ComponentId = ComponentId:\t(?<ComponentId>.*)
EXTRACT-ProcessId = ProcessId:\t(?<ProcessId>.*)
EXTRACT-ThreadId = ThreadId:\t(?<ThreadId>.*)
EXTRACT-ThreadName = ThreadName:\t(?<ThreadName>.*)
EXTRACT-SourceId = SourceId:\t(?<SourceId>.*)
EXTRACT-ClassName = ClassName:\t(?<ClassName>.*)
EXTRACT-MethodName = MethodName:\t(?<MethodName>.*)
EXTRACT-Manufacturer = Manufacturer:\t(?<Manufacturer>.*)
EXTRACT-Product = Product:\t(?<Product>.*)
EXTRACT-Version = Version:\t(?<Version>.*)
EXTRACT-ServerName = ServerName:\t(?<ServerName>.*)
EXTRACT-TimeStamp = TimeStamp:\t(?<TimeStamp>.*)
EXTRACT-UnitOfWork = UnitOfWork:\t(?<UnitOfWork>.*)
EXTRACT-Severity = Severity:\t(?<Severity>.*)
EXTRACT-Category = Category:\t(?<Category>.*)
EXTRACT-PrimaryMessage = PrimaryMessage:\t(?<PrimaryMessage>.*)
EXTRACT-ExtendedMessage = ExtendedMessage:\t(?<ExtendedMessage>.*)

But that solution is pretty over-the-top. Is there a more elegant way to handle it that would remove the need for all these EXTRACT lines? Yes!

Configure the following stanza in transforms.conf:

[activity_report]
DELIMS = "\n", ":\t"

This states that the field/value pairs in the event are on separate lines ("\n"), and then specifies that the field name and field value on each line are separated by a colon and tab space (":\t").

To complete this configuration, in props.conf, rewrite the wordy stanza mentioned above as:

[activityLog]
LINE_BREAKER = [-]{8,}([\r\n]+)
SHOULD_LINEMERGE = false
REPORT-activity = activity_report

These two brief configurations will extract the same set of fields as before, but they leave less room for error and are more flexible.

Handling events with multivalued fields

You can use the MV_ADD attribute to extract fields in situations where the same field is used more than once in an event, but has a different value each time. Ordinarily, Splunk only extracts the first occurrence of a field in an event; every subsequent occurrence is discarded. But when MV_ADD is set to true in transforms.conf, Splunk treats the field like a multivalue field and extracts each unique field/value pair in the event.

Say you have a set of events that look like this:

event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3
event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 type=type4 value=va

See how the type and value fields are repeated several times in each event? What you'd like to do is search type=type3 and have both of these events be returned. Or you'd like to run a count(type) report on these two events that returns 5. So, what you want to do is create a custom multivalue extraction of the type field for these events.

Here's how you would set up your transforms.conf and props.conf files to enable it. First, in transforms.conf:

[mv-type]
REGEX = type=(?<type>\S+)
MV_ADD = true

Then, in props.conf for your sourcetype or source, set:

REPORT-type = mv-type
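With that extraction in place, searches along the following lines should behave as described above. This is only a sketch; my_sourcetype stands in for whatever source type or source you applied the REPORT-type setting to:

sourcetype=my_sourcetype type=type3

sourcetype=my_sourcetype | stats count(type)

The first search returns both sample events, because type is now multivalued, and the second counts all five type values across the two events.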

Disabling automatic search-time extraction for specific sources, source types, or hosts

You can disable automatic search-time field extraction for specific sources, source types, or hosts through edits in props.conf. Add KV_MODE = none for the appropriate [<spec>] in props.conf, in $SPLUNK_HOME/etc/system/local/ or your own custom application directory in $SPLUNK_HOME/etc/apps/.

Note: Custom field extractions set up manually via the configuration files or Manager will still be processed for the affected source, source type, or host when KV_MODE = none.

[<spec>]
KV_MODE = none

<spec> can be:

• <sourcetype> - an event source type.
• host::<host>, where <host> is the host for an event.
• source::<source>, where <source> is the source for an event.

For more information on configuration files in general, see "About configuration files" in the Admin manual.

Configure multivalue fields

Configure multivalue fields

Multivalue fields are fields that can appear multiple times in an event and have a different value for each appearance. One of the more common examples of multivalue fields is that of email address fields, which typically appear two to three times in a single sendmail event--once for the sender, another time for the list of recipients, and possibly a third time for the list of Cc addresses. If all of these fields are labeled identically (as "AddressList," for example), they lose meaning that they might otherwise have if they're identified separately as "From", "To", and "Cc".

Splunk parses multivalue fields at search time, and enables you to process the values in the search pipeline. Search commands that work with multivalue fields include makemv, mvcombine, mvexpand, and nomv. For more information on these and other commands, see the topic on multivalue fields in the User manual and the Search Reference manual.

Use the TOKENIZER key to configure multivalue fields in fields.conf. TOKENIZER uses a regular expression to tell Splunk how to recognize and extract multiple field values for a recurring field in an event. Edit fields.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/, if one exists.

You can test regexes by using them in searches with the rex search command. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions. For a primer on regular expression syntax and usage, see Regular-Expressions.info.
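For instance, before committing a tokenizing regex to fields.conf, you might try it out at search time with rex. In this sketch the field name email_test is just an illustration; rex runs against _raw by default and pulls one email-style token out of each event:

... | rex "(?<email_test>\w[\w\.\-]*@[\w\.\-]*\w)"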

Configure a multivalue field via fields.conf

Define a multivalue field by adding a stanza for it in fields.conf. Then add a line with the TOKENIZER key and a corresponding regular expression that shows how the field can have multiple values.

[<field name 1>]
TOKENIZER = <regular expression>

[<field name 2>]
TOKENIZER = <regular expression>

• <regular expression> should indicate how the field in question can take on multiple values.
• TOKENIZER defaults to empty. When TOKENIZER is empty, the field can only take on a single value.
• Otherwise the first group is taken from each match to form the set of field values.
• The TOKENIZER key is used by the where, timeline, and stats commands. It also provides the summary and XML outputs of the asynchronous search API.

Note: If you have other attributes to set for a multivalue field, set them in the same stanza underneath the TOKENIZER line.

Note: Tokenization of indexed fields (fields extracted at index time) is not supported. If you have set INDEXED=true for a field, you cannot also use the TOKENIZER key for that field. You can use a search-time extraction defined in props.conf and transforms.conf to break an indexed field into multiple values.

See the fields.conf topic in the Admin manual for more information.

Example

The following examples from $SPLUNK_HOME/etc/system/README/fields.conf.example break the email fields To, From, and Cc into multiple values:

[To]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)

[From]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)

[Cc]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)
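Once a field like To is tokenized this way, the multivalue search commands mentioned earlier can operate on each address separately. A sketch, assuming your mail events carry sourcetype=sendmail:

sourcetype=sendmail | mvexpand To | stats count by To

mvexpand splits each event into one result per To value, so the count reflects individual recipient addresses rather than whole address lists.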

conf.-#$%&+./:=?@\\'|*\n\r\"(){}<>[]^!" • The punct field is not available for events in the _audit index because those events are signed using PKI at the time they are generated. The punct field helps you narrow down searches based on the structure of the event. Use the punct field to search on similar events Because the format of an event is often unique to an event type. keep in mind: • Quotes and backslashes are escaped. they're checked against known event types. Event types let you classify events that have common characteristics. • Spaces are replaced with an underscore (_). 54 . Event type classification There are several ways to create your own event types. This field is useful for finding similar events quickly. The punct field stores the first 30 punctuation characters in the first line of the event. Event types let you sift through huge amounts of data. An event type is a user-defined field that simplifies search by letting you categorize events. and create alerts and reports. • Dashes that follow alphanumeric characters are ignored. Splunk indexes the punctuation characters of events as a field called punct. When you use punct. When saving a search as an event type. Events versus event types An event is a single record of activity within a log file. Tag or save event types after indexing your data.Data classification: Event types and transactions About event types About event types Event types are a categorization system to help you make sense of your data. or you can save any search as an event type.. find similar patterns. When your search results come back. An event typically includes a timestamp and provides information about what occurred on the system being monitored or logged. An event type is applied to an event at search time if that event matches the event type definition in eventtypes. Define event types via Splunk Web or through configuration files. • Tabs are replaced with a "t". • Interesting punctuation characters are: ". you may want to use the punct field to craft your searches.

For more about saving searches as event types..1" 200 2953 Produces this punctuation: .[01/Jul/2005:12:05:27 -0700] "GET /trade/app?action=logout HTTP/1.34. For more information.223 . Create new event types The simplest way to create a new event type is through Splunk Web.\"__ Event type discovery Pipe any search to the typelearner command and create event types directly from Splunk Web. although you can still specify terms to ignore when learning new event types in Splunk Web. For more information about event type tagging. see "Define and maintain event types in Splunk Web" in this manual. Create new event types by modifying eventtypes._-_-_[:::_-]_\"_?=_/.conf. The file eventdiscoverer.. see the "Tag event types" topic in this manual 55 . Punct examples This event: ####<Jun 3.conf is mostly deprecated. see the "Classify and group similar events" topic in the User manual. There can be multiple tags per event. Save an event type much in the same way you save a search. see "Classify and group similar events" topic in the User manual.26. 2005 5:38:22 PM MDT> <Notice> <WebLogicServer> <bea03> <asiAdminServer> <WrapperStartStopAppMain> <>WLS Kernel<> <> <BEA-000360> <Server started in RUNNING mode> Produces this punctuation: ####<_.__::__>_<>_<>_<>_<>_<>_ This event: 172.For an introduction to the punct field and other methods of event classification.. Event type tags Tag event types to organize your data into categories.

Configuration files for event types Event types are stored in eventtypes. • Click Save..conf in $SPLUNK_HOME/etc/users/<your-username>/<app>/local/..conf. A single event can match multiple event types.. • Select the Actions. Splunk moves the event type to $SPLUNK_HOME/etc/apps/<App>/local/.. Define and maintain event types in Splunk Web Define and maintain event types in Splunk Web Any search that does not involve a pipe operator or a subsearch can be saved as an event type. dropdown and click Save as event type. • Build event types: The Build Event Type utility enables you to dynamically create event types based on events returned by searches. where <app> is the app you were in when you created the event type. You can now use your event type in searches. Any event types you create through Splunk Web are automatically added to eventtypes. Terms for event type discovery are set in eventdiscoverer. The Save Event Type dialog box pops up. If you named your event type foo. • Optionally add one or more tags for the event type. If you change the permissions on the event type to make it available to all users (either in the app. • Name the event type. pre-populated with your search terms. you'd use it in a search like this: eventtype=foo Automatically find and build event types Unsure whether you have any interesting event types in your IT data? Splunk provides utilities that dynamically and intelligently locate and create useful event types: • Find event types: The findtypes search command analyzes a given set of events and identifies common patterns that could be turned into potentially useful event types. Save a search as an event type To save a search as an event type: • Enter the search and run it. 56 . comma-separated. or globally to all apps).conf.

. findtypes returns the top 10 potential event types found in the sample. 57 . add this to the end of your search: . in terms of the number of events that match each kind of event discovered. review the results it returns to determine whether or not it is capturing the specific information you want. You can increase this number by adding a max argument: findtypes max=30 Splunk also indicates whether or not the event groupings discovered with findtypes have already been associated with other event types. Note: The findtypes command analyzes 5000 events at most to return these results.| head 1000 | findtypes Test potential searches before saving them as event types When you identify a potentially useful event grouping. By default. Click Test for the event grouping in which you are interested in to see its associated search run in a separate window. This helps you easily identify kinds of events that are subsets of larger event groupings. After the search runs. They are: • hierarchically ordered in terms of "coverage" (frequency). You can lower this number using the head command for a more efficient search: .. • coupled with searches that can be used as the basis for event types that will help you locate similar events.Find event types To use the event type finder... test the search associated with it to see if it returns the results you want.| findtypes Searches that use the findtypes command return a breakdown of the most common groups of events found in the search results.

Enter a name for the event type. As you select other field/value pairs in the Event type features sidebar. Adding an event type in Manager To add an event type through Manager. Build event types If you find an event in your search results that you'd like to base an event type on. The list of sample events updates as well. separated by commas. This is the search that the event type you're building will be based upon. If you want to edit the event type search directly. save it as an event type by clicking Save for the event grouping with which it is associated. and then create an event type based on that search. the Generated event type updates to include those selections. In the Event type features sidebar. This brings up the Edit Event Type dialog. You can also add new event types through the Event Types page. and optionally identify one or more tags that should be associated with it. The Save Event Type dialog appears. You can use this utility to design a search that returns a select set of events. Save a tested search as an event type If you test a search and it looks like it's returning the correct set of events. to reflect the kinds of events that the newly modified event type search would return. Add and maintain event types in Manager The Event Types page in Manager enables you to view and maintain details of the event types that you have created or which you have permission to edit.Save a tested search as an event type When you find a search that returns the right collection of results. You can also edit the search if necessary. open the dropdown event menu (find the down arrow next to the event timestamp) and click Build event type. Splunk takes you to the Add New event types page. click Edit. 58 . You can also edit the search if necessary. you'll find possible field/value pairings that you can use to narrow down the event type search further. separated by commas. Event types displayed on the Event Types page may be available globally (system-wide) or they may apply to specific Apps. you can click Save to save it as an event type. The Save Event Type dialog appears. The Build Event Type utility finds a set of sample events that are similar to the one you selected from your search results. Enter a name for the event type. Test potential searches before saving them as event types When you build a search that you think might be a useful event type. and optionally identify one or more tags that should be associated with it. The Build Event Type utility also displays a search string under Generated event type at the top of the page. Click Test to see the search run in a separate window. Splunk takes you to the Build Event Type utility. navigate to the Event Types page and click New. test it. which you can use to edit the search string.

see "About tags and aliases" in this manual. Maintaining event types in Manager To update the details of an event type. where 1 is the highest priority and 10 is the lowest. You use the Priority setting to ensure that certain event types take precedence over others in this display order. and the Search string that ultimately defines the event type (see "Save a search as an event". For example. if you have edit permissions for them. and change the This app only selection to All apps. In a situation like this. you could easily have a set of events that are part of a wide-ranging system_error event type. To make a particular event type available to all users on a global basis. Note: All event types are initially created for a specific App. Name. if you have the permissions to do so. The Priority setting is important for common situations where you have events that fit two or more event types. Within that large set of events. the critical_disc_error event type is always listed ahead of the system_error event type. You can optionally include Tags for the event type. when events that match both system_error and critical_disc_error appear in search results. and click its name. locate it in the list on the Event Types page in Manager. and Priority for the event type. while giving the other two error codes Priority values in the 1 to 5 range. you could give the system_error event type a Priority of 10. You can also optionally select a Priority for the event type. You can also update permissions for event types and delete event types through the Event Types page. Tags. you could have events that also belong to more precisely focused event types like critical_disc_error and bad_external_resource_error. For more information about tagging event types and other kinds of Splunk knowledge. 59 . you have to locate the event type on the Event Types page. click its Permissions link. When the event turns up in search results. you may want to give the precisely focused event types a higher priority. where you can edit the Search string. Splunk takes you to the details page for the event type. or event types that are subsets of larger ones. Splunk displays the event types associated with the event in a specific order.From this page you enter the new event type's Destination App. above). If you have a number of overlapping event types. This way.

conf. Any event types you create through Splunk Web are automatically added to $SPLUNK_HOME/etc/system/local/eventtypes. For example.Configure event types directly in eventtypes. an event type with the header [cisco-%code%] that has code=432 becomes labeled [cisco-432].conf.conf. There are a few default event types defined in $SPLUNK_HOME/etc/system/default/eventtypes.example as an example. Note: If the name of the event type includes field names surrounded by the percent character (e. description = <string> 60 .conf. search = <string> • Search terms for this event type. disabled = <1 or 0> • Toggle event type on or off. ♦ You can have any number of event types.g. see "About configuration files" in the Admin manual. or your own custom application directory in $SPLUNK_HOME/etc/apps/. Edit eventtypes. or create your own eventtypes. For more information on configuration files in general.conf in $SPLUNK_HOME/etc/system/local/.conf Configure event types directly in eventtypes. [$EVENTTYPE] • Header for the event type • $EVENTTYPE is the name of your event type. each represented by a stanza and any number of the following attribute/value pairs.conf. • For example: error OR warn. tags = <string> • Space separated words that are used to tag an event type.conf You can add new event types and update existing event types by configuring eventtypes. • Set to 1 to disable.conf. Use $SPLUNK_HOME/etc/system/README/eventtypes. %$FIELD%) then the value of $FIELD is substituted at search time into the event type name for that event. Configuration Make changes to event types in eventtypes.

conf in $SPLUNK_HOME/etc/system/local/. and the other is called fatal. So if you want to disable the web event type. Define event type templates in eventtypes.• Optional human-readable description of the event type.conf. see "About configuration files" in the Admin manual. Note: You can tag eventtype field values the same way you tag any other field/value combination. For more information on configuration files in general. Example Here are two event types. 1 is the highest. See the tags. [web] search = html OR http OR https OR css OR htm OR html OR shtml OR xls OR cgi [fatal] search = FATAL Disable event types Disable an event type by adding disabled = 1 to the event type stanza eventtypes. add the following entry to its stanza: [web] disabled = 1 Configure event type templates Configure event type templates Event type templates create event types at search time.conf spec file for more information. one is called web.conf: [$EVENTTYPE] disabled = 1 $EVENTTYPE is the name of the event type you wish to disable. or your own custom application directory in $SPLUNK_HOME/etc/apps/. Edit eventtypes. 61 . and 10 is the lowest. priority = <integer> • Splunk uses this value to determine the order in which it displays matching event types for an event.

All of this data represents a single user transaction. Splunk creates an event type titled "cisco-432". For example. 62 . transaction ID. Here are some other examples of transactions: • Web access events • Application server events • Business transactions • E-mails • Security violations • System failures Transaction search Transaction search is useful for a single observation of any physical event stretching over multiple logged events. Splunk creates an event type titled $NAME-bar for that event. and the fulfillment application may log the message ID along with the shipping status. saved as a field in Splunk.Event type template configuration Event type templates use a field name surrounded by percent characters to create event types at search time where the %$FIELD% value is substituted into the name of the event type. the transaction ID may live in the message queue with a message ID. a customer shopping in an online store could generate a transaction across multiple sources. To learn more. Web access events might share a session ID with the event in the application server log. Use the transaction command to define a transaction or override transaction options specified in transactiontypes. A transaction type is a configured transaction.conf. the application server log might contain the account ID. and product ID. read "Search for transactions" in this manual. [$NAME-%$FIELD%] $SEARCH_QUERY So if the search query in the template returns an event where %$FIELD%=bar. Example [cisco-%code%] search = cisco If a search on "cisco" returns an event that has code=432. About transactions About transactions A transaction is any group of conceptually related events that spans time. Any number of data sources can generate transactions over multiple log entries.
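For instance, the online store scenario above might be stitched together with a single transaction search. This is only a sketch; the source types and the session_id field are assumptions for illustration, not values defined in this manual:

sourcetype=web_access OR sourcetype=app_server | transaction session_id

Events from both sources that share the same session_id value are grouped into one transaction.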

conf). Transactions also have additional data that is stored in the fields: duration and transactiontype. • transactiontype is the name of the transaction (as defined in transactiontypes. use the stats command.conf by the transaction's stanza name). read "Define transactions" in this manual. To use transaction. If you want to compute aggregate statistics over transactions that are defined by data in a single field. You can save transactions by editing transactiontypes. When to use stats instead of transactions Transactions aren't the most efficient method to compute aggregate statistics on transactional data.conf. • duration contains the duration of the transaction (the difference between the timestamps of the first and last events of the transaction).Configure transaction types You may want to persist the transaction search you've created. or define transaction constraints in your search by setting the search options of the transaction command. if you wanted to compute the statistics of the duration of a transaction defined by the field session_id: * | stats min(_time) AS earliest max(_time) AS latest by session_id | eval duration=latest-earliest | stats min(duration) max(duration) avg(duration) median(duration) perc95(duration) Similary. The transaction command yields groupings of events which can be used in reports. For example. either call a transaction type (that you configured via transactiontypes. if you wanted to compute the number of hits per clientip in an access log: sourcetype=access_combined | stats count by clientip | sort -count Also. the shared event types. and the field values. Or you might want to create a lasting transaction type. To learn more about configuring transaction types. 63 . Search for transactions Search for transactions Search for transactions using the transaction search command either in Splunk Web or at the CLI. if you wanted to compute the number of distinct session (parameterized by cookie) per clientip in an access log: sourcetype=access_combined | stats dc(cookie) as sessions by clientip | sort -sessions Read the stats command reference for more information about using the search command. Define transactions by creating a stanza and listing specifications. Search options Transactions returned at search time consist of the raw text of each event.

You can add transaction to any search, such as ...|transaction host,cookie. For best search performance, craft your search and then pipe it to the transaction command. For more information, see the topic on the transaction command in the Search Reference manual.

Follow the transaction command with the following options. Note: Some transaction options do not work in conjunction with others.

[field-list]

• This is a comma-separated list of fields.
• If set, each event must have the same field(s) to be considered part of the same transaction.
• Events with common field names and different values will not be grouped.
    ♦ For example, if you add ...|transaction host, then a search result that has host=mylaptop can never be in the same transaction as a search result with host=myserver.
    ♦ A search result that has no host value can be in a transaction with a result that has host=mylaptop.

match=closest

• Specify the matching type to use with a transaction definition.
• The only value supported currently is closest.

maxspan=[<integer> s|m|h|d]

• Set the maximum time span for the transaction.
• Can be in seconds, minutes, hours or days.
    ♦ For example: 5s, 6m, 12h or 30d.
• Defaults to maxspan=-1, for an "all time" timerange.

maxpause=[<integer> s|m|h|d]

• Specifies the maximum pause between the events in a transaction.
• Requires there be no pause between the events within the transaction greater than maxpause.
• If the value is negative, the maxpause constraint is disabled.
• Defaults to maxpause=-1.

startswith=<string>

• A search or eval-filtering expression which, if satisfied by an event, marks the beginning of a new transaction.
• For example:
    ♦ startswith="login"
    ♦ startswith=(username=foobar)
    ♦ startswith=eval(speed_field < max_speed_field)
    ♦ startswith=eval(speed_field < max_speed_field/12)
• Defaults to "".

endswith=<transam-filter-string>

This search takes events from the access logs. • <eval-expression> is a valid eval expression that evaluates to a boolean. For an example of how to use macro searches and transactions.• A search or eval-filtering expression which. • For example: ♦ endswith="logout" ♦ endswith=(username=foobar) ♦ endswith=eval(speed_field < max_speed_field) ♦ endswith=eval(speed_field < max_speed_field/12) • Defaults to "". For startswith and endswith. 65 . sourcetype=access_combined | transaction clientip maxpause=5m maxspan=3h Define transactions Define transactions Any series of events can be turned into a transaction type. see "Create and use search macros" in the User manual. • <quoted-search-expression> is a valid search expression that contains quotes. <transam-filter-string> is defined with the following syntax: "<search-expression>" | (<quoted-search-expression>) | eval(<eval-expression> • <search-expression> is a valid search expression that does not contain quotes. For more information about macro searches. You can create transaction types via transactiontypes.conf. and creates a transaction from events that share the same clientip value that occurred within 5 minutes of each other (within a 3 hour time span). Read more about use cases in "About transactions". Examples: • search expression: (name="foo bar") • search expression: "user=mildred" • search expression: ("search literal") • eval bool expression: eval(distance/time < max_speed) Transactions and macro search Transactions and macro searches are a powerful combination that allow substitution into your transaction searches. in this manual. Example transaction search Run a search that groups together all of the web pages a single user (or client IP address) looked at over a time range. see "Design macro searches" in this manual. if satisfied by an event. marks the end of a transaction. Make a transaction search and then save it with $field$ to allow substitution. See below for configuration details.

♦ For example: 5s. fields = <comma-separated list of fields> • If set. • Can be in seconds.conf 1. hours or days. • If you do not specify an entry for each of the following attributes. or 'exclusive' to a single transaction. • Defaults to "". to search for the transaction in Splunk Web. hours or days. or your own custom application directory in $SPLUNK_HOME/etc/apps/. • Can be in seconds. Define transactions by creating a stanza and listing specifications for each transaction within its stanza. 12h or 30d. 66 . • Use the stanza name. exclusive = <true | false> • Toggle whether events can be in multiple transactions. Configure transaction types in transactiontypes. ♦ For example: 5s. [<TRANSACTIONTYPE>]. 6m. see "About configuration files" in the Admin manual. Create a transactiontypes. • Defaults to 2s.For more information on configuration files in general. maxpause = [<integer> s|m|h|d] • Set the maximum pause between the events in a transaction. 6m. Use the following attributes: [<transactiontype>] maxspan = [<integer> s|m|h|d] maxpause = [<integer> s|m|h|d] fields = <comma-separated list of fields> exclusive = <true | false> match = closest [<TRANSACTIONTYPE>] • Create any number of transaction types. minutes. 12h or 30d. each represented by a stanza name and any number of the following attribute/value pairs.conf file in $SPLUNK_HOME/etc/system/local/. maxspan = [<integer> s|m|h|d] • Set the maximum time span for the transaction. each event must have the same field(s) to be considered part of the same transaction. Splunk uses the default value. • Defaults to 5m. 2. minutes. • Applies to 'fields' (above).

• For example, if fields=url,cookie and exclusive=false, then an event with a 'cookie', but not a 'url' value, could be in multiple transactions that share the same 'cookie' but have different URLs.
• Setting exclusive = false causes the matcher to look for multiple matches for each event and approximately doubles the processing time.
• Defaults to "true".

match = closest

• Specify the match type to use.
• Currently, the only value supported is "closest."
• Defaults to "closest."

3. Use the transaction command in Splunk Web to call your defined transaction (by its transaction type name). You can override configuration specifics during search.

For more information about searching for transactions, see "Search for transactions" in this manual.
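To pull the pieces of this topic together, here is a minimal sketch of a transaction type. The stanza name web_session and the clientip field are assumptions for illustration, not values from this manual:

[web_session]
maxspan = 3h
maxpause = 5m
fields = clientip

You could then call it by name in a search, assuming your version of the transaction command accepts the type name through its name argument:

sourcetype=access_combined | transaction name=web_session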

• Define a field lookup that is based on an external Python script rather than a lookup table. • Apply only to events belonging to a specific event type or group of event types. For example. which. you could design one that sends a status value to an external issue-tracking 68 . A really simple workflow action would be one that is associated with a IP_address field. • Are accessed either via event dropdown menus. enabling you to pass information to an external web resource. For example. and then adds that definition to the event as the value of a new status_description field. such as a search engine or IP lookup service. Workflow actions Workflow actions enable you to set up interactions between specific fields in your data and other applications or web resources. when launched.Data enrichment: Lookups and workflow actions About lookups and workflow actions About lookups and workflow actions Lookups and workflow actions enable you to enrich and extend the usefulness of your event data through interactions with external resources. see "Lookup fields from external data sources. So if you have an event where http_status = 503 the lookup would add status_description = Service Unavailable. matches that value with its definition in a CSV file. you can: • Arrange to have a static lookup table be populated by the results of a saved search. Lookup tables Lookup tables use information in your events to determine how to add other fields from external data sources such as static tables (CSV files) and Python-based commands. A really basic example of this functionality would be a static lookup that takes the http_status value in an event. this could come in handy if you need to use DHCP logs to identify users on your network based on their IP address and the event timestamp. You can also set up workflow actions that: • Apply only to particular fields (as opposed to all fields in an event). • Create a time-based lookup. Server Error to that event. you could create a lookup that uses a Python script that returns an IP address when given a host name. opens an external WHOIS search in a separate browser window based on the IP_address value. It's also possible to create lookups that add fields based on time information." in this chapter. if you are working with a lookup table that includes a field value that represents time. Of course. • Perform HTTP GET requests. field dropdown menus. For more information. or both. there are more advanced ways to work with lookups. For example. and returns a host name when given an IP address. For example. • Perform HTTP POST requests that can send field values to an external resource.

For more information about using the Lookups Manager. If the file doesn't exist. To set up a lookup using the configuration files: Important: Do not edit conf files in $SPLUNK_HOME/etc/system/default. create it. You can also add fields based on matching time information. • Take certain field values from a chosen event and insert them into a secondary search that is populated with those field values and which launches in a secondary browser window. Edit transforms.conf and transforms. in this chapter. if you don't want to overwrite the output field) from the lookup table that you defined in transforms. This topic walks discusses how to use props. Use filename for static lookups and external_cmd for external lookups. In this configuration file. Look up fields from external data sources Look up fields from external data sources Use the dynamic fields lookup feature to add fields to your events with information from an external source. if you are monitoring logins with Splunk and have IP addresses and timestamps for those logins in your Splunk index. You can set up a lookup using the Lookups Manager page in Splunk Web or by configuring stanzas in props. you can use a dynamic field lookup to map the IP address and timestamp to the MAC address and username information for the matching IP and timestamp data that you have in your DHCP logs. Each column may have multiple instances of the same value (multi-valued fields). For information about setting workflow actions up in Manager. 2.conf. see the fields lookup tutorial in the User Manual.conf. Instead. you should edit the file in $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/. Currently you can define two kinds of lookup tables: static lookups (which utilize CSV files) and external lookups (which utilize Python scripts). 1.conf to define your lookup table.conf to set up your lookups. see "Create workflow actions in Splunk Web". The arguments you use in your transforms stanza indicate the type of lookup table you want to define. For example. Edit props. if you have multiple tables. Note: A lookup table must have at least two columns.conf to apply your lookup table. You can have more than one field lookup defined in a single source stanza. for example. Each lookup should have it's own unique lookup name.conf and transforms. you specify the fields to match and output (or outputnew. you can name them: 69 . This step is the same for both static and external lookups. such as a static table (CSV file) or an external (Python) command.application.

In props. you can select the fields to display in each of the matching search results. From there. 3. In transforms.conf to apply your lookup table. you can specify the number of matching entries to apply to an event.conf.conf. add a stanza with the lookup key. Set up a fields lookup based on a static file The simplest fields lookup is based on a static table. etc. the lookup is run automatically. By default. or something more descriptive. The CSV file needs to be located in one of two places: • $SPLUNK_HOME/etc/system/lookups/ • $SPLUNK_HOME/etc/apps/<app_name>/lookups/ Create the lookups directory if it does not exist.. as specified in props. max_matches is 100 for lookups that are not based on a timestamp field. add a stanza to define your lookup table.conf to define your lookup table.LOOKUP-table1. 2. You will use this transform in props. it will also impact the speed of your searches. max_matches indicates that the first (in file order) <integer> number of entries are used. Edit props. The name of the stanza is also the name of your lookup table.conf and indicates how Splunk should apply it to your events: [<stanza name>] lookup-<name> = $TRANSFORM <match_field_in_table> OUTPUT|OUTPUTNEW <output_field_in_table> • stanza name is the sourcetype. specifically a CSV file.conf where you defined your lookup table. This stanza specifies the lookup table that you defined in transforms.conf. host. When you add a lookup to props. 1. After restart. LOOKUP-table2. or source to which this lookup applies.conf. reference the CSV file's name: [myLookup] filename = <filename> max_matches = <integer> Optionally. If your automatic lookup is very slow. • $TRANSFORM references the stanza in transforms. Edit transforms. • stanza name can't use regex-type syntax. Restart Splunk to implement the changes you made to the configuration files. In this stanza. you should see the output fields from your lookup table listed in the fields picker.conf. 70 .

• output_field_in_table is the column in the lookup table that you add to your events.
• You can have multiple columns on either side of the lookup. For example, you could have $TRANSFORM <match_field1>, <match_field2> OUTPUT|OUTPUTNEW <match_field3>, <match_field4>, and so on. You can also have one field return two fields, three fields return one field, and so on.
• If you don't use OUTPUT|OUTPUTNEW, Splunk adds all the field names and values from the lookup table to your events.
• Use OUTPUTNEW if you don't want to overwrite existing values in your output field.

You can have more than one field after the OUTPUT|OUTPUTNEW clause. Use the AS clause if the field names in the lookup table and your events do not match, or if you want to rename the field in your event:

[<stanza name>]
lookup_<name> = $TRANSFORM <match_field_in_table> AS <match_field_in_event> OUTPUT|OUTPUTNEW <output_field_in_table> AS <output_field_in_event>

3. Restart Splunk.

Example of static fields lookup

Here's an example of setting up lookups for HTTP status codes in an access_combined log. In this example, you add the status description and status type fields into your events. Then, you want to match the status field in your lookup table (http_status.csv) with the field in your events.

The following is the http_status.csv file. You can put this into $SPLUNK_HOME/etc/apps/<app_name>/lookups/. If you're using this in the Search App, put the file into $SPLUNK_HOME/etc/apps/search/lookups/:

status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
203,Non-Authoritative Information,Successful
204,No Content,Successful
205,Reset Content,Successful
206,Partial Content,Successful
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
303,See Other,Redirection
304,Not Modified,Redirection
305,Use Proxy,Redirection
307,Temporary Redirect,Redirection
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error

dest = <string> 72 . where the search returns a results table: 1.Request-URI Too Long.Requested Range Not Satisfiable.Client Error 412.Proxy Authentication Required.Client Error 417. you will see the fields status_description and status_type listed in your fields picker menu.Bad Gateway.Server Error 503.Unsupported Media Type. Restart Splunk. put: [access_combined] lookup_http = http_status status OUTPUT status_description.Request Entity Too Large. put: [http_status] filename = http_status.Not Implemented.Client Error 411.conf file located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local. status_type 3.Precondition Failed.conf to use the results of a saved search to populate a lookup table.Client Error 410.Not Acceptable.Server Error 501.Service Unavailable. Use search results to populate a lookup table You can edit a local or app-specific copy of savedsearches.Gateway Timeout. In a saved search stanza. action.Client Error 408.Conflict.populate_lookup = 1 This tells Splunk to save your results table into a CSV file. Add the following line to tell Splunk where to copy your lookup table.Client Error 413.Server Error 502. In a transforms. when you run a search that returns Web access information.Client Error 500.conf file. Add the following line to enable the lookup population action.Expectation Failed.csv 2. located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/.Server Error 1.Client Error 407.Gone.Client Error 414. Now.HTTP Version Not Supported.Length Required.Client Error 409.Server Error 505.406.Client Error 415.Internal Server Error.Request Timeout.populate_lookup.Server Error 504. 2.Client Error 416. In a props. action.

This is also called a scripted or external lookup. $SPLUNK_HOME/etc/system/lookups or $SPLUNK_HOME/etc/<app_name>/lookups.populate_lookup.populate_lookup. we recommend that you set this to true for scheduled searches that populate lookup tables.dest = etc/system/lookups/myTable.dest value is a lookup name from transforms. you can set up your fields lookup the same way you set up a static lookup. For example. the reference must be relative to the directory where the script is located. run_on_startup = true If it does not run on startup. you might include: action. if you refer to any external resources (such as a file).csv The destination directory. Python scripts used for these lookups must be located in one of two places: • $SPLUNK_HOME/etc/apps/<app_name>/bin • $SPLUNK_HOME/etc/searchscripts Note: When writing your Python script. Set up a fields lookup based on an external command or script For dynamic or external lookups. 3. it will run at the next scheduled time. Add the following line if you want this search to run when Splunk starts up. Splunk only supports Python scripts for external lookups. If it is a path to a CSV file.conf or a path to a CSV file where Splunk should copy the search results. Note: Currently.The action. if you want to save the results to a global lookup table. Generally. which is a DNS lookup script that: 73 . Example of external fields lookup Here's an example of how you might use external lookups to match with information from a DNS server.conf stanza references the command or script and arguments to invoke. the path should be relative to $SPLUNK_HOME. delimited by a comma and space. You can also specify the type of command or script to invoke: [myLookup] external_cmd = <string> external_type = python fields_list = <string> max_matches = <integer> Use fields_list to list all the fields supported by the external command. your transforms. should already exist. Because Splunk copies the results of the saved search to a CSV file.py. Splunk ships with a script located in $SPLUNK_HOME/etc/system/bin/ called external_lookup.

• if given a host. "OUTPUT ip AS clientip" indicates that you want Splunk to add the values of ip from the lookup table into the clientip field in the events. The fields that you pass to this script are the ones you specify in transforms. In a transforms. So. In the DNS lookup example above. keep in mind that it needs to take in a partially empty CSV file and output a filled-in CSV file.py ip AS clientip OUTPUTNEW host AS hostname For this example. "host" and "ip".conf file. but Splunk automatically extracts the IP addresses from Web access logs into a field named clientip. which may look like this: host. For a reverse DNS lookup. In a props. you don't need to rename the field. When you run the search command: . put: [dnsLookup] external_cmd = external_lookup.py host ip Note: If you don't pass these arguments.. • if given an IP address. Since the host field has the same name in the lookup table and the events. returns the host name. instead of overwriting the host field value. returns the IP address.. the script will return an error. called hostname 3. the CSV file contains 2 fields. Restart Splunk. your props. More about the external lookup script When designing your external lookup script. 1.conf as [dnsLookup] and pass into the external command script the values for the "host" field as a CSV file. | lookup dnsLookup host You're telling Splunk to use the lookup table that you defined in transforms. put: [access_combined] lookup_dns = dnsLookup host OUTPUT ip AS clientip The field in the lookup table is named ip. you want Splunk to return the host value in a new field.py host ip fields_list = host.ip work. The arguments that you pass to the script are the headers for these input and output files.conf stanza would be: [access_combined] lookup_rdns = external_lookup.conf file. ip 2.com 74 .conf: external_cmd = external_lookup.

you can use this time field to set up your fields lookup. the first matching entry in descending order is applied.conf file.csv.conf: time_field = <field_name> time_format = <string> If time_field is present.0. The two headers are included because they are the fields you specified in the fields_list parameter of transforms.127.net Basically. add the following lines to your lookup stanza in transforms. For a match to occur with time-based lookups. To do this. by default max_matches is 1. The script then outputs the following CSV file and returns it to Splunk.home.net. which contains the timestamp. Use the time_format key to specify the strptime format of your time_field. dhcp. Example of time-based fields lookup Here's an example of how you might use DHCP logs to identify users on your network based on their IP address and the timestamp. For time-based (or temporal) lookups. 1. time_format is UTC. and the user's name and MAC address. you can also specify offsets for the minimum and maximum amounts of time that an event may be later than a lookup entry.com. In a transforms. IP address.127. put: 75 . By default.conf file. put: [dhcpLookup] filename = dhcp. In a props. Let's say the DHCP logs are in a file. add the following lines to your stanza: max_offset_secs = <integer> min_offset_secs = <integer> By default.ip work.0. but missing values for ip.1 home.0. there is no maximum offset and the minimum offset is 0. Also. this is a CSV file with the header "host" and "ip".2 Set up a time-based fields lookup If your static or external lookup table has a field value that represents time.0.csv time_field = timestamp time_format = %d/%m/%y %H:%M:%S 2. which populates the ip field in your results: host.conf.

[dhcp]
lookup_table = dhcpLookup ip mac OUTPUT user

3. Restart Splunk.

Troubleshooting lookups: Using identical names in lookup stanzas

Lookup table definitions are indicated with the attribute LOOKUP-<name>. When you name your lookup LOOKUP-table, you're saying this is the lookup that achieves some purpose or action described by "table". All lookups within the same host, source, or sourcetype stanza should have different names. In general it's best if all of your lookup stanzas have different names to reduce the chance of things going wrong.

When you do give the same name to two or more lookups, you can run into trouble unless you know what you're trying to do:

• If two or more lookups with the same name share the same stanza (the same host, source, or sourcetype), only the first lookup defined for that stanza is applied; it overrides the others.
• If you have lookups with different stanzas (different hosts, sources, or sourcetypes) that share the same name, you can end up with a situation where only one of them seems to work at any given point in time. You may set this up on purpose, but in most cases it's probably not very convenient.

For example, if you have two lookups that share "table" as their name:

[host::machine_name]
LOOKUP-table = logs_per_day host OUTPUTNEW average_logs AS logs_per_day

[sendmail]
LOOKUP-table = location host OUTPUTNEW building AS location

Any events that overlap between these two lookups will only be affected by one of them. In other words:

• events that match the host will get the host lookup.
• events that match the sourcetype will get the sourcetype lookup.
• events that match both will only get the host lookup.

In this example, these lookups are intended to achieve different goals--one determines something about logs per day, and the other has something to do with location. You might instead rename them:

[host::machine_name]
LOOKUP-table = logs_per_day host OUTPUTNEW average_logs AS logs_per_day

[sendmail]
LOOKUP-location = location host OUTPUTNEW building AS location

Now you have two different settings that won't collide.

In addition. which is where you define individual workflow actions. • Search workflow actions. • Launch secondary Splunk searches that use one or more field values from selected events. From there you can go to the Workflow actions page to review and update existing workflow actions. Define workflow actions using Splunk Manager You can set up all of the workflow actions described in the bulleted list at the top of this chapter and many more using Splunk Manager. If you're creating a new workflow action. which launch secondary searches that use specific field values from an event.Create workflow actions in Splunk Web Create workflow actions in Splunk Web Enable a wide variety of interactions between indexed fields and other web resources with workflow actions. You can also set them up to only appear in the menus of specific fields. • When selected. you need to give it a Name and identify its Destination app. you can define workflow actions that enable you to: • Perform an external WHOIS lookup based on an IP address found in an event. Or you can just click Add new for workflow actions to create a new one. • Appear either in field menus or event menus in search results. For example. go to the Manager page and click Fields. To begin. There are three kinds of workflow actions that you can set up: • GET workflow actions. • Use the field values in an HTTP error event to create a new entry in an external issue management system. 77 . or in all field menus in a qualifying event. This action type enables you to do things like create entries in external issue management systems using a set of relevant field values. Both methods take you to the workflow action detail page. open either in the current window or in a new one. you can define workflow actions that: • Are targeted to events that contain a specific field or set of fields. • Perform an external search (using Google or a similar web search application) on the value of a specific field found in an event. • POST workflow actions. which generate an HTTP POST request to a specified URI. such as a search that looks for the occurrence of specific combinations of ipaddress and http_status' field values in your index over a specific time range. or which belong to a particular event type. which create typical HTML links to do things like perform Google searches on specific values or run domain name queries against external WHOIS databases. Workflow actions have a wide variety of applications.

perhaps you have a field called http_status. For example. By default the field list is set to *. ip_server in Apply only to the following fields. and you would like a workflow action to apply only to events containing that field. which means that it matches all fields. but you only want the resulting workflow action to appear in events containing that field if the http_status is greater than or equal to 500. For example. To do this. you would declare http_status in the Apply only to the following fields setting. For more information about event types. 78 . we suggest you use event type scoping instead of field scoping. Narrow workflow action scope by event type Event type scoping works exactly the same way as field scoping. You can restrict workflow action scope by field. you can declare a comma-delimited list of fields in Apply only to the following fields. see "About event types" in this manual. If you want to have a workflow action apply only to events that have a set of fields. if you have a field called http_status. say you want a workflow action to only apply to events with ip_client and ip_server fields. you would enter ip_client. or a combination of the two. Narrow workflow action scope by field You can set up workflow actions that only apply to events that have a specified field or set of fields. you can optionally target workflow actions to a narrow grouping of events. To accomplish this you would first set up an event type called errors_in_500_range that is applied to events matching a search like http_status >= 500 You would then define a workflow action that has Apply only to the following fields set to http_status and Apply only to the following event types set to errors_in_500_range. if you declare a simple field listing of ip_* Splunk applies the resulting workflow action to events with either ip_client or ip_server as well as a combination of both (as well as any other event with a field that matches ip_*). When more than one field is listed the workflow action is displayed only if the entire list of fields are present in the event. You can also use wildcard matching to identify events belonging to a range of event types. by event type. You can also narrow the scope of workflow actions through a combination of fields and event types.Target workflow actions to a narrow grouping of events When you create workflow actions in Manager. Workflow action field scoping also supports use of the wildcard asterisk. You can enter a single event type or a comma-delimited list of event type into the Apply only to the following event types setting to create a workflow action that Splunk only applies to events belonging to that event type or set of event types. or combine event type scoping with field scoping. For example. For example. If you need more complex selecting logic.

Control workflow action appearance in field and event menus

When workflow actions are set up correctly, they appear in dropdown menus associated with fields and events in your search results. For example, you can define a workflow action that sets off a Google search for values of the topic field in events. (The topic field turns up in webserver events associated with the access of Splunk documentation topics. It has the name of a particular Splunk documentation topic as its value.)

Note that in the event depicted above, the topic field has a value of LicenseManagement. The menus for this event display the workflow action Google LicenseManagement. Clicking on this workflow action sets off a Google search for the term LicenseManagement. This is an example of a "GET link" workflow action, and it's one of three kinds of workflow actions that you can implement in Splunk. Read on for instructions on setting up all three.

Depending on how you define the Google search workflow action in Manager, you can have it appear in field menus for events containing a topic field. Alternatively, you can have the workflow action appear in the event menus for those same events. Or you can choose to have it appear in both the event menu and the field menus for events containing a topic field.

Set up a GET workflow action

GET link workflow actions drop one or more values into an HTML link. Clicking that link performs an HTTP GET request in a browser, allowing you to pass information to an external web resource, such as a search engine or IP lookup service.

To define a GET workflow action, go to the detail page, set Action type to link, and set Link method to get. Then you define a Label and URI as appropriate. Here's an example of the setup for a GET link workflow action that sets off a Google search on values of the topic field in search results:

The Label field enables you to define the text that is displayed in either the field or event workflow menu. Labels can be static or include the value of relevant fields. When you declare the value of a field, you use the name of the field enclosed by dollar signs. For example, if you have a field called topic in your events and you want its value to be included in the label for a Google workflow action, you might set the Label value to Google $topic$. In the above example, if the value for topic in an event is CreatefieldactionsinSplunkWeb, the field action displays as Google CreatefieldactionsinSplunkWeb in the topic field menu.

Similar to the Label setting, the URI field enables you to define the location of the external resource that you want to send your field values to, using the name of the field enclosed by dollar signs. In the above example, this URI uses the GET method to submit the topic value to Google for a search.

Note: Variables passed in GET actions via URIs are automatically URL encoded during transmission. This means you can include values that have spaces between words or punctuation characters. However, if you're working with a field that has an HTTP address as its value, and you want to pass the entire field value as a URI, you should use the $! prefix to keep Splunk from escaping the field value. See "Use the $! prefix to prevent escape of URL or HTTP form field values" below for more information.

You can choose whether the workflow action displays in the event menu, the field menu(s), or both. You can also identify whether the link opens in the current window or a new window. Finally, you can arrange for the workflow action to apply only to a specific set of events by indicating that it only appears in events that have a particular set of fields or which belong to a specific event type or set of event types.
Example - Provide an external IP lookup

You have configured your Splunk app to extract domain names in web services logs and specify them as a field named domain. You want to be able to search an external WHOIS database for more information about the domains that appear. Here's how you would set up the GET workflow action that helps you with this.

In the Workflow actions details page, set Action type to link and set Link method to get. You then use the Label and URI fields to identify the field involved. Set a Label value of WHOIS: $domain$. Set a URI value of http://whois.net/whois/$domain$.

After that, you can determine:

• whether the link shows up in the field menu, the event menu, or both.
• whether the link opens the WHOIS search in the same window or a new one.
• restrictions for the events that display the workflow action link. You can target the workflow action to events that have specific fields, that belong to specific event types, or some combination of the two.
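For reference, a GET link action like this one might look roughly as follows in workflow_actions.conf. This is a sketch only: the stanza name whois_lookup is arbitrary, and the attribute names should be verified against the workflow_actions.conf reference for your Splunk version.

[whois_lookup]
type = link
label = WHOIS: $domain$
link.method = get
link.uri = http://whois.net/whois/$domain$
link.target = blank
fields = domain
display_location = both

Here fields = domain restricts the action to events that contain the domain field, and link.target = blank opens the WHOIS result in a new window.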
Set up a POST workflow action

POST workflow actions are set up in a manner similar to that of GET link actions. Go to the workflow action detail page and set Action type to link, set Link method to post, and define a Label and URI as appropriate. However, POST requests are typically defined by a form element in HTML along with some inputs that are converted into POST arguments. This means that you have to identify POST arguments to send to the identified URI. These arguments are key and value combinations that will be sent to a web resource that responds to POST requests. On both the key and value sides of the argument, you can use field names enclosed in dollar signs to identify the field value from your events that should be sent over to the resource. You can define multiple key/value arguments in one POST workflow action.

Note: Variables passed in POST link actions via URIs are automatically HTTP-form encoded during transmission. This means you can include values that have spaces between words or punctuation characters. However, if you're working with a field that has an HTTP address as its value, and you want to pass the entire field value as a URI, you should use the $! prefix to keep Splunk from escaping the field value. See "Use the $! prefix to prevent escape of URL or HTTP form field values" below for more information.
Example - Allow an http error to create an entry in an issue tracking application

You've configured your Splunk app to extract HTTP status codes from a web service log as a field called http_status. Along with the http_status field the events typically contain either a normal single-line description request, or a multiline python stacktrace originating from the python process that produced an error. You want to design a workflow action that only appears for error events where http_status is in the 500 range. You want the workflow action to send the associated python stacktrace and the HTTP status code to an external issue management system to generate a new bug report. However, the issue management system only accepts POST requests to a specific endpoint. Here's how you might set up the POST workflow action that fits your requirements:


Note that the first POST argument sends server error $http_status$ to a title field in the external issue tracking system. If you select this workflow action for an event with an http_status of 500, then it opens an issue with the title server error 500 in the issue tracking system. The second POST argument uses the _raw field to include the multiline python stacktrace in the description field of the new issue.

Finally, note that the workflow action has been set up so that it only applies to events belonging to the errors_in_500_range event type. This is an event type that is only applied to events carrying http_status values in the typical HTTP error range of 500 or greater. Events with HTTP error codes below 500 do not display the submit error report workflow action in their event or field menus.
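A configuration-file equivalent of this action might look something like the following sketch. The stanza name and the issue tracker endpoint are hypothetical, and the link.postargs attribute naming should be confirmed against the workflow_actions.conf reference for your version:

[submit_error_report]
type = link
label = Submit error report
link.method = post
link.uri = http://issues.example.com/api/new_issue
link.postargs.1.key = title
link.postargs.1.value = server error $http_status$
link.postargs.2.key = description
link.postargs.2.value = $_raw$
eventtypes = errors_in_500_range
display_location = both

Each numbered postargs pair becomes one key/value combination in the HTTP POST request, and the eventtypes setting provides the errors_in_500_range scoping described above.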
Set up a secondary search that is dynamically populated with field values from an event

To set up workflow actions that launch dynamically populated secondary searches, you start by setting Action type to search on the Workflow actions detail page. This reveals a set of fields that you use to define the specifics of the search.

In Search string enter a search string that includes one or more placeholders for field values, bounded by dollar signs. For example, if you're setting up a workflow action that searches on client IP values that turn up in events, you might simply enter clientip=$clientip$ in that field.

Identify the app that the search runs in. If you want it to run in a view other than the current one, select that view. And as with all workflow actions, you can determine whether it opens in the current window or a new one.

Be sure to set a time range for the search (or identify whether it should use the same time range as the search that created the field listing) using the Earliest time and Latest time fields. If these are left blank, the search runs over all time by default.

Finally, as with other workflow action types, you can restrict the search workflow action to events containing specific sets of fields and/or which belong to particular event types.
Example - Launch a secondary search that finds errors originating from a specific Ruby On Rails controller

Say your company uses a web infrastructure that is built on Ruby on Rails. You've set up an event type to sort out errors related to Ruby controllers (titled controller_error), but sometimes you just want to see all the errors related to a particular controller. Here's how you might set up a workflow action that does this: 1. On the Workflow actions detail page, set up an action with the following Label: See other errors for controller $controller$ over past 24h. 2. Set Action type to Search. 3. Enter the following Search string: sourcetype=rails controller=$controller$ error=* 4. Set an Earliest time of -24h. Leave Latest time blank.


5. Using the Apply only to the following... settings, arrange for the workflow action to only appear in events that belong to the controller_error event type and which contain the error and controller fields.

Those are the basics. You can also determine which app or view the workflow action should run in (for example, you might have a dedicated view for this information titled ruby_errors) and identify whether the action works in the current window or opens a new one.

Use special parameters in workflow actions

Splunk provides special parameters for workflow actions that begin with an "@" sign. Two of these special parameters are for field menus only. They enable you to set up workflow actions that apply to all fields in the events to which they apply.

• @field_name - Refers to the name of the field being clicked on.
• @field_value - Refers to the value of the field being clicked on.

Remember: Workflow actions using the @field_name and/or @field_value parameters are not compatible with event-level menus.

The other special parameters are:

• @sid - Refers to the sid of the search job that returned the event.
• @offset - Refers to the offset of the event in the search job. It is used to distinguish similar events from one another.
• @namespace - Refers to the namespace from which the search job was dispatched.
• @latest_time - Refers to the latest time the event occurred. It is not always available for all fields.

Example - Create a workflow action that applies to all fields in an event

You can update the Google search example discussed above (in the GET link workflow action section) so that it enables a search of the field name and field value for every field in an event to which it applies. All you need to do is change the title to Google this field and value and replace the URI of that action with http://www.google.com/search?q=$@field_name$+$@field_value$. This results in a workflow action that searches on whichever field/value combination you're viewing a field menu for. If you're looking at the field menu for topic=WhatisSplunkknowledge and select the Google this field and value field action, the resulting Google search is topic WhatisSplunkknowledge.

Example - Show the source of an event

This workflow action uses the other special parameters to show the source of an event in your raw search data. Its Title is Show source. The Action type is link and its Link method is get. It's targeted to events that have the following fields: _cd, source, host, index. The URI is /app/$@namespace$/show_source?sid=$@sid$&offset=$@offset$&latest_time=$@latest_time$. Try setting this workflow action up in your app (if it isn't installed already) and see how it works.

Use the $! prefix to prevent escape of URL or HTTP form field values

When you define fields to be used in workflow actions, it is often necessary to escape these fields so they can be safely passed via HTTP to some other external endpoint. In the case of GET workflow actions, it prevents URL escape. In the case of POST workflow actions, it prevents HTTP form escape. Sometimes this escaping may be undesirable. In these cases, you can use the $! prefix to prevent Splunk from automatically escaping the field value.

Example - Passing an HTTP address to a separate browser window

Say you have a GET workflow action that works with a field named http, which has fully formed HTTP addresses as values. This workflow action is designed to simply open a new browser window pointing at the HTTP address value of the http field. This won't work if the new window is opened with an escaped HTTP address. So you use the $! prefix. Where you might normally set the URI field to $http$ for this workflow action in Manager, you instead set it to $!http$ to keep the HTTP address from escaping.

Configure workflow actions through workflow_actions.conf

This topic coming soon. In the meantime, learn how to set up and administrate workflow actions via Manager.
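Until that topic is available, the following is a rough sketch of what a search-type workflow action might look like in workflow_actions.conf, based on the Ruby on Rails example above. The stanza name is arbitrary and the attribute names (type, label, search.search_string, search.earliest, fields, eventtypes, display_location) should be verified against the workflow_actions.conf reference for your Splunk version:

[see_controller_errors]
type = search
label = See other errors for controller $controller$ over past 24h
search.search_string = sourcetype=rails controller=$controller$ error=*
search.earliest = -24h
fields = controller, error
eventtypes = controller_error
display_location = both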

This topic explains how to: 86 .1. You can assign one or more tags to any field/value combination (including event type. you might find that you have two host names that relate to the same computer. so you can search on them with one simple command. Tag that IPaddress value as mainoffice. You can also tag each IP address based on its location. you'll probably be using the Tags pages in Manager to curate the various collections of tags created by users of your Splunk implementation.Data normalization: Tags and aliases About tags and aliases About tags and aliases In your data.2. To search for all routers in San Francisco that are not in Building1. which refers to the IP addresses of the data sources within your company intranet. for example: SF or Building1. However. a method discussed in detail in "Tag and alias field values. To help you search more efficiently for these particular groups of event data. source. Most users will go with the simplest method--tagging field/value pairs directly in search results. host. An IP address of a router located in San Francisco inside Building 1 could have the tags router. you might have groups of events with related field values. • Use one tag to group a set of field values together. You can make IPaddress useful by tagging each IP address based on its functionality or location. • Give specific extracted fields multiple tags that reflect different aspects of their identity. like IP addresses or ID numbers. Example: Let's say you have an extracted field called IPaddress. and then search on that tag to find events with that IP address." in the User manual. You can use tags to: • Help you track abstract field values. see the following example. To understand how this could work.168. and Building1. You can tag all of your routers' IP addresses as router. you'd search for the following: tag=router tag=SF NOT (tag=Building1) Define and manage tags Define and manage tags Splunk provides a set of methods for tag creation and management. For example. you could have an IP address related to your main office with the value 192. When you search on that tag. you can assign tags to their field values. For example. as a knowledge manager. SF. You could give both of those values the same tag. which enable you to perform tag-based searches that help you quickly narrow down the results you want. or source type). Splunk returns events involving both host name values.

They enable you to quickly get a picture of the associations that have been made between tags and field/value pairs over time. Using the Tags pages in Manager The Tags pages in Manager provide three views of the tags in your Splunk implementation: • Tags by field value pair(s). You can also use this page to manage the permissions around the ability to manage a particular field/value combination with tags. locate that pairing and click on it in the Field::Value column. To see the list of tags for a specific field/value pair. Here's an example of a set of tags that have been defined for the eventtype=auditd_create field/value pair: 87 . Managing tag sets associated with specific field value pairs What if you want to see a list of all of the field/value pairs in your system that have tags associated with them? Furthermore. what if you want to review and even update the set of tags that are associated with a specific field/value pairing? Or define a set of tags for a particular field/value pair? The Tags by field value pair(s) Manager page is the answer to these questions. • List by tag name • Tags by unique ID. which you access by clicking All tag objects on the tags page. Navigate to the Tags pages by selecting Manager > Tags. They also allow you to create and remove these associations. Each of these pages enables you to manage your tag collection in different ways. It enables you to review and edit the tag sets that have been associated with particular field/value pairs. • Create new tags through Manager. This takes you to the detail page for the field/value pair. which you access by clicking List by field value pair(s) on the Tags page. • Disable or delete tags with Manager.• Use the tags pages in Manager to manage tags for your Splunk implementation.

what if you want to review and even update the set of field/value pairings that are associated with a specific tag? Or define a set of field/value pairings for a new tag? These questions are answered by the List by tag name Manager page. This practice aids with data normalization. however. the system enables you to define a set of tags for a new field/value pair. or associating existing tags with a different kind of field/value pair than they were originally designed to work with. 88 . It enables you to review and edit the sets of field/value pairs that have been associated with specific tags. When you create or update a tag list for a field/value pairing. When you click New on the Tags by field value pair(s) page. and can reduce confusion on the part of your users. The system will not prevent you from defining a list of tags for a nonexistent field/value pair. keep in mind that you may be creating new tags. (For more information see the "Organize and administrate knowledge objects" chapter of this manual. and click on the tag name in the Tag column. As a knowledge manager you should consider sticking to a carefully designed and maintained set of tags. locate the tag in the List by tag name.) Note: You may want to verify the existence of a field/value pair that you add to the Tags by field/value pair(s) page. Reviewing and updating sets of field value pairs associated with specific tags What if you want to see a list of all of the tags in your system that have one or more tags associated with them? Furthermore.You can add more tags. To see the list of field/value pairings for a particular tag. This page does not allow you to manage permissions for the set of field/value pairs associated with a tag. This takes you to the detail page for the tag. Here's an example displaying the various field/value pairings that the modify tag has been associated with. and delete them as well (if you have the permissions to do so).

When you create or update a set of field/value pairings for a tag. the system enables you to define a set of field/value pairings for a new tag.) Reviewing all unique field/value pair and tag combinations The Tags by unique ID page breaks out all of the unique tag name and field/value pairings in your system. (For more information see the "Organize and administrate knowledge objects" chapter of this manual. You may want to verify the existence of field/value pairs that you associate with a tag. As a knowledge manager you should consider sticking to a carefully designed and maintained set of tags. keep in mind that you may be creating new field/value pairings. or vice versa. This practice aids with data normalization. or if you want to maintain permissions at that level of granularity. Tags may already exist that serve the purpose you're trying to address. this page only lets you edit one-to-one relationships between tags and field/value pairs. When you click New on the List by tag name page. Disabling and deleting tags If you have a tag that you no longer want to use. you have the option of either disabling it or removing it. you can: 89 . and can reduce confusion on the part of your users. This page is useful especially if you want to disable or clone a particular tag and field/value association. and delete them as well (if you have the permissions to do so). Unlike the previous two pages. You can search on a particular tag to quickly see all of the field/value pairs with which it's associated. or want to have associated with a particular field/value pairing. If you have the permissions to do so.You can add field/value associations. The system will not prevent you from adding nonexistent field/value associations. Be wary of creating new tags.

Disable or delete the associations between a field/value pairing and a set of tags Use this method to bulk-remove the set of tags that is associated to a field/value pair. 90 . you don't have permission to delete it. try to be aware of downstream dependencies by their removal. even if it is associated to multiple field values. Delete the field/value pair. • Bulk disable or delete the associations between a field/value pair and a set of tags via the Tags by field value pair(s) page. • Bulk disable or delete a tag. When an association between a tag and a field/value pair is disabled.• Remove a tag association for a specific field/value pair in the search results. it stays in the system but is inactive until it is enabled again. see "Tag and alias field values" in the User manual. Delete the tag. you don't have permission to delete it. Navigate to Manager > Tags > List by tag name. Disable tags Depending on your permissions to do so. Note: You can also go into the edit view for a particular tag and delete a field/value pair association directly. Create aliases for fields Create aliases for fields You can create multiple aliases for a field. via the List by tag name page. Navigate to Manager > Tags > Tags by field value pair(s). If you don't see a delete link for the field/value pair. If you don't see a delete link for the tag. This method enables you to get rid of these associations in a single step. For more information. however. even if it is associated with dozens of field/value pairs. see "Curate Splunk knowledge with Manager" in this manual. Delete a tag with multiple field/value pair associations You can use Splunk Manager to completely remove a tag from your system. When you delete these associations. see "Curate Splunk knowledge with Manager" in this manual. This process enables you to search for the original field using any of its aliases. The original field is not removed. you can also disable tag and field/value associations using the three Tags pages in Manager. For more information. It does not remove the field/value pairing from your data. When you delete tags. This method enables you to get rid of all of these associations in one step. try to be aware of downstream dependencies that may be adversely affected by their removal. For information about deleting tag associations with specific field/value pairs in your search results. Note: You can also go into the edit view for a particular field value and delete a tag association directly.

Important: Field aliasing is performed after key/value extraction but before field lookups. Therefore, you can specify a lookup table based on a field alias. This can be helpful if there are one or more fields in the lookup table that are identical to fields in your data, but have been named differently. For more information about field lookups, see "Create field lookups from external data sources" in this manual.

You can define aliases for fields that are extracted at index time as well as those that are extracted at search time. You add your field aliases to props.conf, which you edit in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. (We recommend using the latter directory if you want to make it easy to transfer your data customizations to other index servers.)

Note: Splunk's field aliasing functionality does not currently support multivalue fields.

To alias fields:

1. Add the following line to a stanza in props.conf:

FIELDALIAS-<class> = (<orig_field_name> AS <new_field_name>)+

• <orig_field_name> is the original name of the field.
• <new_field_name> is the alias to assign to the field.
• You can include multiple field alias renames in one stanza.

2. Restart Splunk for your changes to take effect.

Example of field alias additions for a lookup

Say you're creating a lookup for an external static table CSV file where the field you've extracted at search time as "ip" is referred to as "ipaddress." In the props.conf file where you've defined the extraction, you would add a line that defines "ipaddress" as an alias for "ip," as follows:

[accesslog]
EXTRACT-extract_ip = (?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
FIELDALIAS-extract_ip = ip AS ipaddress

When you set up the lookup in props.conf, you can just use ipaddress where you'd otherwise have used ip:

[dns]
lookup_ip = dnsLookup ipaddress OUTPUT host

For more information about search time field extraction, see "Add fields at search time" in this manual. For more information about lookups, read "Look up fields from external data sources" in this manual.
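Because the FIELDALIAS syntax accepts more than one rename per setting, a single stanza can alias several extracted fields at once. The field names in this sketch (cs_host and cs_username) are hypothetical and only illustrate the pattern:

[accesslog]
FIELDALIAS-normalize = cs_host AS dest_host cs_username AS user

Both aliases are applied at search time, and the original cs_host and cs_username fields remain available alongside dest_host and user.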

to enable users to easily search for all activity on a group of similar servers. tagging that host with compliance will help your compliance searches. and for crafting more precise searches. Add a tag to the host field with Splunk Web To add a tag to a host field/value combination in Splunk Web: 1. separated by commas or spaces. You can tag the host field with one or more words. If you've changed the value of the host field for a given input. Tagging the host field with an alternate hostname doesn't change the actual value of the host field.In the search results. but multiple host tags. tagging the host field The value of the host field is set when an event is indexed. if your Splunk server is receiving compliance data from a specific host. It can be set by default based on the Splunk server hostname. Each event can have only one host name. 2. but the data that already exists in your index will have the old value. Host names vs. use the drop-down menu next to the host field value that you'd like to tag and choose Tag host=<current host value>. or extracted from each event's data. For example. and click Ok. With host tags. Use this to group hosts by function or type. The Tag This Field dialog box appears. but it lets you search for the tag you specified instead of having to use the host field value. 3.Perform a search for data from the host you'd like to tag. you can create a loose grouping of data without masking or changing the underlying host name. set for a given input. Tagging the host field for the existing data lets you search for the new host value without excluding all the existing data. You might also want to tag the host field with another host name if you indexed some data from a particular input source and then decided to change the value of the host field for that input--all the new data coming in from that input will have the new host field value.Tag the host field Tag the host field Tagging the host field is useful for knowledge capture and sharing. you can also tag events that are already in the index with the new host name to make it easier to search across your data set. 92 . Enter your tag or tags.

you can tag all firewall event types as firewall. Add tags to event types using Manager Splunk Manager enables you to view and edit lists of event types. Once an event type is tagged. • Select Event types. • Click the Manager link in the upper right-hand corner.conf. • Click Save to confirm your changes. Note: You can tag an event type when you create it in Splunk Web or configure it in eventtypes. you can search for it in the search bar with the syntax tag::<field>=<tagname> or tag=<tagname>: tag=foo tag::host=*local* 93 . For example. tag a subset of firewall event types as deny and tag another subset as allow. They also have role-based permissions that can prevent you from seeing and/or editing them. add or edit tags in the Tags field. Once you have tagged an event type. ♦ Note: Keep in mind that event types are often associated with specific Splunk apps. • On the detail page for the event type. any event type matching the tagged pattern will also be tagged. Any event type can have multiple tags. • Locate the event type you want to tag and click on its name to go to its detail page.Tag event types Tag event types Tag event types to add information to your data.
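Behind the scenes, tags created in Splunk Web are stored in tags.conf. As a rough sketch, tagging the firewall event types described above would produce stanzas along these lines (the event type names here are examples, not definitions shipped with Splunk):

[eventtype=firewall_allow]
firewall = enabled
allow = enabled

[eventtype=firewall_deny]
firewall = enabled
deny = enabled

With these stanzas in place, a search such as tag=firewall matches events of either event type, while tag=deny narrows the results to the deny subset.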

max_searches_per_cpu is set to two searches for every CPU in your system plus two. However. 94 .Manage your search knowledge Manage saved searches Manage saved searches Content coming soon For a basic overview of saving searches and sharing them with others. To understand the necessity of these two scheduler options. see "Schedule saved searches" in the User manual. The options are real-time scheduling and continuous scheduling: • Real-time scheduling ensures that scheduled searches are always run over the most recent time range. This topic will discuss saved searches from a knowledge management perspective. see "Save searches and share search results" in the User manual. These settings are managed at the saved search level via savedsearches. By default. Splunk automatically changes its scheduling option to continuous. the scheduler can only run one search at a time (1 = 25% of 4). but when a scheduled search is enabled for summary indexing. For more information about scheduling saved searches. including the use of the Saved search page in Manager.conf. How the search scheduler handles concurrent searches The Splunk search scheduler limits the number of scheduled searches that can be run concurrently. even when a number of searches are scheduled to run at approximately the same time and the scheduler can only run one search concurrently.conf. you need to understand how the search scheduler handles concurrent searches. set by the max_searches_perc setting in limits. • Continuous scheduling ensures that each scheduled run of a search is eventually performed. sets the maximum number of concurrent searches that can be handled by the scheduler to 25% of the max_searches_per_cpu value. they are always given priority over searches with continuous scheduling. The default. Configure the priority of scheduled searches Configure the priority of scheduled searches This topic discusses the two options you can use to control the priority of concurrent scheduled searches with the search scheduler. Splunk gives all scheduled searches real-time scheduling by default. Because of the way it works. So if your system only has one CPU. searches with real-time scheduling can end up skipping scheduled runs. even if the result is that those searches are delayed.

but you have multiple searches scheduled to run on an hourly basis over the preceding hour's data. but each search returns information for the time frame over which it was scheduled to run. and schedules it to run again at 1:06pm. The scheduler wakes up and attempts to run search A. the scheduler skips the 1:05-1:06 run of the search and schedules the next run of search A for 1:07:00pm (for the 1:06 to 1:07 period). At this point what 1:06:59pm happens next depends on whether search A is using real-time or continuous scheduling (see below). given how the scheduler works. say you have two saved. the scheduler does not advance the schedule and attempts to run the search for the 1:05 to 1:06pm period indefinitely. Real-time scheduling is the default for all scheduled searches. how is real-time scheduling different from continuous scheduling. if your scheduler can only run one search at a time.Note: We strongly recommend that you avoid changing limits. Because it takes 2 minutes to run. the next time period that search A would cover would be 1:06 to 1:07pm. but it cannot run because search B is still in process. It assumes there won't be any problems if some scheduled searches are 95 . The scheduler runs search B. and under what conditions would you prefer one option over the other? First. scheduled searches that for the purpose of simplicity we'll call A and B: • Search A runs every minute and takes 30 seconds to complete • Search B runs every 5 minutes and takes 2 minutes to complete Let's also say that you have a Splunk configuration that enables the search scheduler to run only one search at a time. • continuous scheduling. If search A is configured to have: • real-time scheduling. So. It is 1:05:30pm when search A completes. Example of real-time scheduling versus continuous scheduling So. and whatever the eventual search run time is. Both searches are scheduled to run at 1:05pm. It's designed to ensure that the search returns current data. Time 1:05:00pm 1:05:30pm 1:06:00pm Scheduler action The scheduler runs A for the 1:04 to 1:05 period. The scheduler continues to attempt to run search A until 1:06:59. search B won't complete until 1:07:30. what happens? The scheduler lines the searches up and runs them in consecutive order for the scheduled time period.conf settings unless you know what you are doing. The new search run time is based on the current scheduled run time (1:06:00pm).

edit. it always tries to run the real-time searches first.conf. The scheduler is designed to give searches with real-time scheduling priority over those with continuous scheduling. For more information. Configure the realtime_schedule option The system uses the realtime_schedule option in savedsearches. Because searches can't always run concurrently with others. but include a graphical user interface component. Design form searches 96 . you can reuse chunks of a search in multiple places. Splunk automatically sets this value to 0 for any scheduled search that is enabled for summary indexing. Design macro searches Design macro searches To simplify managing your searches. though you may find other uses for it. Splunk changes its scheduling option to continuous automatically. whether its a saved search or an ad hoc search.conf reference in the Admin Manual. see "Create and use search macros" in the User Manual and the macros. When a search is enabled for summary indexing. Continuous scheduling is used for situations where problems arise when there's any gap in the collection of search data. With macros. Configure and manage search macros You can view. This setting ensures that scheduled search periods are never skipped. • Set realtime_schedule to 0 to use continuous scheduling. as long as it returns up-to-the minute results in the most recent run of the search. Note: For more information about summary index searches.skipped. Note: Form searches also use search macros. These search macros can by any part of a search. which are parametrized chunks of a search. realtime_schedule= 0 | 1 • Set realtime_schedule to 1 to use real-time scheduling. This is set individually for each saved and scheduled search. such as an eval statement or search term. you can create saved searches that include macros. see "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual. this means that it may skip some search periods. With this setting the scheduler makes sure that it is always running the search over the most recent time range. This is the default value for a scheduled search.conf to determine the next run time of a scheduled search. and do not need to be a complete command. You can also specify whether or not the search macros take any or no arguments. and create search macros using Splunk Web's Manager > Advanced Search > Search macros page and macros. In general this is only important for searches that populate summary indexes.

Set up a default collection Each app should have a default collection set up for "unclassified" searches. For details on how to adjust the XML code for the navigation menu. see "Build navigation for your app" in the Developer manual. In the Search app. 97 ." "500. • Radio buttons that force the choice of particular field values (such as error codes like "404. Note: A default collection should also be set up for unclassified views and dashboards. • Multiple result panels that take the values received from one form and plug them into various hidden searches that in turn generate different kinds of charts and reports. When you do this. Define navigation to saved searches and reports Define navigation to saved searches and reports As a knowledge manager you should ensure that your saved searches and reports appear in the top-level navigation menus of your Splunk apps in a logical manner that facilitates ease of discovery. the default collection is Searches & Reports. as saved searches and reports are added without subsequent categorization. If you do not set up a default collection. This is the collection in which all newly saved searches appear. Form searches are created with XML code similar to that used for the construction of dashboards in Splunk. • Dropdown lists containing dynamically defined collections of search terms. To do this you need to customize the navigation menus for your apps. keep in mind that the nav code refers to lists of searches and reports as collections. They can include things like: • Open fields that take specific field values (such as user names or ID numbers) and can also display default values. Unclassified searches are any searches that haven't been explicitly identified in the nav menu code. for example." or "503").Design form searches Form searches are simplified search interfaces that help guide users in the creation of specific kinds of searches. The following subtopics describe various things you can do to organize your saved search and reports listings in the top-level navigation menu. For more information. If you fail to attend to your navigation menus. you need to work with the code behind the nav menu. see the "Forms: an introduction" chapter of the Developer manual. you will have to manually add saved searches to the nav code to see them in your app's top-level navigation menu. and inefficient. To manage the way your searches are saved and organized in the top-level navigation menu for an app. over time they may become overlong.

only saved searches and reports that are available to the app with which the navigation menu is associated are displayed. nested collections are used to group similar types of searches together: Dynamically group together saved searches Collections can be set up to dynamically group together saved searches that have matching substrings in their names. in the Search app example above. Note: In both cases. Going further. In the Search app. You can manually construct collections that group lists together by function. which means that the collection displays all searches with the matching substring whether or not they appear elsewhere in the navigation menu.Organize saved searches in nested collections As the number of saved searches and reports that are created for an app grows. which means that the collection only displays searches that haven't been manually added to another collection. a nested collection groups together all uncategorized searches with the string "admin" in their titles. For example. There are two ways that saved searches can be dynamically grouped together with matching substrings: • As a collection of uncategorized substring-matching searches. • As a collection of all substring-matching searches. you're going to want to find ways to organize those searches in a logical manner. you can set up nested collections that subdivide large collections into groups of smaller ones. 98 .
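To make this concrete, here is a minimal sketch of what such nav XML might look like in an app's default.xml. The collection labels are examples, and the element and attribute names should be checked against "Build navigation for your app" in the Developer manual:

<nav>
  <collection label="Searches &amp; Reports">
    <collection label="Admin reports">
      <saved source="unclassified" match="admin" />
    </collection>
    <saved source="unclassified" />
  </collection>
</nav>

In this sketch the nested "Admin reports" collection dynamically gathers uncategorized searches whose names contain "admin", and the outer collection serves as the default home for all other unclassified searches.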

Plainly put. or the following Monday. on a frequent basis. If you only have to do this on an occasional basis. However. Each time Splunk runs this search it saves the results into a summary index that you designate.Run reports over long time ranges for large datasets more efficiently: Imagine you're using Splunk at a company that indexes tens of millions of events per day. And what's more. you set up a search that extracts the precise information you want. Perhaps an even more important advantage of summary indexing is its ability to amortize costs over different reports. Summary indexing allows the cost of a computationally expensive report to be spread over time. In the example we've been discussing. among other things. because Splunk has to sort through a huge number of events that are totally unrelated to web traffic in order to extact the desired data. the hourly search to populate the summary index with the previous hour's worth of data would take a fraction of a minute. It could also be used for a monthly report that needed the average response size per day. it can take a lot of time to search through very large data sets. With summary indexing. 99 . Use summary indexing to efficiently report on large volumes of data. leading to a lot of frustrated users. these reports will be statistically accurate because of the frequency of the index-populating search (for example. it may not be an issue. if you want to manually run searches that cover the past seven days. as well as for the same report over a different but overlapping time range. You could run this report on your primary data volume.Set up and use summary indexes Use summary indexing for increased reporting efficiency Use summary indexing for increased reporting efficiency Splunk is capable of generating reports of massive amounts of data (100 million events and counting). but its runtime would be quite long. You can then run searches and reports on this significantly smaller (and thus seemingly "faster") summary index. Generating the complete report without the benefit of summary indexing would take approximately 168 (7 days * 24 hrs/day) times longer. Thursday. broken out by site. Summary indexing use cases Example #1 . But running such reports on a regular schedule can be impractical--and this impracticality only increases exponentially as more and more users in your organization use Splunk to run similar reports. You want to set up a dashboard for your employees that. displays a report that shows the number of page views and visitors each of your Web sites had over the past 30 days. The same summary data generated on a Tuesday can be used for a report of the previous 7 days done on the Wednesday. you might run them on a summary index that is updated on an hourly basis). But that's not all--the fact that the report is included in a popular dashboard means it'll be run frequently. and this could significantly extend its average runtime. the amount of time it takes to compute such reports is directly proportional to the numbers of events summarized.

sistats. and the report should complete far faster than it would otherwise because it is searching on a smaller and better-focused dataset. Example #2 . with the exception that you use regular reporting commands in the latter search. If you create summary indexes using those methods and they work for you there's no need to update them. Then.But if you use summary indexing. because they create slightly larger indexes than the "manual" method does. sitop.summary index search commands if you are proficient with the "old-school" way of creating summary-index-populating searches. and sirare) when you define the search that will populate the summary index. daily. You'll then run your month-end report on this smaller summary index. or even hourly basis. if you wanted to run a search on the finished summary index that gave you average response times broken out by server. especially if the search you wanted to run on the finished summary index involved aggregate statistics. In fact. but you may notice a difference if the summary indexes you are creating are themselves fairly large. For example. If you use these commands you can use the same search string that you use for the search that you eventually run on the summary index. use the summary indexing reporting commands (sichart. 100 . Note: You do not have to use the si.command search. sitimechart.commands. for example. you can set up a saved search that collects website page view and visitor information into a designated summary index on a weekly. Defining index-populating searches without the special commands In previous versions of Splunk you had to be very careful about how you designed the searches that you used to populate your summary index. You can then run a report any time you want on the data in the summary index to obtain the latest count of the total number of downloads. you'd want to set up a summary-index-populating search that: • is scheduled to run on a more frequent basis than the search you plan to run against the summary index • samples a larger amount of data than the search you plan to run against the summary index. because it meant that you had to carefully set up the "index-populating" search in a way that did not provide incorrect results. In most cases the impact is insignificant. First. they may be more efficient: there are performance impacts related to the use of the si. You may also notice performance issues if you're setting up several searches to report against an index populated by an si. Using the summary indexing reporting commands If you are new to summary indexing. schedule a saved search to return the total number of downloads over a specified slice of time.Building rolling reports: Say you want to run a report that shows a running count of an aggregated statistic over a long period of time--a running count of downloads of a file from a Web site you manage. For another view. you can watch this Splunk developer video about the theory and practice of summary indexing. use summary indexing to have Splunk save the results of that search into a summary index.

collecting the top src_ip values for only the previous 24 hours each time. the only valid retrieval of the data is: index=<summary> source=<saved search name> | stats <args>. If you would like more information about setting up summary-index-populating searches that do not use the special summary index reporting commands. because we plan to run searches that cover a timespan of a year. but it takes forever to run because it scans across your entire index each time. Now. you still should arrange for the summary-index-populating search to run on a more frequent basis than the search that you later run against the summary index. The results of each daily search are added to an index named "summary": eventtype=firewall | sitop src_ip Note: Summary-index-populating searches are statistically more accurate if you schedule them to run and sample information on a more frequent basis than the searches you plan to run against the finished summary index.• contains additional search commands that ensure that the index-populating search is generating a weighted average. So in this example. In other words. with a time range of the past year: eventtype=firewall | top src_ip This search gives you the top source ips for the past year. Summary indexing reporting command usage example Let's say you've been running the following search. Save the extra search operators for the searches you run against the summary indexes. However. The search against the summary index cannot create or modify fields before the | stats <args> command. Important: The results from a summary-indexing optimized search are stored in a special format that cannot be modified before the final transformation is performed... do not pipe other search operators after the main summary indexing reporting command. You would schedule it to run on a daily basis. What you need to do is create a summary index that is composed of the top source IPs from the "firewall" event type. After your summary index is populated with results. | sistats <args>. this is the search you would use to get the top source_ips over the past year: 101 . let's say you save this search with the name "Summary . For example. not the search you use to populate it. don't include additional | eval commands and the like. You can use the following search to build that summary index. The summary index reporting commands take care of the last two points for you--they automatically determine the adjustments that need to be made so that your summary index is populated with data that does not produce statistically inaccurate results.firewall top src_ip" (all saved summary-index-populating searches should have names that identify them as such). Important: When you define summary-index-populating searches. search and report against that summary index using a search that specifies the summary index and the name of the search that you used to populate it. we set up a summary-index-populating search that samples information on a daily basis. This means that if you populate a summary index with . see "Configure summary indexes" in the Knowledge Management manual.

Splunk changes their sourcetype values to "stash" and moves the original sourcetype values to orig_sourcetype. even if the time range is a year or more. collecting information for each hour. Schedule the search to run on an appropriate interval. 102 . Note: Be sure to schedule the search so that there are no data gaps and overlaps. Summary indexing is an alert option for saved. This search should run fairly quickly. Setting up summary index searches in Splunk Web You can set up summary index searches through the Splunk Web interface. Remember that searches that populate summary indexes should run on a fairly frequent basis in order to create statistically accurate final reports. 2. Select Schedule this search if the search isn't already scheduled.. it filters out other data that have been placed in the summary index by other summary indexing searches.. below. Why do you have to do this? When events are gathered into a summary index. Go to the Search details page for the search. you might have the summary index collect data on a daily basis for the past day. you should have the summary search run on an hourly basis. scheduled searches. follow these steps: 1. If you're running searches over the past year's worth of data. For more on this see the subtopic on this issue. be aware that you need to use orig_sourcetype instead. If the search you're running against the summary index is gathering information for the past week. Once you determine the search that you want to use to populate a summary index. either by clicking Save search in the Search or Report Builder interface.|stats timechart avg(ip) by sourcetype.firewall top src_ip" |top src_ip Because this search specifies the search name.|stats timechart avg(ip) by orig_sourcetype.index=summary search_name="summary . Note: If you are running a search against a summary index that queries for events with a specific sourcetype value.. So instead of running a search against a summary index like . use .. or through the Searches and Reports page in Manager by selecting the name of a previously saved search or clicking New.

Note: You can also add field/value pairs to the summary index configuration in savedsearches. 4.conf. see "Configure summary indexes" in the Knowledge Manager manual. see "Set up multiple indexes" in the Admin manual. 6. These key/value pairs will be annotated to each event that gets summary indexed. and setting up alerts for searches. select a Perform actions value of always. see "Save searches and share search results". you can add field/value pairs to the summary index definition. Splunk will run the search on the schedule you've defined. The Summary index is the default summary index. For example. 103 . scheduling. making it easier to find them with later searches. Under Alert conditions. select Enable summary indexing. It's a good idea to create indexes that are dedicated to the collection of summary data. Enter the name of the summary index that the search will be populating. 5. For more information about saving. its data will not get saved to a summary index. Under Alert actions. "Schedule saved searches". you could add the name of the saved search populating the summary index (report=summary_firewall_top_src_ip) or the name of the index that the search populates (index=summary). and then search on those terms later. Note: If you enter the name of an index that does not exist. You may need to create additional summary indexes if you plan to run a variety of summary index searches. (Optional) Under Add fields.3. For more information. in this manual. and "Set alert conditions for scheduled searches". For information about creating new indexes.
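If you prefer to review the result in configuration files, enabling summary indexing for a scheduled search produces settings in savedsearches.conf roughly like the sketch below. The search name and cron schedule are examples, and the complete set of attributes is covered in "Configure summary indexes" in this manual:

[Summary - firewall top src_ip]
search = eventtype=firewall | sitop src_ip
enableSched = 1
cron_schedule = 0 0 * * *
action.summary_index = 1
action.summary_index._name = summary
action.summary_index.report = summary_firewall_top_src_ip

Here action.summary_index._name identifies the target summary index, and the action.summary_index.report line is an example of an extra field/value pair that gets annotated onto every summarized event so you can find those events later.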

How summary indexing works

In Splunk Web, summary indexing is an alert option for scheduled saved searches. When you run a saved search with summary indexing turned on, its search results are temporarily stored in a file ($SPLUNK_HOME/var/spool/splunk/<savedsearch_name>_<random-number>.stash). From the file, Splunk uses the addinfo command to add general information about the current search and the fields you specify during configuration to each result. Splunk then indexes the resulting event data in the summary index that you've designated for it (index=summary by default).

Note: Use the addinfo command to add fields containing general information about the current search to the search results going into a summary index. General information added about the search helps you run reports on results you place in a summary index.

Schedule the populating search to avoid data gaps and overlaps

To minimize data gaps and overlaps, be sure to set appropriate intervals and delays in the schedules of searches you use to populate summary indexes.

Gaps in a summary index are periods of time when a summary index fails to index events. Gaps can occur if:
• splunkd goes down.
• the summary-index-populating search takes too long to run and runs past the next scheduled run time. For example, if you were to schedule the search that populates the summary to run every 5 minutes when that search typically takes around 7 minutes to run, you would have problems, because the search won't run again when it's still running a preceding search.

Overlaps are events in a summary index (from the same search) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved search to be longer than the frequency of the schedule of the search. In other words, don't arrange for an hourly search to gather data for the past 90 minutes.

Note: If you think you have gaps or overlaps in your summary index data, Splunk provides methods of detecting them and either backfilling them (in the case of gaps) or deleting the overlapping events. For more information, see "Manage summary index gaps and overlaps" in the Knowledge Manager manual.

Summary indexing of data without timestamps

To set the time for summary index events, Splunk uses the following information, in this order of precedence:
1. The _time value of the event being summarized
2. The earliest (or minimum) time of the search
3. The current system time (in the case of an "all time" search, where no "earliest" value is specified)

In the majority of cases your events will have timestamps, so the first method of discerning the summary index timestamp holds. But if you are summarizing data that doesn't contain an _time field (such as data from a lookup), the resulting events will have the timestamp of the earliest time of the search. This means that every event without an _time field value that is found by this summary-index-populating search will be given the exact same _time value: the search's earliest time.

For example, if you summarize the lookup "asset_table" every night at midnight, the asset table does not contain an _time column, and you have set the time range of the search to be between -24h and +0s, tonight's summary will have an _time value equal to the earliest time of the search. In other words, each summarized event will have an _time value of now()-86400 (that's the start time of the search minus 86,400 seconds, or 24 hours).

The best practice for summarizing data without a time stamp is to manually create an _time value as part of your search. Following on from the example above:

|inputlookup asset_table | eval _time=now()

Manage summary index gaps and overlaps

The accuracy of your summary index searches can be compromised if the summary indexes involved have gaps or overlaps in their collected data.

Gaps in a summary index are periods of time when a summary index fails to index events. Gaps in summary index data can come about for a number of reasons:
• A summary index initially only contains events from the point that you start data collection: Don't lose sight of the fact that summary indexes won't have data from before the summary index collection start date--unless you arrange to put it in there yourself with the backfill script.
• splunkd outages: If splunkd goes down for a significant amount of time, there's a good chance you'll get gaps in your summary index data, depending on when the searches that populate the index are scheduled to run.
• Searches that run longer than their scheduled intervals: If the search you're using to populate the summary index runs longer than the interval that you've scheduled it to run on, then you're likely to end up with gaps, because Splunk won't run a scheduled search again when a preceding search is still running. For example, if you were to schedule the index-populating search to run every five minutes, you'll have a gap in the index data collection if the search ever takes more than five minutes to run.

Overlaps are events in a summary index (from the same index-populating search) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved search to be longer than the scheduled search interval. In other words, don't arrange for an hourly search to gather data for the past 90 minutes.

Note: For general information about creating and maintaining summary indexes, see "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual.

Use the backfill script to add other data or fill summary index gaps

The fill_summary_index.py script backfills gaps in summary index collection by running the saved searches that populate the summary index as they would have been executed at their regularly scheduled times for a given time range. In addition, if necessary you can use fill_summary_index.py to fill a summary index with data from time ranges that precede the start of summary index data collection.

When you run fill_summary_index.py you can specify an App and schedule backfill actions for a list of summary index searches associated with that App, or simply choose to backfill all saved searches associated with the App.

When you enter fill_summary_index.py commands through the CLI, you provide the information the script needs, including the names of the summary index searches, the authentication information, and the time range. You must provide the backfill time range by indicating an "earliest time" and "latest time" for the backfill operation. You can indicate the precise times either by using relative time identifiers (such as -3d@d for "3 days ago at midnight") or by using UTC epoch numbers. The script automatically computes the times during this range when the summary index search would have been run.

NOTE: To ensure that the fill_summary_index.py script only executes summary index searches at times that correspond to missing data, you must use -dedup true when you invoke it.

The fill_summary_index.py script requires that you provide necessary authentication (username and password). If you know the valid Splunk key when you invoke the script, you can pass it in via the -sk option. The script is designed to prompt you for any required information that you fail to provide in the command line.

Examples of fill_summary_index.py invocation

If this is your situation: You need to backfill all of the summary index searches for the splunkdotcom App for the past month--but you also need to skip any searches that already have data in the summary index.

Then you'd enter this into the CLI:

./splunk cmd python fill_summary_index.py -app splunkdotcom -name "*" -et -mon@mon -lt @mon -dedup true -auth admin:changeme

If this is your situation: You need to backfill the my_daily_search summary index search for the past year, even though your new summary index only started collecting data at the start of this week. The my_daily_search summary index search is owned by the "admin" role. You do not want the script to skip searches that already have data in the summary index, and you want to run no more than 8 concurrent searches at any given time (to reduce impact on Splunk performance while the system collects the backfill data).

Then you'd enter this into the CLI:

./splunk cmd python fill_summary_index.py -app search -name my_daily_search -et -y -lt now -j 8 -owner admin -auth admin:changeme

Note: You need to specify the -owner option for searches that are owned by a specific user or role.

fill_summary_index.py usage and commands

In the CLI, start by entering:

python fill_summary_index.py ...

...and add the required and optional fields from the list below.

Note: <boolean> options accept the values 1, t, true, or yes for "true" and 0, f, false, or no for "false."

• -et <string>: Earliest time (required). Either a UTC time or a relative time string.
• -lt <string>: Latest time (required). Either a UTC time or a relative time string.
• -app <string>: The application context to use (defaults to None).
• -name <string>: Specify a single saved search name. Can specify multiple times to provide multiple names. Use the wildcard symbol ("*") to specify all enabled, scheduled saved searches that have a summary index action.
• -names <string>: Specify a comma separated list of saved search names.
• -namefile <filename>: Specify a file with a list of saved search names, one per line. Lines beginning with a # are considered comments and ignored.
• -index <string>: Identifies the summary index that the saved search populates. If the index is not provided, the backfill script tries to determine it automatically. If this attempt at auto index detection fails, the index defaults to "summary".
• -owner <string>: The user context to use (defaults to "None").
• -auth <string>: The authentication string expects either <username> or <username>:<password>. If only a username is provided, the script requests the password interactively.
• -sleep <float>: Number of seconds to sleep between each search. Default is 5 seconds.
• -j <int>: Maximum number of concurrent searches to run (default is 1).
• -dedup <boolean>: When this option is set to true, the script doesn't run saved searches for a scheduled time if data already exists in the summary index. If this option is unused, its default is false.
• -showprogress <boolean>: When this option is set to true, the script periodically shows the done progress for each currently running search that it spawns. If this option is not used, its default is false.

Advanced options (these should not be used in almost all cases):
• -trigger <boolean>: When this option is set to false, the script runs each search but does not trigger the summary indexing action. If this option is unused, its default is true.
• -dedupsearch <string>: Indicates the search to be used to determine if data corresponding to a particular saved search at a specific scheduled time is present.
• -namefield <string>: Indicates the field in the summary index data that contains the name of the saved search that generated that data.
• -timefield <string>: Indicates the field in the summary index data that contains the scheduled time of the saved search that generated that data.

What to do if fill_summary_index.py is interrupted while running

In the app that you are invoking fill_summary_index.py from (default: 'search'), there will be a 'log' directory. In this directory, there will be an empty temp file named 'fsidx*lock'. Delete the 'fsidx*lock' file and you will be able to restart fill_summary_index.py.

Use the overlap command to identify summary index gaps and overlaps

To identify gaps and overlaps in your data, run a search against the summary index that uses the overlap command. This command identifies ranges of time in the index that include gaps or overlaps.

If you suspect that a particular time range might include gaps and/or overlaps, you can identify it in the search by specifying a start time and end time, or a period and a saved search name, following the | overlap command in the search string.

Use these two commands to define a specific calendar time range:
• StartTime: Time to start searching for missing entries. starttime=mm/dd/yyyy:hh:mm:ss (for example: 05/20/2008:00:00:00).
• EndTime: Time to stop searching for missing entries. endtime=mm/dd/yyyy:hh:mm:ss (for example: 05/22/2008:00:00:00).

Or use these two commands to define a period of time and the saved search to search for missing events with:
• Period: Specify the length of the time period to search. period=<integer>[smhd] (for example: 5m).
• SavedSearchName: Specify the name of the saved search to search for missing events with. savedsearchname=string (NO wildcards).

If you identify a gap, you can run your scheduled saved search over the period of the gap and summary index the results with the backfill script (see above).

If you identify overlapping events, you can manually delete the overlaps from the summary index by using the search language.

Configure summary indexes

For a general overview of summary indexing and instructions for setting up summary indexing through Splunk Web, see the topic "Use summary indexing for increased reporting efficiency" in the Knowledge Manager manual. For more information about saving, scheduling, and setting up alerts for searches, see "Save searches and share search results," "Schedule saved searches", and "Set alert conditions for scheduled searches" in the User manual.

Note: When you define the search that you'll use to build your index, most of the time you should use the summary indexing reporting commands. These commands are prefixed with "si-": sichart, sitimechart, sistats, sitop, and sirare. The searches you create with them should be versions of the search that you'll eventually use to query the completed summary index. The summary index reporting commands automatically take into account the issues that are covered in "Considerations for summary index search definition" below, such as scheduling shorter time ranges for the populating search and setting the populating search to take a larger sample. You only have to worry about these issues if the search that you are using to build your index does not include summary index reporting commands.

If you do not use the summary index reporting commands, you can use the addinfo and collect search commands to create a search that Splunk saves and schedules, and which populates a pre-created summary index. For more information about that method, see "Manually configure a search to populate a summary index" in this topic.

When you save and schedule the search, you need to enter the name of the summary index that the search will populate. You do this through the saved search dialog after selecting Enable summary indexing. The Summary index is the default summary index (the index that Splunk uses if you do not indicate another one). It's a good idea to create indexes that are dedicated to the collection of summary data. If you plan to run a variety of summary index searches, you may need to create additional summary indexes. For information about creating new indexes, see "Set up multiple indexes" in the Admin manual.

Note: If you enter the name of an index that does not exist, Splunk will run the search on the schedule you've defined, but its data will not get saved to a summary index.

You can't manually configure a summary index for a search in savedsearches.conf until the search has been saved, scheduled, and the Enable summary indexing alert option is selected.

Customize summary indexing for a saved, scheduled, summary-index-enabled search

When you use Splunk Web to enable summary indexing for a saved, scheduled search, Splunk automatically generates a stanza in $SPLUNK_HOME/etc/system/local/savedsearches.conf. You can customize summary indexing for the search by editing this stanza. If you've used Splunk Web to save and schedule a search, but haven't used Splunk Web to enable the summary index for the search, you can easily enable summary indexing for the saved search through savedsearches.conf as long as you have a new index for it to populate. For more information about manual index configuration, see the topic "About managing indexes" in the Admin manual.

[<name>]
action.summary_index = 0 | 1
action.summary_index._name = <index>
action.summary_index.<field> = <value>

• [<name>]: Splunk names the stanza based on the name of the saved and scheduled search that you enabled for summary indexing.
• action.summary_index = 0 | 1: Set to 1 to enable summary indexing. Set to 0 to disable summary indexing.
• action.summary_index._name = <index>: This displays the name of the summary index populated by the search. Defaults to summary, the summary index that is delivered with Splunk. If you've created a specific summary index for this search, enter its name in <index>.
• action.summary_index.<field> = <value>: Specify a field/value pair to add to every event that gets summary indexed by this search. You can define multiple field/value pairs for a single summary index search. This key is optional, but we recommend that you never set up a summary index without at least one field/value pair. This field/value pair acts as a "tag" of sorts that makes it easier for you to identify the events that go into the summary index when you are performing searches amongst the greater population of event data. For example, add the name of the saved search that is populating the summary index (action.summary_index.report = summary_firewall_top_src_ip), or the name of the index that the search populates (action.summary_index.index = search).

Search commands useful to summary indexing

Summary indexing utilizes a set of specialized search commands that you need to use if you are manually creating your summary indexes without the help of the Splunk Web interface or the summary indexing reporting commands.
• addinfo: Summary indexing uses addinfo to add fields containing general information about the current search to the search results going into a summary index. Add | addinfo to any search to see what results will look like if they are indexed into a summary index.
• collect: Summary indexing uses collect to index search results into the summary index. Use | collect to index any search results into another index (using collect command options).
• overlap: Use overlap to identify gaps and overlaps in a summary index. overlap finds events of the same query_id in a summary index with overlapping timestamp values, or identifies periods of time where there are missing events.
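As a quick illustration, the following search checks the default summary index for gaps and overlaps. In practice you would usually narrow the search to the events of one populating search first (for example, by the field/value "tag" or marker you configured for it); the index name here is simply the default:

index=summary | overlap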

Manually configure a search to populate a summary index

If you want to configure summary indexing without using the search options dialog in Splunk Web and the summary indexing reporting commands, you must first configure a summary index just like you would any other index via indexes.conf. For more information about manual index configuration, see the topic "About managing indexes" in the Admin manual.

Important: You must restart Splunk for changes in indexes.conf to take effect.

1. Run a search that you want to summarize results from in the Splunk Web search bar.
• Be sure to limit the time range of your search. The number of results that your search generates needs to fit within the maximum search result limits you have set for searching.
• Make sure to choose a time interval that works for your data, such as 10 minutes, 2 hours, or 1 day. (For more information about setting intervals in Splunk Web, see "Scheduling saved searches" in the User Manual.)

2. Use the addinfo search command. Append | addinfo to the end of your search.
• This command adds information about the search to events that the collect command requires in order to place them into a summary index.
• You can always add | addinfo to any search to preview what the results of a search will look like in a summary index.

3. Add the collect search command. Append |collect index=<index_name> addtime=t marker="info_search_name=\"<summary_search_name>\"" to the end of the search.
• Replace index_name with the name of the summary index.
• Replace summary_search_name with a key to find the results of this search in the index.
• A summary_search_name *must* be set if you wish to use the overlap search command on the generated events.
(A complete example search is sketched after these steps.)

Note: For the general case we recommend that you use the provided summary_index alert action. Configuring via addinfo and collect requires some redundant steps that are not needed when you generate summary index events from scheduled searches. Manual configuration remains necessary when you backfill a summary index for time ranges that have already transpired.
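Putting the three steps together, a manually configured populating search might look like the following sketch. This is an illustration only: the target summary index name (summary_web) and the search name are assumptions, and the source search is modeled on the Apache example later in this topic.

index=apache_raw startminutesago=35 endminutesago=5 | stats count by method | addinfo | collect index=summary_web addtime=t marker="info_search_name=\"apache method counts\""

Once this search is saved and scheduled, each run appends its results to summary_web, and the info_search_name marker lets you isolate those events later, including when you run the overlap command against them.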
Considerations for summary index search definition

If for some reason you're going to set up a summary-index-populating search that does not use the summary indexing reporting commands, you should take a few moments to plan out your approach. With summary indexing, the egg comes before the chicken. Use the search that you actually want to report on to help define the search you use to populate the summary index.

Many summary searches involve aggregated statistics--for example, a report where you are searching for the top 10 ip addresses associated with firewall offenses over the past day--when the main index accrues millions of events per day. If you populate the summary index with the results of the same search that you run on the summary index, you'll likely get results that are statistically inaccurate. You should follow these rules when defining the search that populates your summary index to improve the accuracy of aggregated statistics generated from summary index searches.
Schedule a shorter time range for the populating search

The search that populates your summary index should be scheduled on a shorter (and therefore more frequent) interval than that of the search that you eventually run against the index. You should go for the smallest time range possible. For example, if you need to generate a daily "top" report, then the report populating the summary index should take its sample on an hourly basis.
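As a rough sketch, the scheduling half of that arrangement might look like this in savedsearches.conf, following the example stanza later in this topic. The stanza name is hypothetical; the cron expression simply runs the populating search at the top of every hour, while the report you run against the summary index is executed daily:

# hypothetical hourly populating search
[Hourly Top Source IPs]
enableSched = 1
# run at the top of every hour
schedule = 0 * * * *
action.summary_index = 1
action.summary_index._name = summary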
Set the populating search to take a larger sample

The search populating the summary index should seek out a significantly larger sample than the search that you want to run on the summary index. So, for example, if you plan to search the summary index for the daily top 10 offending ip addresses, you would set up a search to populate the summary index with the hourly top 100 offending ip addresses. This approach has two benefits--it ensures a higher amount of statistical accuracy for the top 10 report (due to the larger and more-frequently-taken overall sample) and it gives you a bit of wiggle room if you decide you'd rather report on the top 20 or 30 offending ips. The summary indexing reporting commands automatically take a sample that is larger than the search that you'll run to query the completed summary index, thus creating summary indexes with event data that is not incorrectly skewed. If you do not use those commands, you can use the head command to select a larger sample for the summary-index-populating search than for the search that you run on the summary index. In other words, you would have | head 100 for the hourly summary-index-populating search, and | head 10 for the daily search of the completed summary index.
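For instance, a sketch of this pattern without the si- commands might pair an hourly populating search that keeps the top 100 with a daily report that sums the hourly counts and keeps the top 10. The event type, field names, and tag value below are illustrative, not prescribed by this manual.

Hourly populating search (with summary indexing enabled and tagged, for example, with report="hourly_top_src_ip"):

eventtype=firewall | stats count by src_ip | sort -count | head 100

Daily report run against the summary index:

index=summary report="hourly_top_src_ip" | stats sum(count) as count by src_ip | sort -count | head 10

Summing the hourly counts before taking the daily top 10 keeps the aggregation honest; this is the same issue the si- commands handle for you automatically.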
Set up your search to get a weighted average

If your summary-index-populating search involves averages, and you are not using the summary indexing reporting commands, you need to set that search up to get a weighted average. For example, say you want to build hourly, daily, or weekly reports of average response times. To do this, you'd generate the "daily average" by averaging the "hourly averages" together. Unfortunately, the daily average becomes skewed if there aren't the same number of events in each "hourly average". You can get the correct "daily average" by using a weighted average function. The following expression calculates the daily average response time correctly with a weighted average by using the stats and eval commands in conjunction with the sum statistical aggregator. In this example, the eval command creates a daily_average field, which is the result of dividing the total response time sum by the total response time count. (A sketch of the corresponding hourly populating search follows the expression below.)


| stats sum(hourly_resp_time_sum) as resp_time_sum, sum(hourly_resp_time_count) as resp_time_count | eval daily_average= resp_time_sum/resp_time_count | .....
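For this to work, the hourly populating search has to write out both the per-hour sum and the per-hour count. A minimal sketch of such a search follows; the sourcetype and the resp_time field are assumptions about your data, not values defined in this manual:

sourcetype=access_combined | stats sum(resp_time) as hourly_resp_time_sum, count(resp_time) as hourly_resp_time_count

The daily search above then divides the summed totals rather than averaging the hourly averages, which is what makes the result a proper weighted average.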
Schedule the populating search to avoid data gaps and overlaps

Along with the above two rules, to minimize data gaps and overlaps you should also be sure to set appropriate intervals and delays in the schedules of searches you use to populate summary indexes.

Gaps in a summary index are periods of time when a summary index fails to index events. Gaps can occur if:
• splunkd goes down.
• the scheduled saved search (the one being summary indexed) takes too long to run and runs past the next scheduled run time. For example, if you were to schedule the search that populates the summary to run every 5 minutes when that search typically takes around 7 minutes to run, you would have problems, because the search won't run again when it's still running a preceding search.

Overlaps are events in a summary index (from the same search) that share the same timestamp. Overlapping events skew reports and statistics created from summary indexes. Overlaps can occur if you set the time range of a saved search to be longer than the frequency of the schedule of the search, or if you manually run summary indexing using the collect command.

Example of a summary index configuration

This example shows a configuration for a summary index of Web statistics as it might appear in savedsearches.conf. The keys listed below enable summary indexing for the saved search "Apache Method Summary" and append the field report with a value of "count by method" to every event going into the summary index.

# name of the saved search
[Apache Method Summary]
# sets the search to run at each search interval
counttype = always
# enable the search schedule
enableSched = 1
# search interval in cron notation (this means "every 5 minutes")
schedule = */5 * * * *
# id of user for saved search
userid = jsmith
# search string for summary index search
search = index=apache_raw startminutesago=30 endminutesago=25 | extract auto=false | stats count by method
# enable summary indexing
action.summary_index = 1
# name of summary index to which search results are added
action.summary_index._name = summary
# add these keys to each event
action.summary_index.report = "count by method"


Other configuration files affected by summary indexing

In addition to the settings you configure in savedsearches.conf, there are also settings for summary indexing in indexes.conf and alert_actions.conf. Indexes.conf specifies index configuration for the summary index. Alert_actions.conf controls the alert actions (including summary indexing) associated with saved searches.

Caution: Do not edit settings in alert_actions.conf without explicit instructions from Splunk staff.
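If you create a dedicated summary index, the corresponding indexes.conf stanza can be minimal. The following is a sketch only; the index name and paths are illustrative, and you must restart Splunk after editing indexes.conf:

[summary_web]
homePath = $SPLUNK_DB/summary_web/db
coldPath = $SPLUNK_DB/summary_web/colddb
thawedPath = $SPLUNK_DB/summary_web/thaweddb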