With the simplest search:
index=checkpoint action=accept | head 1
The normalizedSearch (under Job Inspect, 8.34s) is:
litsearch index=checkpoint ( ( ( sourcetype=opsec_audit ) AND ( ( ( ( ( ( sourcetype=WinRegistry ) AND ( ( registry_type=accept ) ) ) OR ( ( sourcetype=fs_notification ) AND ( ( action=accept ) ) ) ) OR ( vendor_action=accept ) ) ) ) ) ) OR ( ( ( ( sourcetype=fe_json ) AND ( ( "alert.action"=accept ) ) ) OR ( ( sourcetype=fe_xml ) AND ( ( "alerts.alert.action"=accept ) ) ) OR ( ( source="/nsm/bro/logs/current/notice.log" ) AND ( ( EXTRA_FIELD_18=accept ) ) ) ) OR ( action=accept ) ) | litsearch index=checkpoint action=accept | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "source" "sourcetype" "splunk_server" | prehead limit=1 null=false keeplast=false
A slight modification of the search to put the field search after the first pipe makes the junk go away:
index=checkpoint accept | search action=accept | head 1
The normalizedSearch (under Job Inspect, 3.1s) is now:
litsearch index=checkpoint accept | search action=accept | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "source" "sourcetype" "splunk_server" | prehead limit=1 null=false keeplast=false
This is true even when the value "accept" is not before the first pipe.
Why does Splunk insert junk into the normalized search when a field search comes before the first pipe? The junk increases search time, and in some cases where "NOT" or "!" is used it can return "no results".
Splunk Version: 6.2.3
Splunk Build: 264376
Current App: Search & Reporting
The "junk" is an expansion of reverse lookups that happens because of various CIM-compliant TAs you have installed. I'll explain the concept of reverse lookups, but first let's go back to automatic lookups. Say you have a sourcetype bob defined with an automatic lookup. In props.conf:
[bob]
LOOKUP-actions = boblookup someinputfield OUTPUT action
And in boblookup.csv you have:
someinputfield,action
potato,deny
tomato,accept
blueberry,accept
Now your normal expectation is that when you search on sourcetype=bob, events will be matched against the lookup and gain a new field named action whenever someinputfield has a value of "tomato", "potato", or "blueberry". But this same thing can be applied in reverse too.
If you search on action=accept, then Splunk can look through all of its config files and reason out something like this:
Sourcetype bob has a lookup that outputs a field named action based on this CSV file. I see here in the CSV file that action=accept is returned whenever someinputfield=blueberry or someinputfield=tomato. So there is an equivalency here:
( sourcetype = bob AND ( someinputfield = blueberry OR someinputfield = tomato ) )
This is the fundamental step of a reverse lookup - the goal is to attempt to make automatic lookup fields searchable. This is a necessary evil for CIM-compliant apps like Enterprise Security because of how often they use automatic lookups to normalize field names and values.
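To make that reasoning concrete, here is a rough Python sketch of the reverse step. It is not Splunk's actual implementation: the function name reverse_lookup and its parameters are made up for illustration, and it only handles a CSV lookup with one input field and one output field.

```python
import csv
import io

# Contents of the example boblookup.csv from this answer.
BOBLOOKUP_CSV = """someinputfield,action
potato,deny
tomato,accept
blueberry,accept
"""


def reverse_lookup(sourcetype, csv_text, input_field, output_field, wanted):
    """Build a search expression equivalent to output_field=wanted.

    Hypothetical helper for illustration only; returns None when no row
    of the lookup produces the wanted output value.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    # Collect every input value whose row yields the wanted output value.
    inputs = [row[input_field] for row in rows if row[output_field] == wanted]
    if not inputs:
        return None
    alternatives = " OR ".join(f"{input_field} = {value}" for value in inputs)
    return f"( sourcetype = {sourcetype} AND ( {alternatives} ) )"


expanded = reverse_lookup("bob", BOBLOOKUP_CSV, "someinputfield", "action", "accept")
print(expanded)
```

For the example CSV this produces the same equivalency as above, with the values in CSV row order (tomato first, then blueberry). In the normalizedSearch from the question, each sourcetype with such a lookup contributes one clause of this shape, and the clauses are ORed together along with the literal action=accept.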
There's a whole longer discussion here about the performance impact of this. While it made your example slower, there are many counterexamples where this approach (up to a point) speeds things up.