Splunk Search

How is search performance affected using tags vs host names?

Explorer

Suppose I have the following list of hosts and sourcetypes

  • hosts = h1, h2, ... h10
  • sourcetypes = s1, s2, ... s10

And the following tags have been configured

  • productionServers: host=h1, host=h2, host=h3, host=h4, host=h5
  • component1: host=h3

Will the following two searches take same amount of time or will the first search be slower?

tag=productionServers tag=component1 <some_query>
host=h3 <some_query>

SplunkTrust

When you run the search with tags, Splunk just replaces each tag with its values joined by OR conditions.

For example:

tag=productionServers tag=component1 is similar to (host=h1 OR host=h2 OR host=h3 ...) host=h3. So your first search has to evaluate more than one host term, and hence might take more time, depending on the events for each host.

You can verify this by looking at the Job Inspector; the search log will give you the exact search Splunk runs.
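The expansion described above can be sketched with a small Python model. This is only an illustration of the text substitution, not Splunk's implementation: the `TAGS` table mirrors the tags from the question, and `expand_tag` and `expand_search` are hypothetical helper names.

```python
# Toy model of tag expansion. The tag table mirrors the question;
# expand_tag and expand_search are hypothetical helpers, not Splunk APIs.
TAGS = {
    "productionServers": ["h1", "h2", "h3", "h4", "h5"],
    "component1": ["h3"],
}


def expand_tag(tag, tags=TAGS):
    """Return the host terms that a tag stands for, ORed together."""
    terms = [f"host={h}" for h in tags[tag]]
    if len(terms) == 1:
        return terms[0]
    return "( " + " OR ".join(terms) + " )"


def expand_search(search, tags=TAGS):
    """Replace every tag=<name> token in a search with its expanded host terms."""
    out = []
    for token in search.split():
        if token.startswith("tag="):
            out.append(expand_tag(token[len("tag="):], tags))
        else:
            out.append(token)
    return " ".join(out)


print(expand_search("tag=productionServers tag=component1 <some_query>"))
# ( host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5 ) host=h3 <some_query>
```

The printed string has the same shape as the expanded search that the Job Inspector reports for a tagged search.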

http://docs.splunk.com/Documentation/Splunk/6.2.0/Knowledge/ViewsearchjobpropertieswiththeJobInspect...

Explorer

I ran the two searches multiple times. The first one (using tags) took 60 seconds while the second one took 70 seconds, which is contrary to what was expected.


My question was: doesn't Splunk automatically reduce

(host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5 ) host=h3 

to

host=h3

?
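The logical equivalence behind this question can be checked with a short Python sketch. It assumes the ten hosts and the productionServers tag from the original post, and relies on the fact that adjacent search terms are implicitly ANDed. It only models which hosts pass the filter; it says nothing about whether Splunk rewrites the longer expression or how long either form takes to run.

```python
# Check, host by host, that the expanded search and host=h3 select the same hosts.
# This models the filter logic only, not how Splunk executes or optimizes a search.
HOSTS = [f"h{i}" for i in range(1, 11)]          # h1 .. h10, as in the question
PRODUCTION = {"h1", "h2", "h3", "h4", "h5"}      # hosts tagged productionServers


def expanded_filter(host):
    """(host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5) host=h3"""
    return (host in PRODUCTION) and (host == "h3")


def simple_filter(host):
    """host=h3"""
    return host == "h3"


matches_expanded = [h for h in HOSTS if expanded_filter(h)]
matches_simple = [h for h in HOSTS if simple_filter(h)]
print(matches_expanded == matches_simple)  # True
```

Both filters select only h3, so any difference in run time would come from how the search is evaluated, not from the set of matching hosts.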


In the search inspector, I can see normalizedSearch and remoteSearch.


For another search:

search tag=t1 tag=t2 sample_text calculated_field=calculated_value

both normalizedSearch and remoteSearch showed

litsearch ( ( t1 expanded ) ( t2 expanded ) sample_text ( "calculated_field"=calculated_value OR ( sourcetype=sourcetype1_of_calculated_field ) OR ( sourcetype=sourcetype2_of_calculated_field ) ) ) 
( ( ( ( host=h1 or host=h2 ..... ) ) OR host=h6 ) OR ( ( ( host=h3 or host=h4 ..... ) ) ) OR ( host=h5 ) ) 
| 
litsearch ( ( t1 expanded ) ( t2 expanded ) sample_text ( "calculated_field"=calculated_value OR ( sourcetype=sourcetype1_of_calculated_field ) OR ( sourcetype=sourcetype2_of_calculated_field ) ) ) 
( ( ( ( host=h1 or host=h2 ..... ) ) OR host=h6 ) OR ( ( ( host=h3 or host=h4 ..... ) ) ) OR ( host=h5 ) ) 
 | 
addinfo type=count label=prereport_events | fields keepcolorder=t "_time" "prestats_reserved_*" "psrsvd_*" | bin _time span=1d | prestats count by _time

The total number of host=h1 or host=h2 ..... terms in the normalizedSearch was over 15,000. Where is that additional list of 15,000 hosts coming from?


SplunkTrust

Why do you think Splunk would automatically reduce (host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5 ) host=h3 to host=h3 for you? Splunk just runs the search you have asked it to run.


Explorer

Not sure why but I thought that it would. 🙂

If I understood correctly, tags are meant to be used in situations like the ones I mentioned. If Splunk doesn't optimize queries, then anyone who wants fast queries shouldn't use tags, which makes tags less useful.


Community Manager

Hi @ranjithfs1

Based on the context of your post, you might find this app by @martin_mueller very interesting and useful 🙂
https://splunkbase.splunk.com/app/2871

Thanks @renjith.nair for being so helpful to other users here on Answers!

SplunkTrust

Tags are literally for tagging :-). They help you simplify your searches and group events. I'm not aware of any automatic performance improvement by Splunk when you use tags.

http://docs.splunk.com/Documentation/Splunk/6.2.0/knowledge/Abouttagsandaliases
