Suppose I have the following list of hosts and sourcetypes
And the following tags have been configured
Will the following two searches take the same amount of time, or will the first search be slower?
tag=productionServers tag=component1 <some_query>
host=h3 <some_query>
When you run a search with tags, Splunk simply replaces each tag with an OR condition over the values it covers.
tag=productionServers tag=component1 is similar to (host=h1 OR host=h2 OR host=h3 ...) host=h3. So your first search looks at more than one host and brings back more events, and hence might take more time, depending on the number of events for each host.
You can verify this by looking at the Job Inspector; the search log will show you the exact search Splunk runs.
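To make the expansion described above concrete, here is a minimal Python sketch of a toy model. It is not Splunk's implementation. The tag-to-host mapping in TAGS and the search term "error" are hypothetical, since the actual tag configuration from the question is not shown.

```python
# Toy model of tag expansion; NOT Splunk's actual implementation.
# Hypothetical tag configuration (the real one from the question is not shown).
TAGS = {
    "productionServers": [("host", "h1"), ("host", "h2"), ("host", "h3")],
    "component1": [("host", "h3")],
}


def expand_tag(tag):
    """Return the OR clause that a tag=<name> term stands for in this toy model."""
    terms = [f"{field}={value}" for field, value in TAGS[tag]]
    clause = " OR ".join(terms)
    # Parenthesize only when there is more than one alternative.
    return f"({clause})" if len(terms) > 1 else clause


def expand_search(search):
    """Replace every tag=<name> token with its expanded clause."""
    out = []
    for token in search.split():
        if token.startswith("tag="):
            out.append(expand_tag(token[len("tag="):]))
        else:
            out.append(token)
    return " ".join(out)


# "error" is a placeholder for the rest of the query.
expanded = expand_search("tag=productionServers tag=component1 error")
print(expanded)
```

The printed string is the kind of expanded search you would then compare against what the Job Inspector reports for the real search.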
I ran the two searches multiple times. The first one (using tags) took 60 seconds while the second one took 70 seconds which is contrary to what is expected.
My question was: doesn't Splunk automatically reduce
(host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5 ) host=h3
to host=h3?
In the search inspector, I can see normalizedSearch and remoteSearch.
For another search:
search tag=t1 tag=t2 sample_text calculated_field=calculated_value
both normalizedSearch and remoteSearch showed
litsearch ( ( t1 expanded ) ( t2 expanded ) sample_text ( "calculated_field"=calculated_value OR ( sourcetype=sourcetype1_of_calculated_field ) OR ( sourcetype=sourcetype2_of_calculated_field ) ) ) ( ( ( ( host=h1 or host=h2 ..... ) ) OR host=h6 ) OR ( ( ( host=h3 or host=h4 ..... ) ) ) OR ( host=h5 ) ) | litsearch ( ( t1 expanded ) ( t2 expanded ) sample_text ( "calculated_field"=calculated_value OR ( sourcetype=sourcetype1_of_calculated_field ) OR ( sourcetype=sourcetype2_of_calculated_field ) ) ) ( ( ( ( host=h1 or host=h2 ..... ) ) OR host=h6 ) OR ( ( ( host=h3 or host=h4 ..... ) ) ) OR ( host=h5 ) ) | addinfo type=count label=prereport_events | fields keepcolorder=t "_time" "prestats_reserved_*" "psrsvd_*" | bin _time span=1d | prestats count by _time
The total number of host=h1 or host=h2 ..... terms in the normalizedSearch was over 15000. Where is that additional list of over 15000 hosts coming from?
Why do you think Splunk would automatically reduce (host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5 ) host=h3 to host=h3 for you? Splunk just runs the search you have asked it to run.
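For what it's worth, the two forms are logically equivalent (the absorption law: (A OR B) AND A is the same as A), so they should match the same events. Whether Splunk rewrites one into the other before running it is a separate question. A small Python check of the equivalence, using made-up events and hypothetical helper names:

```python
# Toy check of the Boolean equivalence only; says nothing about Splunk internals.
HOSTS = ["h1", "h2", "h3", "h4", "h5"]

# Made-up events: two per host.
events = [{"host": h, "n": i} for i, h in enumerate(HOSTS * 2)]


def long_form(event):
    """(host=h1 OR host=h2 OR host=h3 OR host=h4 OR host=h5) host=h3"""
    return event["host"] in set(HOSTS) and event["host"] == "h3"


def short_form(event):
    """host=h3"""
    return event["host"] == "h3"


long_matches = [e for e in events if long_form(e)]
short_matches = [e for e in events if short_form(e)]

# Both filters select exactly the same events.
print(long_matches == short_matches)
```

So the results are the same either way; only the amount of work done to get them can differ.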
Not sure why but I thought that it would. 🙂
If I understood correctly, tags are meant to be used in situations like the ones I mentioned. If Splunk doesn't optimize queries, then anyone who wants queries to run fast shouldn't use tags, which makes tags less useful.
Based on the context of your post, you might find this app by @martin_mueller very interesting and useful 🙂
Thanks @renjith.nair for being so helpful to other users here on Answers!
Tags are literally for tagging :-). They help you simplify your searches and group events. I'm not aware of any automatic performance improvement by Splunk when you use tags.