Which of these would be the most efficient/fast/best way to start filtering for a search?
index=foo | ...
or
source="/var/log/bar/baz.log" | ...
or
index=foo source="/var/log/bar/baz.log" | ...
We're going to have an index that will have several **/*.log sources, each with similar but unique data formats. We'll always know the data source and index for these queries. I'm wondering what the best way to start my queries is.
index=foo source="/var/log/bar/baz.log" | ...
http://docs.splunk.com/Documentation/Splunk/6.5.2/Search/Writebettersearches
From the documentation
Restrict your search to the specific host, index, source, source type, or Splunk server whenever possible. Read more about using fields in your searches in the next section.
It still isn't clear to me whether specifying both helps at all over specifying just the most specific field, which would be source in my case. I was thinking Splunk might already know that this source exists only in this index and optimize for it, or that it already indexes the sources. I guess I'd have to profile using just the source vs. the index and the source to be sure. But thanks for the info.
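One rough way to profile this, offered as a sketch rather than a definitive benchmark: run both variants over the same fixed time range with the same trailing command, then compare what the Job Inspector reports for each job (run time and number of events scanned). The `earliest=-24h` window and the `| stats count` tail are arbitrary choices for illustration, not anything from the original queries.

```
source="/var/log/bar/baz.log" earliest=-24h | stats count

index=foo source="/var/log/bar/baz.log" earliest=-24h | stats count
```

Keep in mind that a search with no `index=` term only looks in the default indexes configured for your role, so the source-only variant may return fewer events (or none) if `foo` is not among them. If the two counts differ, the timings are not comparable.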
In this case, having the extra term is certainly not going to hurt, but really, you ALWAYS want to specify the index, because then Splunk does not have to look ANYWHERE ELSE. Giving it the source as well helps it narrow things down further.
Splunk would almost certainly have figured out, after a glance at the summary stats for all the other indexes, that none of them contain that source. But why make it go to even that meager effort?
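If you want to confirm for yourself that the source really lives in only one index, a minimal sketch using `tstats` (which reads indexed fields such as `index` and `source` without retrieving raw events) might look like this. The `index=*` term is an assumption here: it covers the non-internal indexes your role is allowed to search.

```
| tstats count where index=* source="/var/log/bar/baz.log" by index
```

If only `foo` comes back, then adding `index=foo` to the search narrows nothing in terms of results and simply spares Splunk the check.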
OK, makes sense. I was just being paranoid about writing the shortest, clearest, most concise query possible, and wanted to make sure that specifying both source and index wouldn't cause Splunk to do extra work.
Good goal. In this case it's quite the reverse, I think. If you can limit the search to a single index, or a limited set of them, then you'll (in theory) save Splunk a bit of time in search parsing. Overall run time is unlikely to be affected much, at least in my somewhat limited experience.