When I ran a search spanning an entire year it took 241 seconds. If I immediately rerun the search the time plummets to ~60 seconds. Why? Is this a Splunk or Disk optimization?
Background:
hot/warm sit on fast disk.
colddb resides on a slower, larger disk.
Regardless of the search I run, the first time the data is read the reply is always slower. When I rerun the exact same search over the exact same disks, the times drop considerably. Who's responsible (who can I thank)? Splunk or the disks? And is it that simple, or is it more complex? I understand that searching back into colddb disks means slower retrieval than warm/hotdb. The question is more of a lower-level, backend one, but it's one I want to share with my user base when I advise them on how to tune their searches and what will happen when they rerun a search.
I've looked through a lot of the Answers and on Splunk's site but can't really find the answer. This group is outstanding, so I'm leaning on you. Any insight is appreciated.
pstein
Splunk caches search results for a set period of time (configurable by the admin).
So if you run the exact same search while the results are still cached, the search will return results much faster the second time.
But the search must truly be identical.
It's not exactly for a set period of time, it's related to your user (and/or role) disk quota - https://docs.splunk.com/Documentation/Splunk/8.0.1/Admin/authorizeconf (though search results expiration time does factor in)
I've had different experiences - so long as the search is truly identical
Which it is/was. Identical search run repeatedly produced different execution times.
Did the second run take zero seconds? If not, then it didn't reuse the previous search's results from dispatch, which is where the disk quota settings apply.
No. It didn't take zero time to execute.
First/Original run: 6,273,099 events in 142.291 seconds
Second run: 6,273,099 events in 56.79 seconds.
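To put the two runs side by side, here is a rough Python sketch that turns the reported figures into a speedup ratio and an events-per-second rate. The function names are made up for illustration; only the event count and the two timings come from the runs above. It doesn't say where the speedup comes from. It only shows that the second run still processed events at a finite rate rather than returning instantly, which fits the point that the earlier results were not simply reused from dispatch.

```python
# Figures copied from the two runs reported above.
EVENTS = 6_273_099
FIRST_RUN_S = 142.291
SECOND_RUN_S = 56.79


def events_per_second(events: int, seconds: float) -> float:
    """Average scan rate for one run (hypothetical helper)."""
    return events / seconds


def speedup(first_s: float, second_s: float) -> float:
    """How many times faster the rerun was than the first run."""
    return first_s / second_s


if __name__ == "__main__":
    print(f"first run : {events_per_second(EVENTS, FIRST_RUN_S):,.0f} events/s")
    print(f"second run: {events_per_second(EVENTS, SECOND_RUN_S):,.0f} events/s")
    print(f"speedup   : {speedup(FIRST_RUN_S, SECOND_RUN_S):.2f}x")
```

With these numbers the rerun comes out roughly 2.5 times faster, at about 110,000 events per second versus about 44,000 on the first run.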