Splunk Search

How to find out how long it took the Search Job Inspector to search each bucket?

Explorer

After running a search, I can view the searchTotalBucketCount in the Job Inspector (Inspect Job).

I need to find out how long it took to search each bucket. The only time available is the TotalRunDuration.

Re: How to find out how long it took the Search Job Inspector to search each bucket?

Esteemed Legend

Why do you need to know this?

Re: How to find out how long it took the Search Job Inspector to search each bucket?

Explorer

It is part of one of our performance requirements for rare searches.

As the Splunk documentation says: "Rare search: Similar to a super-sparse search, but receives assistance from bloom filters, which help eliminate index buckets that do not match the search request. Rare searches return results anywhere from 20 to 100 times faster than does a super-sparse search." The listed throughput is from 10 to 50 index buckets per second, and the search is I/O bound.

Re: How to find out how long it took the Search Job Inspector to search each bucket?

SplunkTrust

The average time to search each bucket is calculated as the total search time divided by the number of buckets searched.
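As a rough illustration of that division, here is a minimal Python sketch. The function name and the sample numbers are made up for the example, and the two inputs are assumed to be copied by hand from the Job Inspector (the run duration in seconds and the bucket count mentioned in the question). Since the run duration covers the whole job and not just bucket scanning, treat the result as an average, not a per-bucket measurement.

```python
from typing import Tuple


def bucket_search_rate(total_run_duration_s: float, total_bucket_count: int) -> Tuple[float, float]:
    """Return (average seconds per bucket, buckets per second) for one search job.

    Both arguments are assumed to be read manually from the Job Inspector.
    """
    if total_bucket_count <= 0:
        raise ValueError("total_bucket_count must be positive")
    if total_run_duration_s <= 0:
        raise ValueError("total_run_duration_s must be positive")
    avg_seconds_per_bucket = total_run_duration_s / total_bucket_count
    buckets_per_second = total_bucket_count / total_run_duration_s
    return avg_seconds_per_bucket, buckets_per_second


# Hypothetical job: 250 buckets searched in 12.5 seconds.
avg_s, rate = bucket_search_rate(12.5, 250)
print(f"{avg_s:.3f} s per bucket, {rate:.1f} buckets per second")
```

The buckets-per-second figure is the one that can be compared with the "10 to 50 index buckets per second" range quoted from the documentation.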

Regardless of that, it sounds like your organization has chosen to focus on the wrong thing.

Your organization needs to understand that "rare search" is a description of the results, not an actual type of search. You can't check a box called "rare" to get better results, and you can't know whether a search will qualify as rare until you get the results.

If your database -- of FRUIT SHIPMENT events -- consists mostly of apples and oranges and a few mangos, but occasionally there is a durian and once or twice a year there is a rambutan, then a search that is otherwise exactly the same can be dense, sparse, very sparse or rare, depending on which fruit you are asking about. The only way to know WHICH "type" a search might have been is to inspect how that search ended up processing the request.

In the case of the search for rare rambutans, Splunk was able to see from its bloom filters that rambutans appeared only in one week in February and one week in September, and that they came in on shipments from Malaysia. After that, it went to those weeks, checked every shipment from Malaysia, and then looked through every single fruit in those shipments to see if it was a rambutan.

The durian events happened in most weeks, and came from a bunch of countries, so the bloom filters didn't save nearly as much time as they saved for the rambutan. Very few results, very sparse search.

A few mangos came in pretty much every day, from shipments from all over the world, so everything had to be looked at to pull out the mangos. There weren't many mangos, though, so per result, this sparse search was a lot of work.

Apples, on the other hand, were coming in every single hour of every day, so the search returned lots of events per unit time, because there were lots of matching events everywhere Splunk looked. The dense search still took the same amount of time to look through each shipment of fruit, but it returned lots of answers in that time, because there were lots of qualifying events.
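To make the bucket elimination in the fruit example concrete, here is a toy Bloom filter in Python. This is only a sketch of the general technique, not Splunk's actual implementation: the class, the filter size, the hash scheme, and the weekly "buckets" are all invented for illustration. The point it shows is that a filter can say "definitely not in this bucket" (skip it) or "possibly in this bucket" (go look), which is why a term that occurs in few buckets is cheap to search for and a term that occurs everywhere is not.

```python
import hashlib
from typing import Dict, Iterable, List


class ToyBloomFilter:
    """A tiny Bloom filter: answers "definitely absent" or "possibly present"."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term: str) -> Iterable[int]:
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term: str) -> None:
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term: str) -> bool:
        # False means the term was never added; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def build_filters(buckets: Dict[str, List[str]]) -> Dict[str, ToyBloomFilter]:
    """Build one filter per bucket from the terms that bucket contains."""
    filters = {}
    for name, terms in buckets.items():
        bloom = ToyBloomFilter()
        for term in terms:
            bloom.add(term)
        filters[name] = bloom
    return filters


def candidate_buckets(filters: Dict[str, ToyBloomFilter], term: str) -> List[str]:
    """Return the buckets that still have to be opened and scanned for term."""
    return [name for name, bloom in filters.items() if bloom.might_contain(term)]


# Made-up weekly buckets of fruit shipments.
shipments = {
    "week_01": ["apple", "orange", "mango"],
    "week_07": ["apple", "orange", "rambutan"],
    "week_20": ["apple", "durian", "mango"],
    "week_38": ["orange", "rambutan", "mango"],
}
filters = build_filters(shipments)
rambutan_weeks = candidate_buckets(filters, "rambutan")
apple_weeks = candidate_buckets(filters, "apple")
print("scan for rambutan:", rambutan_weeks)
print("scan for apple:", apple_weeks)
```

The rambutan list always includes the two weeks that really contain rambutans, and other weeks appear only if the filter gives a false positive. The apple list includes every week that contains apples, so the filter saves little for the dense case.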

When your organization is trying to use "search type" for planning purposes -- or, worse, trying to baseline and track performance metrics based on the way the organization has previously labeled a search -- then it is really missing the mark.

The key is designing each search to be as efficient as it can be while still meeting its function. If you can take advantage of high-cardinality indexed fields to make use of the bloom filters, then by all means do. Likewise, when any particular search seems to be taking a lot of time relative to its results, it is a candidate for refactoring, acceleration, summary indexing, or any of a dozen other techniques to increase efficiency.
