I am trying to determine how many searches run against a particular index per day.
I know how much data the index has, but I need to know if people are actually searching on that data or not.
Is there any way to get this information? The searches I have tried return only the search name or previously run search strings, and I cannot break those down by index from that information alone (a search string or saved search may rely on the user's default index, which will not appear in the search string).
For an individual job you can determine this from its search.log. Look for lines like this one:
07-14-2015 22:39:25.385 INFO IndexScopedSearch - 0000000003D1CA60 LISPY for index=main is lispy='[ AND ]' ct=2147483647 et=0 lt=2147483647 dbsize=6
Trouble is, these logs aren't indexed by default so you can't easily run searches against them. You could of course index them yourself, but keep an eye on extra volume - there can be lots and lots of search.log files!
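If you would rather not index the search.log files at all, a small script over them can do the counting. Here is a minimal Python sketch, assuming the LISPY line always looks like the sample above (the format may differ between Splunk versions). The function names are made up for illustration, and reading the files from disk is left to the caller.

```python
import re
from collections import Counter

# Pattern based only on the single sample line quoted above; other
# Splunk versions may log this differently (assumption).
LISPY_RE = re.compile(r"IndexScopedSearch - \S+ LISPY for index=(\S+) is lispy=")


def indexes_in_search_log(lines):
    """Return the set of index names found in the LISPY lines of one search.log."""
    found = set()
    for line in lines:
        match = LISPY_RE.search(line)
        if match:
            found.add(match.group(1))
    return found


def count_jobs_per_index(logs):
    """Count how many jobs touched each index.

    `logs` is an iterable with one entry per job, each entry being an
    iterable of that job's search.log lines. An index is counted once
    per job, no matter how many LISPY lines mention it.
    """
    counts = Counter()
    for lines in logs:
        counts.update(indexes_in_search_log(lines))
    return counts
```

Counting each index once per job keeps a single job that logs many LISPY lines from inflating the numbers.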
Maybe this one is not 100% accurate, but it is a starting point.
Search the _audit index for action=search events, filter out the saved searches and the typeahead and history ones, and you get a pretty nice count of the index= values used in the searches:
index=_audit action="search" search="*" NOT user="splunk-system-user" savedsearch_name="" NOT search="\'|history*" NOT search="\'typeahead*" | rex "index\=(?<myIndex>[^\s,|']*)" | stats count by myIndex
There is one problem with this search: it can return index=*, and then you would also need to check the user and their default search index. I haven't found another way to achieve it yet.
Hope this helps ...
This should be achievable with a sprinkling of | rest services/authorization/roles and | rest services/authentication/users: join the user to the audit event, join their roles to that, get the default and allowed indexes, and augment the audit events that don't have explicit indexes. That should at least fill all the gaps where the index is not calculated in a subsearch etc... but I'm going to bed 😛 so this is left as an exercise for the reader.
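To make the fill-in step a bit more concrete, here is a rough Python sketch of the logic only, not a Splunk search. The dicts are hypothetical stand-ins for what the audit events and the two rest calls would give you (user to roles, role to default indexes), and the regex is a slightly stricter variant of the rex above. It does not handle subsearches, macros, inherited roles, or expanding index=* against the allowed indexes.

```python
import re
from collections import Counter

# Stricter variant of the rex above: allows optional whitespace and an
# opening double quote, and requires at least one character in the name.
INDEX_RE = re.compile(r"""\bindex\s*=\s*"?([^\s,|'")]+)""")


def indexes_for_event(event, user_roles, role_default_indexes):
    """Return the indexes one audit event should be counted against.

    Explicit index= values win; otherwise fall back to the default
    indexes of every role the user holds (hypothetical input shapes).
    """
    explicit = INDEX_RE.findall(event["search"])
    if explicit:
        return set(explicit)
    defaults = set()
    for role in user_roles.get(event["user"], []):
        defaults.update(role_default_indexes.get(role, []))
    return defaults


def count_by_index(events, user_roles, role_default_indexes):
    """Count searches per index, filling gaps from role defaults."""
    counts = Counter()
    for event in events:
        counts.update(indexes_for_event(event, user_roles, role_default_indexes))
    return counts
```

A search with no index= term is counted against every default index of the user's roles, which is the same approximation described above.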
I don't think you really want to know whether the index was searched. I think you need to know whether any data was actually returned from the index. Someone running a "search index=*" that happens to include an inappropriate index is not something you want to count as "using" that index.
Seems like some of the data returned in the job inspector should be useful here.