We need to decide soon how much storage to allocate to the hot/warm volume versus the cold one. Therefore, I would like to generate a report that shows, for each index, how far back in time our searches reach.
Is it possible?
It's possible, but not so straightforward.
Look at all your saved searches and alerts via | rest, like this:
| rest /servicesNS/-/-/saved/searches | table search dispatch.earliest_time dispatch.latest_time
Now you have the time window for your searches and the search syntax.
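To compare those time windows, the dispatch.earliest_time values (for example -24h@h or -7d@d) have to be turned into a number. Here is a minimal sketch in Python, meant to run outside Splunk on the exported table. The function name lookback_seconds is made up for illustration. It only handles simple offsets with the short unit forms, ignores the @ snap, and approximates months and years, so treat it as a starting point.

```python
import re

# Seconds per unit for the short relative-time units. Months and years are
# approximated as 30 and 365 days, which is an assumption of this sketch.
UNIT_SECONDS = {
    "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800,
    "mon": 30 * 86400, "y": 365 * 86400,
}

# Matches values such as "-24h", "-7d@d" or "-1mon@mon".
# The "@..." snap-to part is accepted but ignored.
RELATIVE = re.compile(r"^-(\d+)(mon|[smhdwy])(?:@\w+)?$")

def lookback_seconds(earliest):
    """Return the lookback in seconds, or None if the value is not a
    simple relative offset (for example "0", "rt-5m" or an epoch time)."""
    match = RELATIVE.match(earliest.strip())
    if match is None:
        return None
    amount, unit = match.groups()
    return int(amount) * UNIT_SECONDS[unit]
```

Values that come back as None (all-time searches, real-time windows, absolute times) need to be looked at by hand.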
As often happens, not all searches use index=<something>. Some use index IN (a b c), some use index=a OR index=b, and some don't specify an index at all, in which case you have to go to the roles to figure out what the default indexes are for that role.
In any case, it's not always easy to rex it out.
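To illustrate the three forms mentioned above, here is a rough Python sketch that pulls index names out of a search string once the searches have been exported. The function name searched_indexes is hypothetical, and the patterns are assumptions that cover only index=<name>, index="<name>" and index IN (...). Macros, subsearches and eventtypes that hide an index are not handled.

```python
import re

# index=a, index = a, index="web-prod", index=web*
EQUALS = re.compile(r'\bindex\s*=\s*"?([\w*-]+)"?')
# index IN (a, b, c) or index IN(a b c)
IN_LIST = re.compile(r'\bindex\s+IN\s*\(([^)]*)\)')

def searched_indexes(search):
    """Return the index names found in a search string, in order, without duplicates."""
    names = EQUALS.findall(search)
    for group in IN_LIST.findall(search):
        # Split the list on commas and/or whitespace and drop optional quotes.
        tokens = re.split(r"[\s,]+", group.strip())
        names.extend(token.strip('"') for token in tokens if token)
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    return list(dict.fromkeys(names))
```

An empty result means the search names no index explicitly, so the role's default indexes apply.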
Now, after you have done that, you can search the internal indexes (such as _audit) to look at ad hoc searches.
You probably also want to check your dashboards and how their panels and base searches are defined.
I would recommend focusing on what you need and considering the following: do you have fast disk for hot/warm?
How much data do you ingest daily?
According to surveys, for most users about 95% of searches run against data ingested in the last 24 hours.
Follow this guideline, and I am pretty sure you will be safe.
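If you would rather check that guideline against your own numbers, the arithmetic can be sketched in Python as below. Everything here is an assumption for illustration: the function names are made up, the lookbacks are whatever you collected from your searches (in days), and disk_ratio (disk used per GB of raw data ingested) is a placeholder that you should replace with a value measured on your own indexes.

```python
import math

def days_covering(lookbacks_days, fraction=0.95):
    """Smallest observed lookback (in days) that covers the given fraction of searches."""
    if not lookbacks_days:
        raise ValueError("need at least one observed lookback")
    ordered = sorted(lookbacks_days)
    # Nearest-rank percentile: take the value at 1-based position ceil(fraction * n).
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]

def hot_warm_gb(daily_ingest_gb, retention_days, disk_ratio=0.5):
    """Rough hot/warm size: daily ingest x days kept x on-disk ratio.
    The default disk_ratio of 0.5 is an assumed placeholder, not a measured value."""
    return daily_ingest_gb * retention_days * disk_ratio
```

For example, with lookbacks of [1, 1, 1, 1, 1, 1, 1, 7, 7, 30] days, days_covering returns 30 at the default 95% and 1 at 50%, which shows how much a few long-range searches can drive the hot/warm size.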
Sure you can! I built this as a fun exercise, so no promises that it perfectly meets your needs (which is why I'm not posting it as an answer), but check this out. My goal was to show the count of searches by the earliest day they reach back to, split by index name. Hopefully it helps get you started. You'll definitely want to set the Y axis to a "Log" scale.
index=_audit action=search info=granted search=* NOT "search_id='scheduler" NOT "search='|history" NOT "user=splunk-system-user" NOT "search='typeahead" NOT "search='| metadata type=* | search totalCount>0"
| rex max_match=0 "index\s*=\s*(?<Searched_Indexes>\w+)"
| mvexpand Searched_Indexes
| table _time Searched_Indexes apiStartTime apiEndTime
| where isnotnull(Searched_Indexes)
| eval query_start = strptime(apiStartTime, "'%c'")
| eval _time = query_start
| timechart useother=false limit=0 count by Searched_Indexes
The vast majority of searches (99% on average) reach back no further than the past week or so.
If you want to limit the chart to a fixed window, add this at the end:
| where _time > now() - 86400*30
That keeps only the past 30 days.