This is mostly just a curiosity, motivated by this post on how to compare a particular time interval across multiple larger time periods. Effectively the solution seems to be to generate a list of time intervals and run map subsearches on each entry.
When I have multiple time periods that I'd like to run stats on, I typically use a multisearch command followed by a chart, as follows:
| multisearch [ search index=potato earliest=<et1> latest=<lt1> | eval series=1 ]
[ search index=potato earliest=<et2> latest=<lt2> | eval series=2 ]
.
.
.
[ search index=potato earliest=<etn> latest=<ltn> | eval series=<n> ]
| timechart count by series
I suppose you could make it work by substituting the et's and lt's via subsearch, but it won't work if the number of time intervals, n, is also dynamically generated by some prior search.
I know you can use a number of different techniques, but they all have different drawbacks.
At this point, would you just have to resort to REST to schedule searches? How would we tie the data together? I'm not very familiar with what is possible with REST as all of my experience is with just plain SPL.
In short, how do we stream events across multiple, dynamically generated time intervals without running into subsearch limitations?
You can generate multiple ranges with a subsearch. For example:
index=_internal [| makeresults count=10
| streamstats count as cnt
| eval earliest=now()-cnt*60
| eval latest=now()-cnt*60+15
| table earliest latest ]
| timechart span=10s count
The only problem here is that you can't easily order the series. You can probably get around it somehow, but it's not that straightforward.
The upside is that it uses just one base search and one subsearch.
I meant to accept your post I replied to as solution, but accidentally hit my reply instead. Can I fix this?
You can click "not a solution" and click other post as a solution. But no worries 🙂
Just wanted to come back to this in case anyone else reads this.
Although this solution works, it doesn't appear to be much faster than searching the entire timeframe between the absolute earliest and absolute latest.
For instance, if you were searching a year but only needed to sample a few sparse holidays, I believe the performance is closer to searching the whole year than running separate searches for the holidays only.
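For reference, the sparse-holiday case might be written roughly like this. It is only a sketch: the two dates are made-up examples, index=potato is the placeholder index from the original question, and strptime is assumed to interpret the dates in the searching user's time zone.

```spl
index=potato [| makeresults
    | eval day=split("2023-12-25,2024-01-01", ",")
    | mvexpand day
    | eval earliest=strptime(day, "%Y-%m-%d")
    | eval latest=earliest+86400
    | table earliest latest ]
| bin _time span=1d
| stats count by _time
```

Each row returned by the subsearch becomes one earliest/latest pair, so adding a holiday only means adding a date to the list.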
That is possible. You would see the most impact if the combined time segments let Splunk skip whole buckets rather than scan the whole time range. In that case there would be much less I/O overhead, memory use, open files, and so on. With a search like "(earliest=-10m latest=-9m) OR (earliest=-8m latest=-7m)" vs. "(earliest=-10m latest=-5m)", I wouldn't expect much improvement. It also depends on the data and the search itself, so YMMV.
That's wild. I always assumed earliest and latest were static parameters, and nothing in the documentation seems to suggest you can do something like (earliest=X AND latest=Y) OR (earliest=A AND latest=B).
Is there some general concept I am missing on how Splunk parses parameters which should make this obvious? Or is it just specific to this one thing?
As you mentioned, you lose the series labeling you get for free with multisearch, but you could patch this up with an eval, although I'd imagine the performance suffers. Still, this is a great, relatively simple solution!
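A rough sketch of that eval patch, applied to the makeresults example from the accepted answer. Because each range there starts cnt*60 seconds before now() and lasts 15 seconds, the range number can be recomputed from _time. This assumes now() evaluates to the same value in the subsearch and in the outer search, which I have not verified.

```spl
index=_internal [| makeresults count=10
    | streamstats count as cnt
    | eval earliest=now()-cnt*60
    | eval latest=now()-cnt*60+15
    | table earliest latest ]
| eval series=ceiling((now()-_time)/60)
| timechart span=10s count by series
```

For ranges that don't follow a formula like this, the label would need a case() over the range boundaries or a lookup, which is where I'd expect the extra cost to come from.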
To be honest, I was a bit surprised myself. Considering that _time is treated a bit differently during search I also assumed at first that you can have just one fixed range for your search. But then I thought, as my colleague used to say - try and see. So I tried and saw 🙂
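If anyone wants to "try and see" for themselves, one way is to run the subsearch on its own and append the format command, which is what Splunk applies implicitly to subsearch results. Treat this as a sketch: the exact quoting in the output may differ between versions.

```spl
| makeresults count=3
| streamstats count as cnt
| eval earliest=now()-cnt*60
| eval latest=now()-cnt*60+15
| table earliest latest
| format
```

The result is a single row with a search field containing the ranges joined by OR, which is the string that gets substituted into the outer search.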
The number of searches might not seem important at first, but if you have more of those ranges you might hit your server's limits.