I have an accelerated search that is set for a 3-month time range.
The acceleration works: I can get a whole day's logs from the past in an average of 10 seconds, where it would take forever otherwise.
I need to be able to see the data for all occurrences of the same day of the week. But since you can't specify a time range before an accelerated search query, you can't use "date_wday=Thursday".
And doing this:
| savedsearch "mysavedsearchname" | search date_wday=Thursday
won't help, since it forces the acceleration to fetch all the records for the whole week so as to filter them afterward. This again results in an extremely lengthy search. My experiments show that the time the acceleration takes increases exponentially with the time range you are looking at. Here is a little table to give you an idea of what I mean:
Days search time
So, as I need to look at all the Thursdays of the last 6 weeks, I end up with a search that takes more than an hour to complete.
Any suggestion on how to get this working would be much appreciated.
Here is my answer regarding martin_mueller & lguinn's requests for the exact searches:
Actually, I disagree. The principle should be applicable to any search; it should not depend on my specific one. I have a search that is accelerated. I want to get the accelerated data only for a specific day of the week, say Thursday (meaning all the Thursdays), over the past 6 weeks. And as I said earlier, the way I understand accelerated searches, you can't do this without scanning the whole 6 weeks' worth of data. Unfortunately, this negates the value of accelerated reports.
But just to make you happier:
Accelerated search (3 months) - Name: accmetricps4createaccountallhistory
index=apache uri="*/user/accounts.json" method=POST | bin _time span=1m | rex field=_raw "(?
Six weeks of "expected" data for the next day of the week:
| savedsearch accmetricps4createaccountallhistory
[search earliest=-1s | head 1 | eval date_wday=strftime(relative_time(now(), "+1d@d"), "%A") | fields date_wday | format]
| eval lat=round(Latency,2) | eval tot=round(Total) | eval succ=round(100-(Fail/Total*100),1) | eval _time=strptime(strftime(relative_time(now(), "+1d@d"), "%m/%d/%Y").strftime(_time,":%H:%M:%S"), "%m/%d/%Y:%H:%M:%S") | bucket _time span=1h
| stats max(lat) as LATENCYMAX100, perc99(lat) as LATENCYMAX99, perc98(lat) as LATENCYMAX98, perc97(lat) as LATENCYMAX97, perc95(lat) as LATENCYMAX95, perc90(lat) as LATENCYMAX90, perc80(lat) as LATENCYMAX80, perc70(lat) as LATENCYMAX70, perc30(lat) as LATENCYMIN30, perc20(lat) as LATENCYMIN20, perc10(lat) as LATENCYMIN10, perc5(lat) as LATENCYMIN5, perc3(lat) as LATENCYMIN3, perc2(lat) as LATENCYMIN2, perc1(lat) as LATENCYMIN1, min(lat) as LATENCYMIN0, stdevp(lat) as LATENCYSTDDEV, max(tot) as TOTALMAX100, perc99(tot) as TOTALMAX99, perc98(tot) as TOTALMAX98, perc97(tot) as TOTALMAX97, perc95(tot) as TOTALMAX95, perc90(tot) as TOTALMAX90, perc80(tot) as TOTALMAX80, perc70(tot) as TOTALMAX70, perc30(tot) as TOTALMIN30, perc20(tot) as TOTALMIN20, perc10(tot) as TOTALMIN10, perc5(tot) as TOTALMIN5, perc3(tot) as TOTALMIN3, perc2(tot) as TOTALMIN2, perc1(tot) as TOTALMIN1, min(tot) as TOTALMIN0, stdevp(tot) as TOTALSTDDEV, max(succ) as SUCCESSMAX100, perc99(succ) as SUCCESSMAX99, perc98(succ) as SUCCESSMAX98, perc97(succ) as SUCCESSMAX97, perc95(succ) as SUCCESSMAX95, perc90(succ) as SUCCESSMAX90, perc80(succ) as SUCCESSMAX80, perc70(succ) as SUCCESSMAX70, perc30(succ) as SUCCESSMIN30, perc20(succ) as SUCCESSMIN20, perc10(succ) as SUCCESSMIN10, perc5(succ) as SUCCESSMIN5, perc3(succ) as SUCCESSMIN3, perc2(succ) as SUCCESSMIN2, perc1(succ) as SUCCESSMIN1, min(succ) as SUCCESSMIN0, stdevp(succ) as SUCCESSSTDDEV by _time | collect marker="bwmetricps4createaccountall_expected"
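For reference, the subsearch on the second line is what builds the weekday filter: over a 1-second window it computes tomorrow's weekday name, and format renders it as a search clause applied to the savedsearch output. The expanded filter looks roughly like this (assuming tomorrow is a Thursday):

( ( date_wday="Thursday" ) )

Note that this filter is applied after the saved search has produced its results, which is exactly why the acceleration still has to cover the whole time range.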
This "expected" search takes almost 2 hours to complete. However, I have devised a new technique that doesn't use the accelerated reports and still gets me the same results in 20 to 30 minutes.
But I would still like to know if there is something I am missing here. Thank you very much for your interest and suggestions.
After numerous experiments I found a technique that solves this problem.
1. Create your accelerated search
2. Set the acceleration's summary range to cover the amount of time that you know you will look back
3. Create another saved search whose query is exactly the same as that of the accelerated search, except that you can add subsequent data processing. For example, add something like | collect marker="eventshistoryd1" to write the data to the summary index.
4. For this secondary search, set the time range to something relative, such as: Start time: -7d@d, End time: -6d@d
5. Do NOT check "Accelerate this search"
6. Schedule your search so that it runs close to midnight.
7. Create other saved searches for the earlier weeks (e.g. -14d@d to -13d@d, -21d@d to -20d@d) for however many past weeks you are interested in. All searches can be scheduled for the exact same time.
8. Then, in your dashboard, use a search that pulls the data written to the summary index.
The trick here is that the secondary saved searches, which are set with a relative time range, use the acceleration correctly and return the data for the selected past days very quickly, taking only 30 to 60 seconds.
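To make the steps concrete, here is a sketch of what the two pieces could look like. The marker name eventshistoryd1 and the use of the default summary index are assumptions carried over from step 3; adjust them to your setup.

Secondary saved search (scheduled, relative time range one week back, not accelerated):

| savedsearch accmetricps4createaccountallhistory | collect marker="eventshistoryd1"

Dashboard search pulling the summarized data back out, followed by whatever reporting commands you need:

index=summary marker="eventshistoryd1" OR marker="eventshistoryd2"

The marker option of collect appends the given key=value pair to every summarized event, which is what lets the dashboard search filter on it.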
Having now seen the accelerated search, I'd say it's slow because you're keeping three months' worth of to-the-minute data - that's a lot of rows to process, considering the final search throws the resolution away and only goes by hours.
A minor thought: move the eval(response_time/1000000) to after the stats. That way you'll divide once per minute rather than once per event. That won't drastically change your speed, but still... Similarly, you could forego the count(eval(NOT status=201)) as "Fail" and compute it as Total-Succ after the stats.
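Put together, the suggestion is something like this sketch. The field names response_time, status, Total, and Succ are taken from the snippets above; the full base search isn't shown here, so treat this as illustrative only:

... | stats avg(response_time) as response_time, count as Total, count(eval(status=201)) as Succ by _time
| eval Latency=response_time/1000000
| eval Fail=Total-Succ

Since the average of scaled values equals the scaled average, dividing once per minute-row after the stats gives the same Latency as dividing every raw event before it.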
The original accelerated query, when executed for the last 24 hours, returns all the data in 6 seconds. But it won't work that way if you try to look at the data from a day 6 weeks ago, because you can't request that without losing Splunk's recognition that it's dealing with an accelerated search. Furthermore, as I indicated earlier, all my experiments show that accelerated search performance is not linear. So pulling all the data from 6 weeks ago takes more than 1.5 hours. The real problem is the inability to specify a time range in the query that retrieves the results.
However as indicated in my own answer below, I found a way around this issue.
Also, contrary to what anybody might assume, the 1-minute resolution doesn't get lost by putting the data into 1-hour bins. What happens is that each record gets its _time minute component set to 0, and the percentile still processes all the 1-minute records, computing the percentile value over all 60 records for each hour.
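In other words, the data flows through the pipeline above like this (a sketch of just the relevant stages):

... | bin _time span=1m | stats ... by _time
| bucket _time span=1h
| stats perc95(lat) as LATENCYMAX95 by _time

The bin span=1m in the base search yields one row per minute; bucket span=1h then only floors each row's _time to the hour without merging rows, so the final stats computes each hourly percentile over the roughly 60 minute-rows that share that hour.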
But you are right that I could have removed the Fail part. I simply forgot to clean it out. Thanks for bringing that to my attention.