Hello,
I am trying to optimize my infrastructure's data models. I am following this guide from Lantern:
Optimizing data model acceleration for better performance
Backfill Range
By default it was set to "Match Summary Range" (7 days), with a cron schedule running every 15 minutes.
From what I understand, this can be optimized, since there is no need to backfill 7 days of data on every run.
With this in mind, I changed the values as below:
Accelerate: Yes
Summary Range: 7 Days
Backfill Range: -1200sec
Max Summarization Search Time: Custom --> 1200 (sec)
Maximum Concurrent Summarization Searches: 4
Poll Buckets For Data to Summarize: No
Summarization Period: 14-59/25 * * * *
Automatic Rebuilds: Yes
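As a quick sanity check of the values above, here is a minimal sketch in plain Python (nothing Splunk-specific). It converts the 1200-second backfill to minutes and expands the minute field of the cron expression using standard cron range/step semantics. The helper name `expand_cron_minutes` is hypothetical, and it only handles the `start-end/step` form used here.

```python
def expand_cron_minutes(field: str) -> list[int]:
    """Expand a cron minute field of the form 'start-end/step' into the minutes it matches."""
    range_part, step = field.split("/")
    start, end = (int(x) for x in range_part.split("-"))
    # Standard cron semantics: begin at `start`, advance by `step`, stay within `end`.
    return list(range(start, end + 1, int(step)))


backfill_seconds = 1200
backfill_minutes = backfill_seconds / 60
cron_minutes = expand_cron_minutes("14-59/25")

print(backfill_minutes)  # 20.0
print(cron_minutes)      # [14, 39]
```

So the backfill window is 20 minutes, and the summarization search is scheduled at minutes 14 and 39 of every hour.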
Since then, my ESCU rules have stopped triggering and searches have been very slow.
What may be the root cause?
Automatic Rebuild: Yes ?
Did this setting automatically rebuild the data model and keep a backfill of only 1200 sec (20 minutes), so that the rules did not trigger effectively?
Any tips for optimizing my datamodels along with this guide?
Thank you in advance.
Christos
As I understand it (but don't take my word for it; you might want to double-check with the docs or by testing), the summary range tells Splunk how long the accelerated summaries are to be retained (unless they roll to frozen with the buckets, which can happen if you have a short retention period or low space limits!), and the backfill range tells it how far back the summary is to be (re)built at each summary-building search run.
So when you enable DAS with a summary range of 28 days and a backfill of just one day, unless you manually build more summaries "backward" (which I have never done), your DAS would initially cover only one day of historic data. With subsequent updates you'd accumulate more and more, as the more current data gets summarized into DAS, but Splunk would not create summaries beyond the initial "one day back" threshold unless you triggered it manually.
After 27 days you'd have your full 28 days of DAS filled.
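To make that arithmetic concrete, here is a minimal sketch of the model as I described it. This is a toy model of my understanding, not Splunk's actual summarization logic, and the function name `covered_days` is made up for illustration.

```python
def covered_days(days_since_enable: int, summary_range_days: int, backfill_days: int) -> int:
    """Days of history covered by summaries under the toy model described above."""
    # Coverage starts at the backfill window and grows by one day per elapsed day,
    # capped at the summary range (older summaries are not retained beyond it).
    return min(summary_range_days, backfill_days + days_since_enable)


for day in (0, 1, 13, 27, 40):
    print(day, covered_days(day, summary_range_days=28, backfill_days=1))
# 0 1
# 1 2
# 13 14
# 27 28
# 40 28
```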
Hello @PickleRick,
thanks for your reply. Yes I understand that and it seems logical.
My question is:
In an already built data model with a summary range of 1 week, if I change the backfill range to, let's say, 30 minutes, the next run will start pulling data only from the last 30 minutes, is that correct?
If yes, then if "Automatic Rebuild" is enabled, will this trigger a rebuild of my DM?
This apparently happened (DM was rebuilt) and I am trying to understand what was the root cause as this affected ESCU triggering.
Hope my explanation is clear enough!
Thanks a lot.
Christos
Again - that's only my educated guess, but if you had a data model with acceleration set to range=1d, backfill=1d and someone changed the backfill to 30m, subsequent searches would refresh only the last 30 minutes - previous data would simply roll with the buckets as it aged.
But if at this point someone requested that Splunk rebuild the summaries, it would build only the last 30 minutes' worth of acceleration.
So of course, if you have detections based on data model tstats searches with summariesonly=t, you won't get any results from before the immediate backfill range until the summaries have propagated "backward" with time.
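A rough illustration of that last point, again as a toy model of my guess rather than anything taken from Splunk itself: given a rebuild time, a backfill window and a detection's lookback, it computes how much of the lookback actually has summaries behind it. The function name `visible_window` and the timestamps are made up, and it assumes the summaries stay current up to "now" after the rebuild.

```python
from datetime import datetime, timedelta


def visible_window(now, rebuild_time, backfill, lookback):
    """Return (start, end) of the part of [now - lookback, now] that has summaries."""
    # Assumption: after a rebuild, the oldest summarized data is rebuild_time - backfill.
    summarized_from = rebuild_time - backfill
    start = max(now - lookback, summarized_from)
    return start, now


rebuild = datetime(2024, 1, 1, 12, 0)   # hypothetical rebuild time
now = rebuild + timedelta(minutes=10)   # detection runs 10 minutes later
start, end = visible_window(
    now, rebuild, backfill=timedelta(minutes=30), lookback=timedelta(hours=24)
)
print(end - start)  # 0:40:00
```

Under these assumptions, a detection looking back 24 hours would only see 40 minutes of summarized data with summariesonly=t.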