I'm running into unexpected behavior with the Network_Traffic datamodel.
Here’s the configuration:
allow_old_summaries = true
allow_skew = 0
backfill_time = -300s
cron_schedule = 2-59/5 * * * *
earliest_time = -2h
hunk.compression_codec = -
hunk.dfs_block_size = 0
hunk.file_format = -
manual_rebuilds = true
max_concurrent = 3
max_time = 14400
poll_buckets_until_maxtime = false
schedule_priority = higher
workload_pool = -
According to the settings, I would expect the accelerated summaries to be limited to a 2-hour window (earliest_time = -2h), but when I query the datamodel I still see events much older than that. In fact, some are even 1000+ days old.
From what I understand:
Have you ever experienced this issue?
Could this be related to backfill behavior, the allow_old_summaries = true setting, or perhaps the way the datamodel was originally accelerated?
Any insight would be very helpful.
Hi Prewin,
the actual summary range is as below:
Regards,
Antonio
@antoniomarongiu
Is your backfill range the same as your summary range?
If so, set allow_old_summaries = false, rebuild the datamodel, and test again.
earliest_time controls how far back the summarization search runs each cycle, but it does not automatically purge older summary data once it exists. Because you have allow_old_summaries = true, searches can also use summaries that were built with an older definition of the datamodel.
Also, what is your summary range setting? (This is the actual retention horizon for summaries.)
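To see how far back the existing summaries actually reach, a search along these lines may help. It is only a sketch: it assumes the datamodel is named Network_Traffic as in the original post, that it is run over All Time, and the field names oldest_summarized and newest_summarized are just illustrative labels.

```
| tstats summariesonly=true allow_old_summaries=true count min(_time) as oldest_summarized max(_time) as newest_summarized from datamodel=Network_Traffic
| eval oldest_summarized=strftime(oldest_summarized, "%F %T"), newest_summarized=strftime(newest_summarized, "%F %T")
```

Running it once with allow_old_summaries=true and once with allow_old_summaries=false should show whether the old events come from summaries built against a previous datamodel definition.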
Regards,
Prewin
If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!
Hello Prewin,
After changing allow_old_summaries to false, I now have "only" 2 days of events, against the 2 hours configured in earliest_time. I need to follow up on the analysis.
Best Regards,
Antonio