Knowledge Management

Datamodel Acceleration Consistently Reaching Max Summarization Search Time

Trevorator
Explorer

Hello there, 

In our environment we have datamodel accelerations that are consistently reaching the Max Summarization Search Time, which is the default 3600 seconds. We know the issue is related to the resources allocated to the indexing tier as the accelerations are maxing out CPU. It will be remediated, but not immediately. 

What I am interested in finding out is how the limit is implemented. If an acceleration search never completes, and instead just times out before the next summary run starts, is there the potential for some data to never be accelerated?

We also currently have searches using summariesonly=t with a time range of -30m. Our max concurrent auto summarizations is 2, so I know there can be up to a 55-minute gap in tstats data, meaning those searches could miss events. While not best practice, could setting the max summarization search time to 1800 seconds be a potential solution?
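
For reference, the settings in question look roughly like this in datamodels.conf (a sketch; the stanza name is a placeholder for our data model):

    [My_Data_Model]
    acceleration = true
    # Max Summarization Search Time, in seconds (currently the 3600 default)
    acceleration.max_time = 3600
    # our current max concurrent auto summarization searches
    acceleration.max_concurrent = 2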

Thanks for your help!


livehybrid
Super Champion

Hi @Trevorator 

What are your acceleration.backfill_time and acceleration.earliest_time set to?

Reducing acceleration.max_time from 3600 seconds to 1800 seconds is unlikely to be a solution and may worsen the problem. If a summarization search requires, for example, 2000 seconds to process its assigned time range due to resource constraints, it would complete with a 3600-second timeout but would fail with an 1800-second timeout. This would lead to more frequent timeouts and potentially larger gaps in your accelerated data.

I think the best option is to determine why the search is taking so long to run. Is the DM restricted to only your required set of indexes?
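
One quick way to see how long the acceleration searches actually take is the scheduler log (a rough sketch; accelerated data model searches are scheduled under names beginning with _ACCELERATE_):

    index=_internal sourcetype=scheduler savedsearch_name="_ACCELERATE_*"
    | stats avg(run_time) as avg_runtime max(run_time) as max_runtime count(eval(status=="skipped")) as skipped by savedsearch_name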


gcusello
SplunkTrust

Hi @Trevorator ,

you have two solutions:

delay the time frame: e.g., if your acceleration has a delay of 5 minutes, use shifted time borders in your Correlation Searches, e.g. from -10m@m to -5m@m instead of from -5m@m to now.

otherwise, you can use the option summariesonly=false in your tstats command, so the command also reads the data that has not yet been accelerated; this solution is obviously less performant than the other.
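
As a concrete sketch of the first option (Authentication is just an example datamodel; adjust the borders to match your acceleration lag):

    | tstats summariesonly=true count from datamodel=Authentication where earliest=-10m@m latest=-5m@m by Authentication.user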

Ciao.

Giuseppe

Trevorator
Explorer

Hi @gcusello, that makes sense for the correlation searches, but I am still interested in the impact on the datamodel acceleration itself. Will there be issues in the tsidx files if the acceleration never fully completes? Or will the next summary pick up where it left off once it hits the summarization limit?

If it's the latter, does that mean the most recent data is consistently getting delayed in its acceleration because each acceleration search needs to catch up on the previous run's debt?


gcusello
SplunkTrust

Hi @Trevorator ,

As @Prewin27 pointed out, if your acceleration searches exceed the maximum time limit, you should analyze why this happens; in other words, check your storage performance and whether your system resources are sufficient.

For storage performance, check whether the IOPS value of each storage volume is greater than 800 using an external tool such as Bonnie++, and check how many CPUs you have on your indexers and Search Heads using the Monitoring Console.
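
As a rough sketch, you can also watch indexer CPU saturation from the introspection data (this assumes the standard splunk_resource_usage introspection fields):

    index=_introspection sourcetype=splunk_resource_usage component=Hostwide
    | eval cpu_pct='data.cpu_system_pct' + 'data.cpu_user_pct'
    | timechart span=5m avg(cpu_pct) by host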

Ciao.

Giuseppe

Prewin27
Contributor

@Trevorator 

After the initial creation of data model acceleration summaries, Splunk regularly runs scheduled summarization searches to incorporate new data and remove information that is older than the defined summary range.


If a summarization search exceeds the Max Summarization Search Time limit, it is stopped before completing its assigned interval. Normally, Splunk does not automatically retry or continue the interrupted summarization for that specific time window, which can result in gaps in your accelerated data if summarization searches repeatedly fail or time out. These gaps mean that some events will not be included in the .tsidx summary files, causing searches that rely on tstats summariesonly=true to miss those events.
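
If you want to confirm whether you have such gaps, one rough sketch is to compare summarized counts against full counts over the same window (Network_Traffic is just a placeholder datamodel):

    | tstats summariesonly=false count from datamodel=Network_Traffic where earliest=-24h by _time span=1h
    | rename count as total
    | join type=left _time
        [| tstats summariesonly=true count from datamodel=Network_Traffic where earliest=-24h by _time span=1h
        | rename count as summarized]
    | fillnull value=0 summarized
    | eval missing=total-summarized

Any bucket where missing is greater than zero was not fully summarized when the search ran.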

I would say the best approach is to address the resource constraints causing your summarization searches to run too long.

Regards,
Prewin

Trevorator
Explorer

@Prewin27 
This is what I was worried was the case. You said that "Normally, Splunk does not automatically retry or continue". Does that mean there is a setting we could enable to have Splunk retry, to ensure there is no loss in the .tsidx files in the short term? The goal is to have all data accelerated for the Enterprise Security searches. I know the long-term solution is new machines with better IOPS, but it may be some time before they are requisitioned.


Prewin27
Contributor

@Trevorator 
I don't think there is any Splunk setting to enable automatic retry or continuation for .tsidx file operations. The only way to ensure all data is accelerated and .tsidx files are complete is to maintain a healthy infrastructure and address any resource limitations.
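
One thing you can do in the meantime is monitor how complete each summary is via the summarization REST endpoint (a sketch; I'm assuming the summary.complete field, which reports completeness as a fraction):

    | rest /services/admin/summarization by_tstats=t
    | eval pct_complete=round('summary.complete'*100,1)
    | table summary.id pct_complete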
Regards,
Prewin
