Solved: SummaryDB Rolling over to Frozen

slierninja · ‎12-15-2012

Our Splunk summarydb index is enormous: it has maxed out at the default 500GB and has started moving content into our predefined frozen archive directory. There are over 3 billion events listed.

What is a good size for the summarydb index? Do you ever need to keep the frozen archive? I was afraid of losing the data, but now I'm not so sure we even need it. To my knowledge, we don't even make use of the summary index commands (si*). Is it safe to reduce the index's size limit from 500GB to 200GB or even lower? I don't see the advantage of keeping such a large summary index.

The maximum size of our other (non-summary) indexes is 50GB, and they have yet to roll over, so I'm not sure why the summarydb has such a large footprint when it's supposed to contain only summary information.

This has really only become an issue because we are doing a Splunk upgrade to 5.0 and need to back up our index databases per the installation guide steps.

BobM · ‎12-15-2012

The idea of a summary index is that it contains a subset of your live data to allow for faster searching. This summary data should only be kept as long as it is useful and should not get larger than the original.

As yours is growing out of control, it seems you have some badly set up summarising searches. First look for searches with "index=summary" in them to see what summarised data you are using. Then look for searches with si commands or the "collect" command in them and see if the data they are collecting is useful to you. If not, disable them. You may also find searches that could be merged. Make sure to check any apps you have installed as well.

Deleting summary data can cause inaccurate reports, so care should be taken, but in most cases the wanted results can be rebuilt with the backfill script. See http://docs.splunk.com/Documentation/Splunk/5.0/Knowledge/Managesummaryindexgapsandoverlaps for details.
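
If you would rather audit the config files than click through every app, here is a minimal Python sketch of the "look for si commands or collect" step. It is only an illustration, not a Splunk tool: `parse_conf` and `summary_writers` are hypothetical helper names, the parser is simplified (it ignores backslash line continuations), and the sample stanzas are made up. It assumes each saved search is a stanza with a `search` setting, and it also treats `action.summary_index = 1` as a sign that a search populates a summary index.

```python
import re

# Commands that write to a summary index: collect plus the si* family.
SUMMARY_CMD = re.compile(
    r"\|\s*(collect|sistats|sichart|sitimechart|sitop|sirare)\b"
)


def parse_conf(text):
    """Parse .conf-style text into {stanza: {key: value}}.

    Simplified on purpose: comments are skipped and backslash
    line continuations are NOT handled.
    """
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = stanzas.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            # Split on the first "=" only; search strings contain more.
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


def summary_writers(text):
    """Return names of saved searches that appear to feed a summary index."""
    hits = []
    for name, settings in parse_conf(text).items():
        flag = settings.get("action.summary_index", "0").lower()
        uses_cmd = SUMMARY_CMD.search(settings.get("search", ""))
        if flag in ("1", "true") or uses_cmd:
            hits.append(name)
    return hits


# Made-up stanzas for illustration only.
sample = """
[hourly_errors]
search = index=main error | sistats count by host
enableSched = 1

[ad_hoc_report]
search = index=main | stats count by sourcetype

[copy_everything]
search = index=main | collect index=summary
"""

print(summary_writers(sample))
```

To try it on a real system you would feed it the contents of each app's savedsearches.conf, then review the flagged searches by hand before disabling anything.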

slierninja · ‎12-17-2012

You rock! This helped us find the source of the problem. When we queried index=summary, we found the offending application that was filling up our summaryDB. For now we've disabled the saved search. It seems the scheduled search had no earliest time defined, so all events were getting put into the summaryDB on every scheduled run. Using the Manager > Searches and reports view, we set App Context to "All" so that we could see where our scheduled searches were, and verified that each had a proper start time in its Time Range field. Kudos! +1
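
For anyone who wants to run the same check without the UI, here is a rough Python sketch of it. It assumes the saved search settings have already been loaded into a dict keyed by search name, and that a scheduled search with a missing or empty `dispatch.earliest_time` runs over all time, which matches what we saw. The function name `unbounded_scheduled` and the sample data are hypothetical.

```python
def unbounded_scheduled(searches):
    """Return names of scheduled searches with no earliest time set."""
    flagged = []
    for name, settings in searches.items():
        scheduled = settings.get("enableSched", "0").lower() in ("1", "true")
        # A missing or empty value is treated as "no lower time bound".
        earliest = settings.get("dispatch.earliest_time", "").strip()
        if scheduled and not earliest:
            flagged.append(name)
    return flagged


# Made-up settings for illustration only.
searches = {
    "hourly_errors": {
        "enableSched": "1",
        "dispatch.earliest_time": "-1h@h",
        "dispatch.latest_time": "@h",
    },
    "copy_everything": {"enableSched": "1"},
    "ad_hoc_report": {"enableSched": "0"},
}

print(unbounded_scheduled(searches))
```

Anything this flags is worth opening in Manager to confirm its Time Range before re-enabling it.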
