Knowledge Management

How can I use summary indexes to summarize data and then discard the raw events?

Kendo213
Communicator

We have a virtualization index that currently has no retention or size limits on its hot/warm/cold buckets. After about 4 months we're averaging 16 GB indexed per day, with 1.8 TB (compressed) on disk and searchable. I'm proposing that we set a hard cap on this, as I don't believe keeping all of the data around is useful.

I'd like to use summary indexes to summarize the data and then discard the raw events. For example, compute the average CPU/memory usage and write it to a summary index, without keeping the source data around for long. I can see how to create a saved search that writes to a summary index over short timespans (e.g. the last hour). However, how would I do this retroactively on 1.8 TB worth of data, in chunks, so that trends can be seen? If I need to clarify the question, let me know.

Thanks


yannK
Splunk Employee

1- Create your summary searches: search index A, use the sitimechart or sistats commands to optimize the results, and save the results to an index B.
see http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing
2- Test it on new data, verify that you can retrieve the summarized events, then schedule it.
3- Run the backfill script for the time range before the scheduled summary search started, and wait for the jobs to complete.
see the backfill script: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Managesummaryindexgapsandoverlaps
It can take some time, so if you have many cores, you can ask the script to run several jobs in parallel (8, for example) to finish faster.
4- Profit: reduce the retention on index A, as you should no longer need the original raw data.
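
To make the "in chunks" part of step 3 concrete, here is a rough Python sketch of how a backfill range can be cut into windows that match the scheduled search's timespan, plus the arithmetic for step 4's retention value. This is only an illustration: the backfill script does the chunking for you, and the function names, dates, and 30-day retention figure below are hypothetical, not taken from the Splunk docs. The seconds value is the kind of number you would put in frozenTimePeriodInSecs for index A in indexes.conf.

```python
from datetime import datetime, timedelta


def backfill_windows(start, end, span):
    """Yield (earliest, latest) pairs that cover [start, end) in steps of span.

    Each pair stands for one run of the summary search over historical data.
    The last window is clipped so it never extends past `end`.
    """
    cursor = start
    while cursor < end:
        upper = min(cursor + span, end)
        yield cursor, upper
        cursor = upper


def retention_seconds(days):
    """Convert a retention period in days to seconds."""
    return days * 24 * 60 * 60


# Hypothetical example: one day of history, summarized in hourly chunks.
windows = list(
    backfill_windows(
        datetime(2024, 1, 1),
        datetime(2024, 1, 2),
        timedelta(hours=1),
    )
)

# Hypothetical example: keep raw data in index A for 30 days.
raw_retention = retention_seconds(30)
```

Four months of hourly summaries works out to roughly 2,900 such windows, which is why running the backfill with several parallel jobs helps.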
