
How to leverage summary indexes to summarize the data and then dump it?

Kendo213
Communicator

We have a virtualization index that currently has no hot/warm/cold restrictions. After about 4 months, we're sitting at 16 GB indexed per day (average), with 1.8 TB (compressed) on disk and searchable. I'm proposing that we set a hard cap on this, as I don't believe keeping all of the data around is useful.

I'm looking to leverage summary indexes so that I can summarize the data and then dump it. For example, grab the average CPU/memory usage and write it to a summary index, without keeping the source data around for long. I do see how I can create a saved search that outputs to a summary index over short timespans (e.g., the last hour). However, how would I do this retroactively on 1.8 TB worth of data, in chunks, so that trends can be seen? If I need to clarify the question, let me know.

Thanks


yannK
Splunk Employee

1- Create your summary searches: search index A, use the sitimechart or sistats commands to optimize the results, then save the results in index B.
see http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing
2- Test it on new data, verify that you can retrieve the events, and once you are happy, schedule it.
3- Run the backfill script for the time range prior to the scheduled summary search, and wait for the jobs to complete.
see the backfill script: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Managesummaryindexgapsandoverlaps
It can take some time, so if you have many cores, you can ask the script to spawn parallel jobs (8, for example) to complete faster.
4- Profit, and change your retention on index A, as you should no longer need the original raw data.
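
As a rough illustration of steps 3 and 4, here is a small Python sketch that only builds the backfill command line and computes a retention value; it does not run anything. The flags (-app, -name, -et, -lt, -j, -dedup, -auth) are the ones described on the backfill documentation page linked above, so please verify them against your Splunk version. The app name, saved search name, time range, credentials, and the 30-day retention are all made-up placeholders, not values from this thread.

```python
import shlex


def backfill_command(app, search_name, earliest, latest, jobs=8,
                     auth="admin:changeme"):
    """Build the argument list for Splunk's fill_summary_index.py script.

    Run the result from $SPLUNK_HOME/bin. Nothing is executed here.
    """
    return [
        "./splunk", "cmd", "python", "fill_summary_index.py",
        "-app", app,            # app that owns the saved summary search
        "-name", search_name,   # name of the scheduled summary search
        "-et", earliest,        # earliest time to backfill
        "-lt", latest,          # latest time to backfill
        "-j", str(jobs),        # number of parallel backfill jobs
        "-dedup", "true",       # skip time ranges that are already summarized
        "-auth", auth,          # placeholder credentials, replace with yours
    ]


def retention_seconds(days):
    """Convert a retention period in days to seconds, the unit used by
    frozenTimePeriodInSecs in indexes.conf."""
    return days * 24 * 60 * 60


if __name__ == "__main__":
    # Hypothetical names: app "search", saved search "virt_hourly_summary",
    # backfilling roughly the last 4 months up to the current hour.
    cmd = backfill_command("search", "virt_hourly_summary", "-4mon@mon", "@h")
    print(shlex.join(cmd))

    # Hypothetical 30-day retention for index A once the summaries exist.
    print("frozenTimePeriodInSecs =", retention_seconds(30))
```

The script itself splits the range into chunks based on the schedule of the saved search, so you should not need to slice the 4 months by hand. Only lower the retention on index A after checking that the summary index returns the trends you expect.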
