Splunk Enterprise Security

Can I clean up the datamodel_summary directories that are growing by a couple dozen GB/day?

ericlarsen
Path Finder

We just implemented Splunk Enterprise Security about a month ago. We're new to data models, acceleration, and any implications they may have on our Splunk environment.

I noticed the datamodel_summary directory in our firewall logs index ($SPLUNK_HOME/var/lib/splunk/pan_logs/) has grown incredibly large (850 GB, and it is growing by a couple dozen GB/day).

I need to understand why. We have the Palo Alto app installed as well, and its Palo Alto Networks Firewall Logs data model (7 days of acceleration) is 100 GB.

In ES, the Network Traffic data model (30 days of acceleration), part of the Splunk_SA_CIM app, is 300+ GB!

There are approximately 350 directories under $SPLUNK_HOME/var/lib/splunk/pan_logs/datamodel_summary. Are all of these really necessary, or can I set up some kind of cleanup in this directory to recover space?
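For reference, the sizes above can be checked with something like the following on an indexer (assuming a *nix host; adjust the path for your environment):

# total size of the summary directory, then the largest per-bucket subdirectories
du -sh $SPLUNK_HOME/var/lib/splunk/pan_logs/datamodel_summary
du -s $SPLUNK_HOME/var/lib/splunk/pan_logs/datamodel_summary/* | sort -n | tail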

Any help in understanding how data models are stored/cleaned up would be greatly appreciated.
Thanks.


lguinn2
Legend

I don't think that you should manually delete any of the data model acceleration files. The size of these files is related to two things: the number of events in the associated index and the number of days of acceleration. I am not surprised to find that your data model summary information is quite large.

To fix it, you may want to decrease the number of days of acceleration for some (or all) data models. Clearly, 30 days of acceleration is going to be approximately 4x as large as 7 days of acceleration over the same index.
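For example, the summary range maps to the acceleration.earliest_time setting in datamodels.conf (normally changed through Settings > Data models > Edit Acceleration in Splunk Web rather than by editing the file directly). As a rough sketch only, using the CIM Network Traffic model as the example, dropping it from 30 days to 7 would look like:

[Network_Traffic]
acceleration = 1
# keep only 7 days of summaries instead of 30
acceleration.earliest_time = -7d

Once the range is shortened, the expectation is that Splunk trims the older summary buckets on its own over time, rather than you deleting them by hand.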

The usual estimate for the size of the data model summary = Inbound data amount (GB or MB) * 3.4
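For example, with 100 GB of inbound data that works out to 100 GB * 3.4 = 340 GB of summary storage (100 GB here is just an illustrative figure, not your actual volume).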
You might want to take a look at this page of the documentation: Accelerate data models


ericlarsen
Path Finder

Thanks for the response.

How did you come up with the "data model summary = Inbound data amount (GB or MB) * 3.4" statement? Wouldn't it depend on the summary range of the data model?


lguinn2
Legend

That calculation is published in the Splunk® Enterprise Security Installation and Upgrade Manual, in the section on Data model acceleration storage and retention.

I just looked it up and it also says "This formula assumes that you are using the recommended retention rates for the accelerated data models." Here is the link:
http://docs.splunk.com/Documentation/ES/4.5.1/Install/Datamodels

If you are seeing something really different from what the documentation suggests, I think you should file a support ticket. If you just stop accelerating the data models, I am concerned that it might have a negative effect on your Enterprise Security correlation searches and alerts...
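Many of the ES correlation searches read only from the accelerated summaries, i.e. tstats with summariesonly=true, so a rough way to see what they depend on is a search along these lines (just an illustration against the CIM Network Traffic model, not an actual ES correlation search):

| tstats summariesonly=true count from datamodel=Network_Traffic where nodename=All_Traffic by All_Traffic.action

With acceleration disabled, a summariesonly=true search like this returns nothing, which is why the correlation searches and alerts would be affected.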
