Memory use datamodels


We're seeing massive memory use (20GB+) by the Network_Traffic datamodel acceleration searches.
The limits.conf default max_mem_usage_mb is set to 200, but the tstats search doesn't seem to respect it. The searches also keep running for about 62 minutes even though max_time is set to 3600 (seconds). Linux often kills the processes for running out of memory (OOM killer).
The Splunk version is 7.2.6, and we're using a search peer cluster. I don't see any different max_mem_usage_mb settings on the indexers or the search head. How do we ensure that the acceleration searches run fine but don't take 20GB+ of memory?

Kind regards,


Splunk Employee

Thanks for reporting this, @mmoermans!

max_mem_usage_mb only controls how much memory is used to cache events and the result set for a particular search; it does not limit the memory usage of the 'splunkd' process. Any result set larger than 200MB (the default) spills to disk, so you are most likely also seeing high disk I/O and a ton of files that look like /statstmp_partition0_1555718872.35.srs.zst. There is a good chance that a big portion of those 62 minutes is spent on disk I/O. Your Network_Traffic data model is probably ingesting 1) a huge volume of events that 2) have high cardinality. By nature, this makes acceleration expensive (high memory and CPU cost).

There are a few things you can try to speed things up:

  1. Use index scoping. By default, the data model looks at all indexes that contain a matching tag (e.g., pci). If other indexes that you are not really interested in accelerating as part of Network_Traffic happen to carry the same tag, they will be scanned for acceleration as well. Scoping the data model to the particular indexes you care about gives you control over this, and chances are acceleration will be much faster. See the attached screenshot showing where to define index scoping.
  2. Yes, you can try increasing max_mem_usage_mb. On a host with 64GB RAM, 1GB is only about 1.6% of memory, so it is a good starting point to try. Spilling fewer temporary results to disk also reduces the associated memory overhead, which may lower the total memory usage.
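
For reference, the two suggestions above might look roughly like this as configuration. This is only a sketch under assumptions: the macro name cim_Network_Traffic_indexes follows the naming convention of the Splunk Common Information Model add-on and may differ from where the attached screenshot sets index scoping, the index names firewall and netflow are hypothetical placeholders, and 1024 is simply the 1GB starting point mentioned above. Check the file locations and stanza names against your own deployment and version before applying anything.

```ini
# macros.conf (assumed location: local directory of the CIM add-on)
# Assumption: index scoping for Network_Traffic is stored in this macro.
# "firewall" and "netflow" are hypothetical index names; use your own.
[cim_Network_Traffic_indexes]
definition = (index=firewall OR index=netflow)

# limits.conf
# Raises the in-memory cache per search from the 200MB default to 1GB,
# so that fewer intermediate results spill to disk.
[default]
max_mem_usage_mb = 1024
```

A reasonable approach is to change one setting at a time and compare the run time and peak memory of the acceleration searches before and after, so you can tell which change actually helped.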

[Screenshot: index scoping]
Let me know how it goes.
