Deployment Architecture

Datamodel Acceleration Consuming High Memory at certain times of the day

nh2017
Observer

Hi Everyone,

Our environment consists of an indexer cluster and independent search heads (SHs). ES runs on a single SH. We are seeing memory usage spikes on the indexers at certain times of the day or night, with no consistency or pattern to it. Resource usage usually drops after a few hours without much intervention. Sometimes a peer is considered "down" when there is excessive memory and CPU usage on that peer; when this happens, the cluster tries to recover, which causes a lot of unnecessary bucket fixup. We have not upgraded the servers recently or updated ES. I can provide more details based on your questions. Here are a few observations:

1. When memory spikes on the indexers, there are multiple executions of the datamodel accelerations running at the same instant (going by _time); the count is 2 or 3, and max concurrency for datamodels is set to 3. At other times (when memory usage is low), only one execution is seen. Please see the screenshot below for clarification of the count I am referring to; a search for spotting this concurrency is sketched after this list:

[Screenshot attached: Screen Shot 2021-03-04 at 5.47.17 PM.png]

 

2. On some days, search concurrency in the cluster was too high (over 200). I am working on reducing the number of concurrent searches allowed on the SH and the share available to scheduled searches (the relevant limits.conf settings are sketched after this list). But this is also not consistent: for example, on days when we did not have that many concurrent users or searches in the environment, we still saw high memory usage across the indexers.
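
One rough way to see how many acceleration searches finish (and how long they run) around a spike is to count completed searches in the _audit index. This is only a sketch: it counts completions per 5-minute bucket rather than true instantaneous concurrency, and it assumes all acceleration searches are named _ACCELERATE_*, so adjust for your environment.

index=_audit action=search info=completed savedsearch_name="_ACCELERATE_*"
| timechart span=5m count AS completed_accelerations max(total_run_time) AS longest_run_time_sec

Run it over a window that includes one of the spikes and compare against the quiet periods.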
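
For observation 2, the knobs that usually matter are the overall search quota, the slice the scheduler and auto-summarization (acceleration) searches may use, and the per-model concurrency cap. A rough sketch of where those live; the values shown are placeholders rather than recommendations, and My_DataModel is a hypothetical model name:

# limits.conf on the search head (placeholder values)
[search]
base_max_searches = 6
max_searches_per_cpu = 1

[scheduler]
# portion of the overall search quota the scheduler may use
max_searches_perc = 50
# portion of the scheduler's slice available to acceleration (auto-summary) searches
auto_summary_perc = 50

# datamodels.conf, per data model (hypothetical stanza name)
[My_DataModel]
acceleration.max_concurrent = 2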

Any help or insight would be appreciated. We are working with Support as well, but it's unclear why the datamodels suddenly push the indexers to use over 80% of memory. Our machines are over-provisioned for the most part.

For example, an acceleration that normally takes less than 3 GB would suddenly take over 5 GB or even 9 GB of memory.
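
To see which acceleration search is actually grabbing the memory when this happens, the per-process resource usage data in _introspection can be split by the acceleration's label. A sketch, assuming splunkd's data.mem_used figure (reported per search process, in MB as far as I know) and that acceleration searches show up with labels starting with _ACCELERATE_:

index=_introspection sourcetype=splunk_resource_usage component=PerProcess data.search_props.label="_ACCELERATE_*"
| eval mem_used_mb = 'data.mem_used'
| timechart span=5m max(mem_used_mb) by data.search_props.label

That should show whether it is one specific datamodel jumping from ~3 GB to 9 GB or several stacking up at once.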

 

Thanks!

 

 


richgalloway
SplunkTrust

Memory use is related to the number of events processed by the datamodel.  Is it possible the periods of high utilization are during times of increased data ingestion?  That would mean the DM has to process more events and therefore use more memory.
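
One rough way to check is to chart indexing throughput from metrics.log on the indexers and compare it against the times of the memory spikes. Something along these lines (adjust the span; limit=10 just keeps the chart readable):

index=_internal source=*metrics.log* group=per_index_thruput
| timechart span=10m limit=10 sum(kb) by series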

---
If this reply helps you, Karma would be appreciated.

nh2017
Observer

Hi @richgalloway ,

On some days, there was an increase in indexing rate before the concurrent DM accelerations kicked off. On other days, ingestion dropped significantly just before we got the high-memory alert.
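
A rough way to put the two on one timeline, if that helps line up the timing: the series are in different units (KB indexed vs. MB of memory), so this only shows when each one moves, not relative size, and it assumes acceleration searches appear in _introspection with labels starting with _ACCELERATE_.

| multisearch
    [ search index=_internal source=*metrics.log* group=per_index_thruput
      | eval series="indexed_kb", value=kb ]
    [ search index=_introspection sourcetype=splunk_resource_usage component=PerProcess data.search_props.label="_ACCELERATE_*"
      | eval series="accel_mem_mb", value='data.mem_used' ]
| timechart span=5m sum(value) by series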
