There are many reports of high CPU or memory utilization on the indexers after upgrading Splunk Enterprise Security (ES) to version 4.7.0 or later. Most users report that the problem started after upgrading to ES 4.7.x and that everything worked fine before the upgrade.
The cause is the default, out-of-the-box settings in the ES data model configuration. After a fresh installation of, or an upgrade to, 4.7.x, users need to adjust the ES data model settings to suit their environment and business needs.
By default, all ES data models are configured to search across ALL indexes, which results in extremely high memory utilization on the indexers.
This is also documented in the Splunk manual at
http://docs.splunk.com/Documentation/ES/4.7.2/Install/Datamodels#Constrain_data_model_searches_to_sp...
which states:
"You can constrain the indexes searched by a data model to improve performance. By default, data model acceleration searches search all indexes, which can lead to high memory consumption on indexers"
audit.log shows when inefficient acceleration searches are running across all indexes: the index constraint in the search is empty, "((())"
e.g.
search='summarize tstats=t action=probe id=DM_Splunk_SA_CIM_Change_Analysis normid= [ search (index=* OR index=_) *((()) tag=change)
A full audit.log entry:
09-08-2017 22:01:31.971 -0400 INFO AuditLogger - Audit:[timestamp=09-08-2017 22:01:31.971, user=splunk-system-user, action=search, info=granted , search_id='SummaryDirector_1504922491.13494', search='summarize tstats=t action=probe id=DM_Splunk_SA_CIM_Vulnerabilities normid= [ search (index=* OR index=_) *((()) tag=vulnerability tag=report) ...
A good search does not contain "((())" and lists the selected indexes instead:
09-10-2017 12:46:02.573 -0400 INFO AuditLogger - Audit:[timestamp=09-10-2017 12:46:02.573, user=splunk-system-user, action=search, info=granted , search_id='scheduler_nobody_U3BsdW5rX1NBX0NJTQRMD5cefc72a72dd5ee92_at_1505061960_7217', search='| summarize tstats=t override=partial manual_rebuilds=t max_time=3600 poll_buckets_until_maxtime=f id=DM_Splunk_SA_CIM_Vulnerabilities [ search (index=* OR index=*) (((index="whois" OR index="wineventlog")) tag=vulnerability tag=report) ...
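To review more than a handful of audit.log entries, the check above can be scripted. The following is a minimal sketch, not an official Splunk tool: it assumes that an unconstrained acceleration search always contains the empty group "(())" seen in the entries above, and the function name check_audit_line is made up for illustration.

```python
import re

# Data model id inside a summarize search, e.g. id=DM_Splunk_SA_CIM_Vulnerabilities.
# The \b keeps the pattern from matching the "search_id=" field of the same entry.
DM_ID = re.compile(r"\bid=(DM_[A-Za-z0-9_]+)")

# Assumption: an empty index constraint appears as the empty group "(())",
# which is part of the "((())" pattern shown in the bad entries above.
EMPTY_CONSTRAINT = "(())"


def check_audit_line(line):
    """Return (data_model_id, is_unconstrained), or None if the line is not
    a data model summarize search."""
    if "summarize" not in line:
        return None
    match = DM_ID.search(line)
    if match is None:
        return None
    return match.group(1), EMPTY_CONSTRAINT in line
```

Feeding every line of audit.log through check_audit_line and collecting the ids where the second value is True gives the list of data models that still search all indexes.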
How to fix?
1. In the ES app, navigate to Configure -> CIM Setup.
2. Select a data model and, under the Indexes tab, select the indexes that are used by that particular data model.
How to find the list of indexes used by a data model?
Run a tstats search against the data model, split by index
e.g.
| tstats count from datamodel=Malware by index
| tstats count from datamodel=Authentication by index
| tstats count from datamodel=Application_State by index
| tstats count from datamodel=Network_Traffic by index
3. Save the changes.
4. Navigate to Settings -> Data Models and rebuild the data model.
5. Check audit.log.
The acceleration searches should look similar to this, without "((())"
id=DM_Splunk_SA_CIM_Vulnerabilities [ search (index=* OR index=*) (((index="whois" OR index="wineventlog")) tag=vulnerability tag=report)
6. Repeat the same steps for all other ES data models.
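Since the steps have to be repeated for every data model, a small helper can save some typing. This is only a sketch under stated assumptions: the function names tstats_by_index and constrained_indexes are hypothetical, and the index extraction assumes the constrained indexes appear as quoted index="..." terms, as in the audit.log example above.

```python
import re


def tstats_by_index(data_models):
    """Build one '| tstats ... by index' search per data model name (step 2)."""
    return [f"| tstats count from datamodel={name} by index" for name in data_models]


# Assumption: constrained indexes appear as quoted terms such as index="whois".
INDEX_TERM = re.compile(r'index="([^"]+)"')


def constrained_indexes(search):
    """List the indexes an acceleration search is constrained to (step 5).

    An empty list means no quoted index constraint was found in the search.
    """
    return INDEX_TERM.findall(search)
```

For example, tstats_by_index(["Malware", "Authentication"]) produces the first two searches listed in step 2, and constrained_indexes applied to the search string from step 5 should list whois and wineventlog.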
Additional note:
By default, data models are configured to run three concurrent acceleration instances per data model, which can add to resource usage on the indexers.
To reduce resource utilization, reduce the number of concurrent acceleration searches to one.
In the ES app, navigate to Configure -> CIM Setup -> Settings tab.
For every accelerated data model, change the setting
From:
acceleration.max_concurrent = 3
To:
acceleration.max_concurrent = 1
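If there are many accelerated data models, the same setting can be written out for review before it is applied. The sketch below only generates text. It assumes that the setting is stored in a datamodels.conf stanza named after each data model; where that file lives in your deployment is not covered here, and the function name max_concurrent_overrides is hypothetical. Making the change through the CIM Setup page as described above remains the safer route.

```python
def max_concurrent_overrides(data_models, max_concurrent=1):
    """Return datamodels.conf-style stanzas that set acceleration.max_concurrent.

    Assumption: each stanza is named after its data model, e.g. [Authentication].
    """
    stanzas = []
    for name in data_models:
        stanzas.append(f"[{name}]\nacceleration.max_concurrent = {max_concurrent}\n")
    # Separate stanzas with a blank line for readability.
    return "\n".join(stanzas)
```

Calling max_concurrent_overrides(["Authentication", "Network_Traffic"]) returns two stanzas, one per data model, each with acceleration.max_concurrent = 1.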