We are using data model summaries heavily for Splunk Enterprise Security, and data model acceleration is quite slow. Are there any good practices to speed up this acceleration from a design point of view? Would the suggestions below help?
- Use a separate filesystem for tstatsHomePath
- Use an SSD-backed filesystem for tstatsHomePath
- Use acceleration.max_time and acceleration.backfill_time in datamodels.conf if possible
Any other suggestions would be very helpful.
Have you also tuned the DM constraints? By default they usually just use tags, we've constrained by index and in some cases sourcetype.
Thanks for your reply. How do you constrain by index/sourcetype? Do you individually amend Splunk_TA_CIM's queries and put index=xyz? An example would be very helpful. cheers
Get the latest CIM app, then go to Setup; it has a GUI there to limit the indexes.
Constraining by index is cool, but the admin is on the hook for updating the index constraints whenever a new index with applicable data for that data model is added. If you're good at documenting and maintenance, it's a great choice.
That's exactly what I do. There is a set of macros you can use in the latest version of the CIM app, but I don't use them and instead simply hard-code the constraints.
So for example in the Application State datamodel, I have the following:
(index=os OR index=sos OR index=windows) AND (tag=listening tag=port)
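If you do want the macro route instead of hard-coding, recent versions of the CIM app populate a per-model index macro from its Setup page. A sketch of what that looks like in macros.conf (the macro name follows the CIM app's cim_<DataModel>_indexes convention; the index list is just an example):

```ini
# macros.conf (sketch -- index list is illustrative)
[cim_Application_State_indexes]
definition = (index=os OR index=sos OR index=windows)
```

The data model constraint then expands this macro, so updating the Setup page (or this stanza) updates the constraint in one place.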
Yes, using SSD will absolutely help overall performance.
Your indexes should be volume managed such that the hot/warm path is pretty much always on the fastest storage possible, sized for your most common search ranges and data model acceleration time ranges.
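As a sketch of the volume approach (paths and sizes are examples only; tstatsHomePath is the indexes.conf setting that holds the data model acceleration summaries):

```ini
# indexes.conf (sketch -- adjust paths/sizes for your environment)
[volume:ssd_fast]
path = /mnt/ssd/splunk
maxVolumeDataSizeMB = 500000

[default]
# keep acceleration summaries on the fast volume
tstatsHomePath = volume:ssd_fast/$_index_name/datamodel_summary
```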
Yes, check out acceleration.max_time and acceleration.backfill_time; these can be limited to the specific time ranges your use cases need, like 7 days or 30 days.
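For example, in a local/datamodels.conf (the stanza name and time ranges here are illustrative, not recommendations):

```ini
# datamodels.conf (sketch)
[Authentication]
acceleration = true
# keep only 30 days of summary data
acceleration.earliest_time = -30d
# only backfill the last 7 days when building
acceleration.backfill_time = -7d
# cap each acceleration search run at 1 hour (seconds)
acceleration.max_time = 3600
```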
Some other easy performance tweaks:
Upgrade to the latest CIM App.
Go to Manage Apps - Splunk_TA_CIM - Setup.
For the data models that you use, e.g. Authentication, Web, add the specific indexes that contain the related data.
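After constraining, you can sanity-check which indexes and sourcetypes actually land in an accelerated model with something like this (the data model name is just an example):

```
| tstats summariesonly=true count from datamodel=Authentication by index, sourcetype
```

Anything you expected to see but don't is either missing from your index constraints or not yet summarized.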
In Enterprise Security, check out Audit - Data Model Audit, and see which ones are taking a long time.
Change the retention time for large models to reduce the size. Usually 7 days or 30 days is sufficient for most use cases.
And you can also go to Settings - Data Inputs and, on the modular input, turn off Enforce Acceleration.
Finally, if you are seeing models that take a long time to build, you might be running a lot of correlation searches at the same time, competing with the acceleration searches the data model is trying to finish. This is usually a CPU bottleneck. https://wiki.splunk.com/Community:TroubleshootingSearchQuotas
Look at the max_searches_per_cpu setting in limits.conf.
This is usually where the bottleneck is: check whether you have more cores that can be utilized, and let the splunk user run more searches per CPU.
base_max_searches = <int>
* A constant to add to the maximum number of searches, computed as a multiplier
of the CPUs.
* Defaults to 6

max_searches_per_cpu = <int>
* The maximum number of concurrent historical searches per CPU. The system-wide
limit of historical searches is computed as:
max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
* Note: the maximum number of real-time searches is computed as:
max_rt_searches = max_rt_search_multiplier x max_hist_searches
* Defaults to 1
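Plugging example numbers into that formula (assuming a 16-core search head and the defaults above):

```
max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
                  = 1 x 16 + 6
                  = 22 concurrent historical searches
```

If your correlation searches plus acceleration searches regularly exceed that ceiling, they queue, and the data model falls behind.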
One more tweak: make sure you have set up your environment correctly by disabling Transparent Huge Pages: http://docs.splunk.com/Documentation/Splunk/6.2.5/ReleaseNotes/SplunkandTHP