We are using data model summaries heavily for Splunk Enterprise Security, and data model acceleration is quite slow. Are there any good practices to speed up this acceleration from a design point of view? Would the suggestions below help?
- Use a separate filesystem for tstatsHomePath
- Use an SSD-backed filesystem for tstatsHomePath
- Use acceleration.max_time and acceleration.backfill_time in datamodels.conf if possible
Any other suggestions would be very helpful.
Have you also tuned the DM constraints? By default they usually just use tags, we've constrained by index and in some cases sourcetype.
Thanks for your reply. How do you constrain by index/sourcetype? Do you individually amend Splunk_TA_CIM's queries and put index=xyz? An example would be very helpful. cheers
Get the latest CIM app, then go to Setup; it has a GUI there to limit the indexes.
Constraining by index is cool, but the admin is on the hook for updating the index constraints whenever a new index with applicable data for that data model is added. If you're good at documenting and maintenance, it's a great choice.
That's exactly what I do. There is a set of macros you can use in the latest version of the CIM app, but I don't use them and instead simply hard-code the constraints.
So for example in the Application State datamodel, I have the following:
(index=os OR index=sos OR index=windows) AND (tag=listening tag=port)
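If you do want the macro route instead of hard-coding, recent versions of the CIM app populate a per-model index macro from its Setup page. A sketch of what that looks like in macros.conf (the macro name follows the CIM app's cim_<DataModel>_indexes convention; the index list is just an example):

```ini
# macros.conf (sketch -- index list is illustrative)
[cim_Application_State_indexes]
definition = (index=os OR index=sos OR index=windows)
```

The data model constraint then expands this macro, so updating the Setup page (or this stanza) updates the constraint in one place.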
Yes, using SSD will absolutely help overall performance.
Your indexes should be volume managed such that the hot/warm path is pretty much always on the fastest storage possible, sized for your most common search ranges and data model acceleration time ranges.
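As a sketch of the volume approach (paths and sizes are examples only; tstatsHomePath is the indexes.conf setting that holds the data model acceleration summaries):

```ini
# indexes.conf (sketch -- adjust paths/sizes for your environment)
[volume:ssd_fast]
path = /mnt/ssd/splunk
maxVolumeDataSizeMB = 500000

[default]
# keep acceleration summaries on the fast volume
tstatsHomePath = volume:ssd_fast/$_index_name/datamodel_summary
```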
Yes, check out acceleration.max_time and acceleration.backfill_time; these can be limited to the specific time ranges your use cases need, like 7 days or 30 days.
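For example, in a local/datamodels.conf (the stanza name and time ranges here are illustrative, not recommendations):

```ini
# datamodels.conf (sketch)
[Authentication]
acceleration = true
# keep only 30 days of summary data
acceleration.earliest_time = -30d
# only backfill the last 7 days when building
acceleration.backfill_time = -7d
# cap each acceleration search run at 1 hour (seconds)
acceleration.max_time = 3600
```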
Some other easy performance tweaks:
Upgrade to the latest CIM App.
Go to Manage Apps - Splunk_TA_CIM - Setup.
For the data models that you use, e.g. Authentication, Web, add the specific indexes that contain the related data.
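After constraining, you can sanity-check which indexes and sourcetypes actually land in an accelerated model with something like this (the data model name is just an example):

```
| tstats summariesonly=true count from datamodel=Authentication by index, sourcetype
```

Anything you expected to see but don't is either missing from your index constraints or not yet summarized.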
In Enterprise Security, check out Audit - Data Model Audit, and see which ones are taking a long time.
Change the retention time for large models to reduce the size. Usually 7 days or 30 days is sufficient for most use cases.
And you can also go to Settings - Data Inputs and, on the modular input, turn off Enforce Acceleration.
Finally, if you are seeing models that take a long time to build, you might be running a lot of correlation searches at the same time, competing with the acceleration searches the data model is trying to finish. This is usually a CPU bottleneck. https://wiki.splunk.com/Community:TroubleshootingSearchQuotas
Look at the max_searches_per_cpu setting in limits.conf.
This is usually where the bottleneck is: check whether you have more cores that can be utilized, and let the splunk user run more searches per CPU.
base_max_searches = <int>
* A constant to add to the maximum number of searches, computed as a multiplier
of the CPUs.
* Defaults to 6

max_searches_per_cpu = <int>
* The maximum number of concurrent historical searches per CPU. The system-wide
limit of historical searches is computed as:
max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
* Note: the maximum number of real-time searches is computed as:
max_rt_searches = max_rt_search_multiplier x max_hist_searches
* Defaults to 1
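Plugging example numbers into that formula (assuming a 16-core search head and the defaults above):

```
max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
                  = 1 x 16 + 6
                  = 22 concurrent historical searches
```

If your correlation searches plus acceleration searches regularly exceed that ceiling, they queue, and the data model falls behind.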
One more tweak: make sure you have set up your environment correctly by disabling Transparent Huge Pages: http://docs.splunk.com/Documentation/Splunk/6.2.5/ReleaseNotes/SplunkandTHP