To reduce disk usage on the indexers, I heard that Splunk's shared data model acceleration can prevent duplication of the summary files that accelerated data models create. For example, with standalone search heads, SH#2 can reference the data model summaries built by SH#1 instead of creating its own.
If we are running a Search Head Cluster (SHC), can we use this shared data model concept to reduce disk usage on the indexers, or is this already handled automatically within the cluster?
With an SHC there is no need to explicitly share anything. Oversimplifying a bit, an SHC works like one big SH instance split across multiple machines, so a single SHC always uses the same Datamodel Acceleration Summaries (DAS). There is no need to build separate summaries for different members of the SHC.
If you have an SHC, the acceleration-building search is scheduled on just one of the members, but all members use the summary built that way. There is nothing you need to do beyond normal SHC functionality.
You can share DAS between two or more stand-alone SHs, two separate SH clusters, a stand-alone SH and an SHC, and so on.
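As a quick sanity check, any SHC member can run a summaries-only tstats search and confirm that it returns results from the shared summaries on the indexers. A minimal sketch; the Web data model name is just an example, substitute one of your own accelerated data models:

    | tstats summariesonly=true count from datamodel=Web

If this returns counts on every member, they are all reading the same summaries built by the single scheduled acceleration search.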
As @isoutamo mentioned, if you set up your SHC so that all data is sent to the indexers, then every search head in the cluster uses the same set of accelerated data model summaries stored on the indexers, so there is no duplication of accelerated files for the same data model. Splunk manages this automatically.
But if you want one designated search head to generate and store the acceleration summaries while other search heads reference that shared data instead of creating their own copies, you can use shared data model acceleration.
https://help.splunk.com/en/splunk-enterprise/manage-knowledge-objects/knowledge-management-manual/9....
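For that explicit-sharing setup, the consuming search head points its data model at the GUID of the instance (or cluster) that actually builds the summaries, via datamodels.conf. A rough sketch only; the stanza name Web and the GUID below are placeholders, and you should check the linked documentation for the exact requirements on your version:

    # datamodels.conf on the search head that should reuse existing summaries
    [Web]
    # GUID of the search head (or search head cluster) that builds and owns
    # the summaries; the value below is a placeholder, not a real GUID
    acceleration.source_guid = 11111111-2222-3333-4444-555555555555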
Regards,
Prewin
If this answer helped you, please consider marking it as the solution or giving karma. Thanks!
I am still confused. From my understanding, in an SHC, if I configure a member to share a datamodel from another member (same cluster), will the size of the datamodel summaries stored on the indexers also be reduced?
If you have configured your SHC correctly (all logs forwarded to the indexers, no local indexing on the search heads), then you already have this in place. If not, fix your SHC configuration.
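For reference, the usual way to meet that "forward all logs to the indexers" requirement is an outputs.conf on each search head along the lines of the standard best-practice config. This is only a sketch; the output group name and server addresses are placeholders for your own indexers:

    # outputs.conf on each SHC member: forward everything, index nothing locally
    [indexAndForward]
    index = false

    [tcpout]
    defaultGroup = primary_indexers
    forwardedindex.filter.disable = true
    indexAndForward = false

    [tcpout:primary_indexers]
    server = 10.10.10.1:9997, 10.10.10.2:9997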