
Query on Splunk DLTK scalability in training and inference

indranilr
Observer

The Splunk DLTK 5.1.0 documentation states the following:

No indexer distribution: Data is processed on the search head and sent to the container environment. Data cannot be processed in a distributed manner, such as streaming data in parallel from indexers to one or many containers. However, all advantages of search in a distributed Splunk platform deployment still exist.
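
My understanding of that data flow: in the DLTK notebook templates, the search results arrive in the container as a single pandas DataFrame passed into the notebook's init()/fit() functions, so the container sees one consolidated dataset rather than per-indexer partitions. A minimal sketch of how I read the barebone notebook template (exact signatures may differ between DLTK versions):

    import pandas as pd

    def init(df, param):
        # df: the full search result, already collected on the search head
        # and shipped to this container as one DataFrame
        model = {}
        return model

    def fit(model, df, param):
        # All rows are available in memory here; nothing in this function
        # is aware of indexer-side partitions.
        print(f"received {len(df)} rows, {len(df.columns)} columns")
        # ... training logic would go here ...
        return {"message": "model trained"}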


Does the above imply that data from Splunk is not distributed (for example, via data parallelism) across multiple containers in the Kubernetes execution environment during the training or inference phase?

Further, is scaling only vertical in nature (multi-CPU or multi-GPU within a single container), or can jobs also scale horizontally (multiple containers), with each container working on a partition of the data?
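
To make the vertical case concrete, here is the kind of thing I have in mind: partitioning the single DataFrame the container receives across local Dask workers, which scales only across the CPUs of that one container. This is just a sketch assuming Dask is available in the container image; the column name feature_x is made up for illustration:

    import dask.dataframe as dd
    from dask.distributed import Client, LocalCluster

    def fit(model, df, param):
        # Local cluster: parallelism is confined to this container's CPUs,
        # i.e. vertical scaling only.
        cluster = LocalCluster(n_workers=4, threads_per_worker=1)
        client = Client(cluster)

        # Partition the single DataFrame received from Splunk.
        ddf = dd.from_pandas(df, npartitions=4)

        # Example parallel aggregation across partitions.
        result = ddf["feature_x"].mean().compute()  # 'feature_x' is hypothetical

        client.close()
        cluster.close()
        return {"feature_x_mean": float(result)}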

Finally, to execute TensorFlow, PyTorch, Spark, or Dask jobs, do the required operators/services (the Spark Kubernetes operator, for example) need to be pre-installed before submitting jobs from the Splunk Jupyter notebook? Or are these services set up during DLTK app installation and configuration in Splunk?
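
If these services do have to pre-exist, I assume the notebook side would simply connect to an already running scheduler, along these lines (the service address dask-scheduler:8786 is hypothetical and would only resolve if the scheduler service were already deployed in the Kubernetes environment):

    from dask.distributed import Client

    # Hypothetical in-cluster service address; this only works if a Dask
    # scheduler service has been deployed beforehand.
    client = Client("tcp://dask-scheduler:8786")
    print(client)  # reports workers only if the cluster is actually up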

I'd appreciate any input on the above questions.

Thanks in advance!
