RT search creating extremely large (20 GB+) dispatchtmp artifacts on the local file system despite SH_POOL being used

the_wolverine
Champion

Our SH_POOL is set up correctly: it works, and Splunk activity populates it as expected. However, when I launch the Distributed Indexing Performance view of the SoS app (version 3.1.0), the artifact in LOCAL/var/run/splunk/dispatchtmp/ grows extremely large and causes this SH to run out of disk space.

Why is anything being written to local? It should be written to SH_POOL.

1 Solution

hexx
Splunk Employee

Hi there, Tina!

Why is anything being written to local in a POOLING scenario?

This is a search optimization. When a search operator needs a temporary on-disk backing store, it uses dispatchtmp, which is always on the local disk, rather than the dispatch directory, which is on NFS in a search-head pool.

We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the RT nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user let the panel keep running.

Indeed, this is due to the nature of this search, which attempts to assess the indexing latency and throughput rate of all incoming data. This means that we have to do a couple of things that can be very expensive in large-scale deployments:

  • Search all the data with "index=* OR index=_*"
  • Use an open-ended real-time window without any constraints on _time, with the "real-time (all time)" time range

That is precisely why we modified this view so that it no longer runs this search on load. Instead, it warns the user about the risk of running the search for a long time and runs it only when the user explicitly asks for it.
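If you want to see which artifacts are eating local disk, a small script can total the size of each entry under dispatchtmp. This is only a rough sketch, not anything shipped with Splunk or SoS: the function names, the 1 GiB threshold, and the /opt/splunk fallback for SPLUNK_HOME are assumptions for illustration. Only the var/run/splunk/dispatchtmp path comes from this thread.

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping files that vanish mid-scan."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Search artifacts can be cleaned up while we are walking them.
                pass
    return total


def large_artifacts(dispatchtmp, threshold_bytes):
    """Return (name, size) for each entry directly under dispatchtmp whose
    size is at least threshold_bytes, largest first."""
    results = []
    for entry in Path(dispatchtmp).iterdir():
        size = dir_size_bytes(entry) if entry.is_dir() else entry.lstat().st_size
        if size >= threshold_bytes:
            results.append((entry.name, size))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: SPLUNK_HOME is set; /opt/splunk is only a guessed fallback.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    tmp_dir = os.path.join(splunk_home, "var", "run", "splunk", "dispatchtmp")
    if os.path.isdir(tmp_dir):
        # Hypothetical threshold: report anything of 1 GiB or more.
        for name, size in large_artifacts(tmp_dir, 1024 ** 3):
            print(f"{size / 1024 ** 3:8.2f} GiB  {name}")
```

Running it periodically (from cron, for example) while the panel is open should show whether one artifact keeps growing.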

the_wolverine
Champion

Thanks, Octavio! I will use your response to convince the team that we need to allocate more storage to our SHs.

the_wolverine
Champion
  1. Why is anything being written to local in a POOLING scenario?

  2. We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the RT nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user let the panel keep running.

In version 3.2, the RT search was removed and a Run button was added, along with the following disclaimer:

Caution: This search can be resource intensive and should not run indefinitely. Use the search controls on the right to cancel, pause, or finalize the search.
