Splunk Search

Why are Splunk search jobs stuck at "finalizing job" status for a few days?

sdubey_splunk
Splunk Employee

Symptoms:

  • It usually happens within a couple of hours after we manually delete the stuck search jobs.

  • It only happens to particular searches.

  • Once it happens, the affected search jobs stay in the "finalizing" status until we manually delete them.

  • The job inspector ("Inspect Job") reports the job as 100% complete.
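To spot affected jobs without opening each one in the job inspector, the symptoms above can be checked against a job listing. The following is a minimal sketch, assuming the listing has already been fetched as JSON (for example from the search/jobs REST endpoint with output_mode=json) and that each entry carries `content.sid` and `content.dispatchState`. The function name and the sample data are hypothetical.

```python
from typing import Any


def stuck_finalizing_jobs(jobs_response: dict[str, Any]) -> list[str]:
    """Return the SIDs of jobs whose dispatchState is FINALIZING.

    Assumes the response shape {"entry": [{"content": {...}}, ...]}.
    """
    stuck = []
    for entry in jobs_response.get("entry", []):
        content = entry.get("content", {})
        if content.get("dispatchState") == "FINALIZING":
            stuck.append(content.get("sid", ""))
    return stuck


# Hypothetical sample data, not taken from a real deployment.
sample = {
    "entry": [
        {"content": {"sid": "1700000000.12", "dispatchState": "FINALIZING"}},
        {"content": {"sid": "1700000050.13", "dispatchState": "DONE"}},
    ]
}

print(stuck_finalizing_jobs(sample))
```

Running this on a schedule and comparing the returned SIDs between runs would show which jobs stay in that state for an unusually long time.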

sdubey_splunk
Splunk Employee

Solution:
Increase max_chunk_queue_size in limits.conf on the search heads. That reduces the need to pause the search result collation queues, which makes hitting the issue less likely.

limits.conf: we updated the parameters below to fix the issue.

[search]
result_queue_max_size = 200000000
max_chunk_queue_size = 5000000
fetch_remote_search_log = false
remote_timeline_fetchall = false

Details about the above parameters:

result_queue_max_size =
* The maximum size, in MB, that will be kept from peers for processing on
the search head before throttling the rate that data is accepted.
* The “results_queue_min_size” value takes precedence. The number of search
results chunks specified by “results_queue_min_size” will always be
retained in the queue even if the combined size in MB exceeds the
“result_queue_max_size” value.
* Default: 100

max_chunk_queue_size =
* The maximum size of the chunk queue.
* Default: 10000000

After we updated these parameters, the search result collation queues needed to be paused less often, and this fixed our issue.
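To confirm that a search head's limits.conf actually contains the values above, a small check like the following may help. It is only a sketch: it parses a single file's text with Python's standard configparser, so it does not account for Splunk's layered configuration precedence (the btool command, e.g. `splunk btool limits list search --debug`, shows the merged result). The function name and the sample text are hypothetical.

```python
import configparser
from typing import Optional

# Values taken from the [search] stanza shown above.
EXPECTED = {
    "result_queue_max_size": "200000000",
    "max_chunk_queue_size": "5000000",
    "fetch_remote_search_log": "false",
    "remote_timeline_fetchall": "false",
}


def mismatched_settings(conf_text: str) -> dict[str, Optional[str]]:
    """Return settings that differ from EXPECTED (None means not set)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    found = dict(parser["search"]) if parser.has_section("search") else {}
    return {
        key: found.get(key)
        for key, value in EXPECTED.items()
        if found.get(key) != value
    }


# Hypothetical file content with one wrong value and one missing setting.
sample_conf = """
[search]
result_queue_max_size = 200000000
max_chunk_queue_size = 10000000
fetch_remote_search_log = false
"""

print(mismatched_settings(sample_conf))
```

An empty result means the file already matches the settings from this answer.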

gjanders
SplunkTrust

Great question and answer. Please accept your answer unless you're waiting for alternative answers!
