
Why are Splunk search jobs stuck at "finalizing job" status for a few days?

sdubey_splunk
Splunk Employee

Symptoms:

  • It usually happens within a couple of hours after we manually delete the stuck search jobs.

  • It only happens to particular searches.

  • Once it happens, how long does the affected search stay stuck in the "finalizing" status?

  • The affected search jobs stay stuck in the "finalizing" status until we manually delete them.

  • The job shows as 100% complete in the job inspector (when viewed via "Inspect Job").

sdubey_splunk
Splunk Employee

Solution:
1. Increase max_chunk_queue_size in limits.conf on the search heads.
That reduces the need to pause the search result collation queues, which makes hitting the issue less likely.

limits.conf: we updated the parameters below to fix the issue.

[search]
result_queue_max_size = 200000000
max_chunk_queue_size = 5000000
fetch_remote_search_log = false
remote_timeline_fetchall = false

Details about the above parameters:

result_queue_max_size =
* The maximum size, in MB, that will be kept from peers for processing on
the search head before throttling the rate that data is accepted.
* The “results_queue_min_size” value takes precedence. The number of search
results chunks specified by “results_queue_min_size” will always be
retained in the queue even if the combined size in MB exceeds the
“result_queue_max_size” value.
* Default: 100
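
The precedence rule above can be read as a small decision function. The following is only a toy model of the documented wording, not Splunk's actual implementation; the function name `should_throttle` and its arguments are hypothetical and exist only for illustration.

```python
def should_throttle(chunk_sizes_mb, max_size_mb, min_chunks):
    """Toy model of the documented precedence rule (an assumption, not Splunk code).

    chunk_sizes_mb: sizes of the chunks currently queued, in MB
    max_size_mb:    stands in for result_queue_max_size
    min_chunks:     stands in for results_queue_min_size
    """
    # The minimum chunk count takes precedence: up to min_chunks chunks
    # are always kept, even if their combined size exceeds max_size_mb.
    if len(chunk_sizes_mb) <= min_chunks:
        return False
    # Above the minimum count, throttle once the combined size exceeds the cap.
    return sum(chunk_sizes_mb) > max_size_mb


# Two 60 MB chunks exceed a 100 MB cap, but a minimum of 5 chunks wins.
print(should_throttle([60, 60], max_size_mb=100, min_chunks=5))
# With a minimum of 1 chunk, the same queue would be throttled.
print(should_throttle([60, 60], max_size_mb=100, min_chunks=1))
```

Under this reading, raising result_queue_max_size only matters once the queue holds more chunks than results_queue_min_size.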

max_chunk_queue_size =
* The maximum size of the chunk queue.
* Default: 10000000

Updating the above parameters reduced the need to pause the search result collation queues, which makes hitting the issue less likely, and this fixed our issue.
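
To confirm that a given limits.conf file actually contains these values, a quick check along the following lines may help. It is a minimal sketch, assuming the file is simple enough for Python's `configparser` to read (Splunk .conf syntax is not guaranteed to be fully INI-compatible); the helper name `check_search_stanza` is hypothetical, and the expected values are the ones from this post.

```python
import configparser

# Values taken from the post above; adjust to whatever you deployed.
EXPECTED = {
    "result_queue_max_size": "200000000",
    "max_chunk_queue_size": "5000000",
    "fetch_remote_search_log": "false",
    "remote_timeline_fetchall": "false",
}


def check_search_stanza(conf_text, expected=EXPECTED):
    """Return {setting: (expected, found)} for each setting that differs."""
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    found = dict(parser["search"]) if parser.has_section("search") else {}
    return {
        key: (value, found.get(key))
        for key, value in expected.items()
        if found.get(key) != value
    }


sample = """
[search]
result_queue_max_size = 200000000
max_chunk_queue_size = 5000000
fetch_remote_search_log = false
remote_timeline_fetchall = false
"""

mismatches = check_search_stanza(sample)
print(mismatches or "all settings match")
```

This only inspects one file. Splunk merges settings from several locations, so the effective values on a search head are better confirmed with `splunk btool limits list search --debug`.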

gjanders
SplunkTrust

Great question and answer. Please accept your answer unless you're waiting for alternative answers!
