Alerting

Search Head Clustering: Why am I getting a duplicate alert for one scheduled job?

BP9906
Builder

I'm receiving duplicate alerts for one scheduled job; it appears to be scheduled twice.
I see a delegated_remote_error and then the job is delegated twice. Why is this happening?

05-20-2015 04:00:27.427 -0700 INFO  SavedSplunker - user="me", app="myapp", savedsearch_name="My Report", status=delegated_remote_error, scheduled_time=1432119600, peer_guid=ADDF764A-CCDF-478E-BF74-EE80F3A905EC
05-20-2015 04:00:31.015 -0700 INFO  SavedSplunker - user="me", app="myapp", savedsearch_name="My Report", status=delegated_remote, scheduled_time=1432119600, peer_guid=C0F9897B-3E7D-407E-AF7C-573164F3EC04, sid = scheduler__me_UFZPLU9uc2FsZQ__RMD5d867250ff1e04aea_at_1432119600_54459_C0F9897B-3E7D-407E-AF7C-573164F3EC04
05-20-2015 04:00:55.352 -0700 INFO  SavedSplunker - user="me", app="myapp", savedsearch_name="My Report", status=delegated_remote, scheduled_time=1432119600, peer_guid=ADDF764A-CCDF-478E-BF74-EE80F3A905EC, sid = scheduler__me_UFZPLU9uc2FsZQ__RMD5d867250ff1e04aea_at_1432119600_62152_ADDF764A-CCDF-478E-BF74-EE80F3A905EC
05-20-2015 05:11:11.858 -0700 INFO  SavedSplunker - user="me", app="myapp", savedsearch_name="My Report", status=delegated_remote_completion, scheduled_time=1432119600, peer_guid=C0F9897B-3E7D-407E-AF7C-573164F3EC04, sid = scheduler__me_UFZPLU9uc2FsZQ__RMD5d867250ff1e04aea_at_1432119600_54459_C0F9897B-3E7D-407E-AF7C-573164F3EC04 success=1
05-20-2015 05:11:11.878 -0700 INFO  SavedSplunker - user="me", app="myapp", savedsearch_name="My Report", status=delegated_remote_completion, scheduled_time=1432119600, peer_guid=ADDF764A-CCDF-478E-BF74-EE80F3A905EC, sid = scheduler__me_UFZPLU9uc2FsZQ__RMD5d867250ff1e04aea_at_1432119600_62152_ADDF764A-CCDF-478E-BF74-EE80F3A905EC success=1
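
For reference, the duplicate delegation can be confirmed from these events with a search along these lines (a sketch, assuming the scheduler entries above are indexed in _internal with the default sourcetype=scheduler):

index=_internal sourcetype=scheduler savedsearch_name="My Report" status=delegated_remote
| stats dc(peer_guid) AS delegated_peers values(peer_guid) AS peer_guids BY savedsearch_name scheduled_time
| where delegated_peers > 1

Any row returned means the same scheduled run was handed to more than one cluster member.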

dmr195
Communicator

The sequence seems to be that the captain delegates the search to one search head, but suffers the error indicated by "status=delegated_remote_error" in the first log message in the original post. Then, understandably, it delegates the search to a different search head. But the first search head also runs the search.

I have seen the same problem. When I looked at splunkd.log on the search head for which the captain logged "status=delegated_remote_error", I saw these errors:

06-29-2016 10:04:01.014 -0400 WARN  ISplunkDispatch - sid:scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 Gave up waiting for the captain to establish a common bundle version across all search peers; using most recent bundles on all peers instead

06-29-2016 10:04:55.643 -0400 WARN  SHPMasterHTTPProxy - Low Level http request failure err=failed method=POST path=/services/shcluster/captain/jobs/scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6/report_job_completion captain=xsplunkm3d:8089 rc=0 actual_response_code=500 expected_response_code=200 status_line=Internal Server Error error="<response>\n  <messages>\n    <msg type="ERROR">\n In handler 'shclustercaptainjobs': job completion failed peer=6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 sid=scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 (reason: event=SHPMaster::handleDelegatedJobCompletion peer=6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 job=scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 did not exist in map)</msg>\n  </messages>\n</response>\n"

06-29-2016 10:04:55.643 -0400 WARN  SHPSlave - event=SHPSlave::delegatedJobCompletion Failed to notify captain of job completion job=scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6. reason=failed method=POST path=/services/shcluster/captain/jobs/scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6/report_job_completion captain=xsplunkm3d:8089 rc=0 actual_response_code=500 expected_response_code=200 status_line=Internal Server Error error="<response>\n  <messages>\n    <msg type="ERROR">\n In handler 'shclustercaptainjobs': job completion failed peer=6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 sid=scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 (reason: event=SHPMaster::handleDelegatedJobCompletion peer=6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 job=scheduler__nobody__prelert__RMD57403f7e9934e6f78_at_1467208920_3438_6408AC6F-431C-41D6-B4EE-7BE2A978D5D6 did not exist in map)</msg>\n  </messages>\n</response>\n"

Does this give anyone who knows the inner workings of search head clustering a clue as to what's going wrong?
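
For what it's worth, to see which cluster members are hitting these failed report_job_completion calls, a search along these lines over the splunkd events should work (a sketch, using the component names from the warnings above):

index=_internal sourcetype=splunkd log_level=WARN (component=SHPMasterHTTPProxy OR component=SHPSlave) report_job_completion
| stats count BY host component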


the_wolverine
Champion

dmr195
Communicator

Not in all cases: I have seen this problem with Splunk 6.3.3, and the bug you refer to is claimed to be fixed in 6.3.0.


ben_leung
Builder

Can anyone explain what this means?


ben_leung
Builder

From my understanding of the logs, the scheduled search "My Report" was run twice on two different nodes of the search head cluster. Can someone confirm this?


BP9906
Builder

Yes, but I believe I found the cause: I had two saved searches with the same name executing at the same time, one private and one at the app level. It hasn't happened since I reported this.
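
In case it helps anyone else, a search along these lines should list scheduled searches that exist more than once under the same name (a sketch using the rest command; the role running it needs read access across app and user namespaces):

| rest /servicesNS/-/-/saved/searches splunk_server=local
| search is_scheduled=1
| stats count values(eai:acl.sharing) AS sharing values(eai:acl.app) AS app BY title
| where count > 1

A count greater than 1, for example one private copy and one app-level copy on the same schedule, will produce what looks like a duplicate alert.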
