Alerting

Alert randomly stops working

mscomms
Path Finder

Hi All,

I am seeing a strange issue where occasionally one of my alerts stops working (not always the same one). While this is happening I can see the searches running, but the alert never triggers, even though running the search manually finds the events.

I have tweaked the searches to make sure I am not falling foul of the _indextime vs _time issue caused by alerts arriving outside the search window.

It appears that the search just stops triggering and it starts again when I Disable/Enable the search.

Anyone else seeing this or have any ideas?

0 Karma

mscomms
Path Finder

I see nothing relating to this search in the audit.log.

0 Karma

mscomms
Path Finder

As I said above, we are seeing this issue with both Real-Time and Scheduled alerts. It is happening right now with a Real-Time search in my test environment.

I am firing test events into an existing index and can see them arriving. If I run the alert's search manually it shows the event, but the action isn't triggering, and when I run

index="_internal" sourcetype!=splunkd_remote_searches savedsearch savedsearch_name=NOC_Alcatel result_count=1

I see hits from my first few test alerts but nothing from after it stopped responding.

Yes, we are running a 3-node cluster.

I'm not seeing any issues in the DMC; the skip ratio over the last 4 hours is 0%.

I am seeing these every minute:

08-26-2021 10:59:01.230 +0100 INFO SHCMaster - delegate search job requested for savedsearch_name="NOC_Alcatel"
08-26-2021 10:59:01.230 +0100 INFO SHCMaster - realtime search savedsearch_name=NOC_Alcatel selector=nobody;noc;NOC_Alcatel sid=rt_scheduler__admin__noc__RMD5aff146651f76f3cb_at_1629386520_7_5E69A627-7FCE-4CAE-A9C4-E72E65CBF04A is either already running or being dispatched. Ignoring request.

However, these messages go back as far as the logs do, so I don't think they have anything to do with this issue.

0 Karma

mscomms
Path Finder

We are seeing the same issue with both, but we are in the process of migrating from Real-Time to Scheduled, as Scheduled gives faster results.

0 Karma

burwell
SplunkTrust

Hi, under alert type do you have Scheduled or Real-time?

Also, not sure if you have a search head cluster and a monitoring console/DMC. That might show you skipped searches etc.

Finally, there are records for the saved search in the audit log. These might help you see whether the scheduler believes the jobs were completed.
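
As a rough sketch of those two checks, something like the searches below might be a starting point. They assume the default _internal and _audit indexes are searchable by your role, and they reuse the NOC_Alcatel name from this thread. The field names (status, total_run_time, event_count) are what I would expect in scheduler and audit events, so adjust them if your version logs them differently. The first search charts the scheduler's own view of the saved search by status, which should make skipped or missing runs stand out:

```
index=_internal sourcetype=scheduler savedsearch_name="NOC_Alcatel"
| timechart span=10m count by status
```

The second lists the completed search jobs recorded in the audit log for the same saved search:

```
index=_audit action=search info=completed savedsearch_name="NOC_Alcatel"
| table _time, search_id, total_run_time, event_count
```

If the first search shows runs continuing but the second shows nothing after the alert went quiet, that would suggest the jobs are being dispatched but not finishing.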

0 Karma