Alert randomly stops working

mscomms
Path Finder

Hi All,

I am seeing a strange issue where occasionally one of my alerts stops working (not always the same one). While this is happening I can see the searches running, but the alert never triggers, even though running the search manually finds the events.

I have tweaked the searches to make sure I am not falling foul of the _indextime vs _time issue, where events arrive outside the search window.
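For readers unfamiliar with the _indextime vs _time issue, here is a minimal Python sketch of it. It is a toy model, not Splunk's scheduler: `Event`, `matched_by_time` and `matched_by_index_time` are hypothetical names, and the 60-second windows are made up for illustration. An event stamped inside one window but indexed after that window's search has run is missed by every run that filters on event time, while a filter on index time still picks it up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_time: int  # stand-in for _time (seconds), hypothetical field name
    index_time: int  # stand-in for _indextime (seconds), hypothetical field name


def matched_by_time(events, earliest, latest, run_at):
    """Events a run at `run_at` finds when the window filters on event_time."""
    return [e for e in events
            if earliest <= e.event_time < latest and e.index_time <= run_at]


def matched_by_index_time(events, earliest, latest, run_at):
    """Events a run at `run_at` finds when the window filters on index_time."""
    return [e for e in events
            if earliest <= e.index_time < latest and e.index_time <= run_at]


# A late arrival: stamped at t=50 but only indexed at t=70.
late = Event(event_time=50, index_time=70)

# Run 1 at t=60 covers [0, 60): the event is not indexed yet.
run1 = matched_by_time([late], 0, 60, run_at=60)
# Run 2 at t=120 covers [60, 120): the event's timestamp is before the window.
run2 = matched_by_time([late], 60, 120, run_at=120)
# The same second run, filtering on index time instead, finds it.
run2_by_index = matched_by_index_time([late], 60, 120, run_at=120)
```

In this model the late event never appears in `run1` or `run2`, which is the failure mode the searches were tweaked to avoid.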

It appears that the search just stops triggering and it starts again when I Disable/Enable the search.

Anyone else seeing this or have any ideas?

mscomms
Path Finder

I see nothing relating to this search in the audit.log.

mscomms
Path Finder

As I said above, we are seeing this issue with both Real-Time and Scheduled alerts. It is happening right now with a Real-Time search in my test environment.

I am firing test events into an existing index and can see them arriving. If I run the alert's search manually it shows the event, but the action isn't triggering, and when I run:

index="_internal" sourcetype!=splunkd_remote_searches savedsearch savedsearch_name=NOC_Alcatel result_count=1

I see hits from my first few test alerts but nothing from after it stopped responding.

Yes, we are running a 3-node cluster.

I'm not seeing any issues in the DMC; the skip ratio over the last 4 hours is 0%.

I am seeing these every minute:

08-26-2021 10:59:01.230 +0100 INFO SHCMaster - delegate search job requested for savedsearch_name="NOC_Alcatel"
08-26-2021 10:59:01.230 +0100 INFO SHCMaster - realtime search savedsearch_name=NOC_Alcatel selector=nobody;noc;NOC_Alcatel sid=rt_scheduler__admin__noc__RMD5aff146651f76f3cb_at_1629386520_7_5E69A627-7FCE-4CAE-A9C4-E72E65CBF04A is either already running or being dispatched. Ignoring request.

However, I am also seeing these going back forever, so I don't think they have anything to do with this issue.
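To check whether the "Ignoring request" messages always refer to the same real-time job, the two log lines above could be parsed and grouped by sid. The following is a rough Python sketch: the regular expressions are inferred only from the two sample lines quoted in this post, so the line layout is an assumption rather than a documented splunkd format, and `parse_line` is a hypothetical helper.

```python
import re
from collections import Counter

# Layout inferred from the two sample lines only (an assumption).
LINE_RE = re.compile(
    r'^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) '
    r'(?P<level>\w+) (?P<component>\w+) - (?P<message>.*)$'
)
NAME_RE = re.compile(r'savedsearch_name="?(?P<name>[^"\s]+)"?')
SID_RE = re.compile(r'\bsid=(?P<sid>\S+)')


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    rec = m.groupdict()
    name = NAME_RE.search(rec["message"])
    sid = SID_RE.search(rec["message"])
    rec["savedsearch_name"] = name.group("name") if name else None
    rec["sid"] = sid.group("sid") if sid else None
    rec["ignored"] = "Ignoring request" in rec["message"]
    return rec


SAMPLE = [
    '08-26-2021 10:59:01.230 +0100 INFO SHCMaster - delegate search job '
    'requested for savedsearch_name="NOC_Alcatel"',
    '08-26-2021 10:59:01.230 +0100 INFO SHCMaster - realtime search '
    'savedsearch_name=NOC_Alcatel selector=nobody;noc;NOC_Alcatel '
    'sid=rt_scheduler__admin__noc__RMD5aff146651f76f3cb_at_1629386520_7_'
    '5E69A627-7FCE-4CAE-A9C4-E72E65CBF04A is either already running or '
    'being dispatched. Ignoring request.',
]

records = [r for r in (parse_line(l) for l in SAMPLE) if r is not None]

# Count ignored dispatch requests per (saved search, sid) pair.
ignored_by_sid = Counter(
    (r["savedsearch_name"], r["sid"]) for r in records if r["ignored"]
)
```

Fed a longer extract of the log, a single sid dominating `ignored_by_sid` would suggest the cluster keeps treating one long-lived real-time job as still running. That reading is a guess from the message text, not something confirmed in this thread.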


mscomms
Path Finder

We are seeing the same issue with both, but we are in the process of migrating from Real-Time to Scheduled, as Scheduled gives faster results.

burwell
SplunkTrust

Hi, under alert type do you have Scheduled or Real-time?

Also, I'm not sure whether you have a search head cluster and a monitoring console (DMC). That might show you skipped searches, etc.

Finally, there are records for the saved search in the audit log. These might help you see whether the scheduler believes the jobs were completed.
