Splunk Search

How to detect a message rate drop-off using real-time queries

gregbujak
Path Finder

I am monitoring periodic heartbeat messages and would like to detect when those heartbeats stop.

ex.

  • t0: 12/17/2010 10:44:30 heartbeat=systemX
  • t0: 12/17/2010 10:44:30 heartbeat=systemY
  • t1: 12/17/2010 10:44:35 heartbeat=systemX
  • t1: 12/17/2010 10:44:35 heartbeat=systemY
  • t2: 12/17/2010 10:44:40 heartbeat=systemX
  • t2: 12/17/2010 10:44:40 heartbeat=systemY
  • t3: 12/17/2010 10:44:45 heartbeat=systemX
  • t4: 12/17/2010 10:44:50 heartbeat=systemX
  • t5: 12/17/2010 10:44:55 heartbeat=systemX

Fact list:

  1. 2 systems are up and running
  2. At t1, the count for each system is greater than 0 over the last 10-second interval - all good
  3. At t5, the count for systemY = 0 over the last 10-second interval.

I would like to publish a message saying count=0 for systemY, meaning it has been absent for the last 10-second sampling interval.

I know how to sample the count over a 10-second interval, but the problem is that when the count = 0 there are no events to work with. So it needs to be correlated with an outer query based on the heartbeat field. Any help would be appreciated.
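For reference, here is roughly how I sample the count per 10-second interval today (a sketch only - the sourcetype "heartbeatlog" and an already-extracted heartbeat field are assumptions):

  sourcetype="heartbeatlog"
  | bin _time span=10s
  | stats count BY _time, heartbeat

Once systemY stops sending, it simply has no rows in this output, so there is nothing to alert on.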


TheGU
Path Finder

Assuming you have already extracted heartbeat=* into a field named "heartbeat", try the following with the time range set to a real-time 15-second window:

sourcetype="heartbeatlog" | stats count by heartbeat | where count < 2

gregbujak
Path Finder

Thanks for the suggestion. The problem is that I need the event to be emitted when the count = 0. With the above solution, an event is emitted only while the count is 1: once the count reaches 0, there are no events left for that heartbeat type, the result disappears, and the user is left thinking the heartbeat drop-off has ended. I think it needs to be correlated against a greater time span in which the event does exist.
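Something like the following is what I have in mind (a sketch only - the sourcetype, the 15-minute lookback, and the 10-second threshold are placeholders), run as a scheduled search over a wider window so that older systemY events are still there to anchor the correlation:

  sourcetype="heartbeatlog" earliest=-15m
  | stats latest(_time) AS last_seen BY heartbeat
  | eval seconds_silent = now() - last_seen
  | where seconds_silent > 10
  | eval count = 0

Because the wider window still contains systemY's older events, systemY keeps a row in the results, and the where clause keeps flagging it for as long as it stays silent, instead of dropping it the moment the count reaches 0.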
