Hello,
I have a simple requirement, but I'm new to Splunk and running into some challenges, so I'm hoping for some help!
My application writes HEARTBEAT messages to log files every 2 minutes, across multiple sources. I'm trying to create an alert that sends an email if no heartbeat message has been written in the last 5 minutes.
It may look simple, but I also need to know which sources are missing heartbeat messages.
I've tried the query below. It works, but it sometimes gives me incorrect results, so I'm looking for a better and simpler solution.
index = index1 earliest=-5m latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| fields source
| append [ search index = index1 earliest=-2w@w0 latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| stats dc(source) as source_list by source
| fields source
]
| rex field=_raw "HEARTBEAT for (?<APIName>.*).jar (?<Version>.*)"
| stats count as #heartbeats, latest(Version) as Versions by APIName, JVM
| eval Status=case(('#heartbeats' <= 1 OR isnull('#heartbeats')), "NOT RUNNING", '#heartbeats' > 1, "RUNNING")
| table APIName, Versions, Status
Appreciate the help! Thanks.
Hi @nnkreddy,
if you're confident that every source you want to monitor sent at least one event in the last 24 hours, you could run something like this:
index = index1 earliest=-24h latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| stats latest(_time) AS latest BY APIName, JVM
| where latest<now()-300
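The searches in this thread group by APIName, which the original question extracts with rex, so that extraction has to run before stats. A minimal sketch of the first option with the extraction included, assuming APIName and Version are not already extracted at search time: the rex pattern is the one from the question (with the dot before "jar" escaped), and grouping by source and the minutes_since field are assumptions added to show which source went silent. The final where keeps only rows whose last heartbeat is older than 5 minutes, which is the condition an alert would fire on.

```spl
index=index1 earliest=-24h latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| rex field=_raw "HEARTBEAT for (?<APIName>.*)\.jar (?<Version>.*)"
| stats latest(_time) AS latest, latest(Version) AS Version BY source, APIName
| eval minutes_since=round((now()-latest)/60, 1)
| where latest < now()-300
| table source, APIName, Version, minutes_since
```

If JVM is an already-extracted field in your data, it can be added to the BY clause; events where a BY field is missing are dropped by stats.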
If you're not sure that you received at least one event in the last 24 hours, you have to create a lookup (called e.g. perimeter.csv) containing all the APIName and JVM values to monitor, then you can run something like this:
index = index1 earliest=-5m latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| stats count BY APIName, JVM
| append [ | inputlookup perimeter.csv | eval count=0 | fields APIName JVM count ]
| stats sum(count) AS total BY APIName, JVM
| where total=0
The second search is lighter and faster to run and gives you more control, but you have to maintain the lookup.
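One possible way to seed the lookup is to build it from historical heartbeat events instead of typing it by hand. This is only a sketch under stated assumptions: the 30-day window is arbitrary, the rex pattern is the one from the original question, and JVM is assumed to be a field that is already extracted (as the searches in this thread imply). Review the resulting perimeter.csv before relying on it, because anything that was already silent for the whole window will be absent.

```spl
index=index1 earliest=-30d latest=now source IN (dev-*api.log) ("testapi" AND "HEARTBEAT")
| rex field=_raw "HEARTBEAT for (?<APIName>.*)\.jar (?<Version>.*)"
| stats count BY APIName, JVM
| fields APIName JVM
| outputlookup perimeter.csv
```

Running this as an occasional scheduled search would keep the lookup current, but it would also let decommissioned APIs age out only after the window passes, so manual edits may still be needed.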
Ciao.
Giuseppe
Hi @gcusello,
Option 1 is the smart solution without overcomplicating things, and it's working perfectly fine! Thanks for the help.