I have been trying to create an alert that triggers whenever the process ID (PID) of a process on Linux is null. If no PID data is arriving, I assume the process is not running; if it has a PID, it is running.
I'm working with Telegraf data:
| mstats latest(_value) AS value WHERE metric_name="procstat.pid" AND index="telegraf" AND process_name="<process_name>" fillnull_value=0 span=5m BY host, process_name
| timechart latest(value) span=5m BY host
| fillnull <hostnames> value=0 | table _time,<hostnames>
With the null values filled in as zero, I can pinpoint exactly when the processes are down. However, I couldn't find a way to alert when a host's PID value is null (or 0, after the fillnull).
Adding | where host=0 to the end of the query will filter the results to only those that are null/0. Then have the alert trigger if you get any results.
The problem I face now is that I only want to table the values that are 0, so that they appear in the alert notification, for example in the email sent the moment the alert triggers:
| mstats latest(_value) AS value WHERE metric_name="procstat.pid" AND index="telegraf" AND process_name="PSBRKDSP" span=5m BY host, process_name
| timechart latest(value) span=5m BY host
| fillnull host1,host2,host3,host4 value=0
| where host1=0 OR host2=0 OR host3=0 OR host4=0
| table _time,host1,host2,host3,host4
I don't know of a way to show only the fields that are zero.
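One possible approach, offered as a sketch rather than a tested solution: turn the per-host columns back into rows with untable, then filter on the value. This keeps one row per time bucket and host, so only the hosts that are at 0 appear in the results. The process name and host1 through host4 are taken from the query above; the field names host and value passed to untable are arbitrary names chosen for illustration.

```spl
| mstats latest(_value) AS value WHERE metric_name="procstat.pid" AND index="telegraf" AND process_name="PSBRKDSP" span=5m BY host, process_name
| timechart latest(value) span=5m BY host
| fillnull host1,host2,host3,host4 value=0
| untable _time host value
| where value=0
| table _time, host
```

With this shape, the alert can still trigger on "number of results greater than 0", and the table included in the email should list only the time buckets and hosts where no PID was reported.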