All,
I have the PS input from Splunk for Unix enabled on all endpoints. It seems there should be an easy way to compare the running status of a process from 15 minutes ago with its status now and get a list of machines where the app has stopped.
Thanks,
-Daniel
Edit: Here is what I came up with, but I figure there should be a better way to do this:
index=os "auditbeat-god" sourcetype=ps earliest=-60m@m latest=-30m@m
| fields host
| dedup host
| table host
| append [ search
index=os "auditbeat-god" sourcetype=ps earliest=-30m@m latest=now
| fields host
| dedup host
| table host
]
| stats count by host
| where count < 2
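Just search over the last hour and use timechart. A host whose count drops to 0 in the most recent 30-minute bucket is no longer reporting the process: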
index=os "auditbeat-god" sourcetype=ps | timechart span=30m count by host
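If a plain list of hosts is wanted rather than a chart, one possible variation is to keep the most recent event time per host and filter on it. This is only a sketch under a few assumptions: the ps input reports at least once every 30 minutes on each host, the literal string "auditbeat-god" matches only the process of interest, and the 30-minute threshold and the `last_seen` field name are illustrative choices, not anything taken from the original search.

```
index=os "auditbeat-god" sourcetype=ps earliest=-60m@m latest=now
| stats max(_time) AS last_seen by host
| where last_seen < relative_time(now(), "-30m@m")
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
| table host last_seen
```

This returns only hosts that reported the process earlier in the hour but not in the last 30 minutes, so a host where the process started recently is not flagged. Note also that timechart with a by clause shows only the top 10 series by default, so on a large fleet a stats-based list like this may be easier to read.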