Splunk Search

What search can I use to find out whether a service was restarted within a certain period of time?

7ryota
Explorer

Hi all,

I have these logs, and I would like to know whether an agent was restarted within a certain period after it stopped.

 

index=unix sourcetype=syslog

centrifyEventID=17000    Centrify agent (adclient) started
centrifyEventID=17002    Centrify agent (adclient) stopped   

 

Can you help me construct a query that checks whether the agent was started within 10 minutes after it stopped?


woodcock
Esteemed Legend

I am assuming that you asked this the wrong way around and actually want to know when there was a stop WITHOUT a start within 10 minutes. If so:

index="unix" AND sourcetype="syslog" AND centrifyEventID IN("17000", "17002")
| reverse 
| rename COMMENT AS "Each stop (17002) opens a new session, so a session is a stop plus any starts that follow it"
| streamstats count(eval(centrifyEventID="17002")) AS sessionID BY host 
| stats min(_time) AS _time range(_time) AS duration values(centrifyEventID) AS values dc(centrifyEventID) AS dc count BY sessionID host 
| rename COMMENT AS "Above is the setup, I *might* not have the logic exactly right below"
| where (dc==1 AND values=="17002" AND (now() - _time) > (10 * 60))
OR (dc==2 AND duration > (10 * 60))
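
To answer the question as literally asked (was the agent started within 10 minutes after a stop?), one option is to pair each start event with the event that preceded it on the same host. This is only a sketch: it assumes centrifyEventID is extracted as a field and that each host logs its stops and starts in time order. The field names prev_event, prev_time, downtime_sec and restarted_within_10m are made up for illustration.

```spl
index="unix" sourcetype="syslog" centrifyEventID IN ("17000", "17002")
| rename COMMENT AS "Sort oldest-first per host so the previous event is the one just before in time"
| sort 0 host _time
| rename COMMENT AS "current=f window=1 copies the previous event's values onto the current event"
| streamstats current=f window=1 last(centrifyEventID) AS prev_event last(_time) AS prev_time BY host
| rename COMMENT AS "Keep only starts that directly follow a stop"
| where centrifyEventID="17000" AND prev_event="17002"
| eval downtime_sec = _time - prev_time
| eval restarted_within_10m = if(downtime_sec <= 600, "yes", "no")
| table host prev_time _time downtime_sec restarted_within_10m
```

Each result row is one stop/start pair, so filtering on restarted_within_10m="no" should list the slow restarts. A stop that has not been followed by any start yet produces no row here.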

7ryota
Explorer

Hi,

Thanks for the fast reply.

How do I construct a query that finds each unique host whose agent stopped and checks whether it restarted within 10 minutes?


gcusello
SplunkTrust

Hi @7ryota,

If you have multiple hosts, you only have to add host as a grouping key in the stats command:

index=your_index (centrifyEventID=17000 OR centrifyEventID=17002) earliest=-10m@m latest=now
| stats dc(centrifyEventID) AS centrifyEventID_count values(centrifyEventID) AS centrifyEventID BY host
| where centrifyEventID_count=1 AND centrifyEventID=17002

Ciao.

Giuseppe


gcusello
SplunkTrust

Hi @7ryota,

The centrifyEventID field should be extracted automatically by Splunk, so you shouldn't need to extract it yourself. If that isn't the case, please tell me and I'll add a regex extraction.
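
If the field is not extracted automatically, a search-time extraction with rex is one way to get it. This is a sketch that assumes the raw event contains the literal text centrifyEventID=<digits>, as in the sample lines in the question; your_index is a placeholder.

```spl
index=your_index sourcetype=syslog
| rex field=_raw "centrifyEventID=(?<centrifyEventID>\d+)"
| search centrifyEventID=17000 OR centrifyEventID=17002
```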

Anyway, you have to run an alert using a search like the following:

index=your_index (centrifyEventID=17000 OR centrifyEventID=17002) earliest=-10m@m latest=now
| stats dc(centrifyEventID) AS centrifyEventID_count values(centrifyEventID) AS centrifyEventID
| where centrifyEventID_count=1 AND centrifyEventID=17002

In this way, if you get results, it means that the service was stopped and no start event has been logged in the last ten minutes.
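
A possible variation, offered as a sketch rather than a tested alert: because the search above only looks at the last ten minutes, it can also fire for an agent that stopped a minute ago, and it no longer sees a stop that happened eleven minutes ago. Looking back further and keeping only the most recent event per host is one way around that. The -24h window and the field names last_event and last_time are assumptions.

```spl
index=your_index (centrifyEventID=17000 OR centrifyEventID=17002) earliest=-24h latest=now
| stats latest(centrifyEventID) AS last_event latest(_time) AS last_time BY host
| where last_event=17002 AND (now() - last_time) > 600
| eval stopped_at = strftime(last_time, "%Y-%m-%d %H:%M:%S")
| table host stopped_at
```

Each row is a host whose most recent event is a stop that is more than ten minutes old, that is, a host that has not restarted in time.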

Ciao.

Giuseppe

 
