Hello, I have programs which write status events to Splunk. At the beginning they write EVENT=START and at the end, they write EVENT=END, both with a matching UID. I have created an alert which monitors for a START event without a corresponding END event, in order to find when a program may terminate abruptly. The alert is:
index=indxtst
| table _time source EVENT_TYPE EVENT_SUBTYPE UID EVENT
| eval stat=case(EVENT=="START","START",EVENT=="END","END")
| eventstats dc(stat) as dc_stat by UID
| search dc_stat=1 AND stat=START
This alert works fine, except that it sometimes fires while the program is still running and simply hasn't written an END event yet. To fix this, I would like to add a delay, but that is not working.
index=indxtst
| table _time source EVENT_TYPE EVENT_SUBTYPE UID EVENT
| eval stat=case(EVENT=="START","START",EVENT=="END","END")
| eventstats dc(stat) as dc_stat by UID
| search dc_stat=1 AND stat=START AND earliest==-15m AND latest==-5m
This pulls back no records at all, even when appropriate testing data is created.
What am I doing wrong?
This requirement was solved with the following syntax:
index = indxtst
| table _time source EVENT_TYPE EVENT_SUBTYPE UID EVENT
| eval diff=now()-_time
| eval type=case(EVENT=="START","START",EVENT=="END","END")
| eventstats dc(type) as dc_type by UID
| search dc_type=1 AND (type=START AND diff>300)
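For anyone wondering why this works: diff is calculated per event, and the final filter only keeps START rows that are more than 300 seconds old, while END events of any age inside the search time range still count toward dc_type. A rough equivalent using stats instead of eventstats is sketched below. It is only a sketch: the field names start_time and end_time are made up for illustration, and it assumes the time range of the search is wide enough to contain the START event.

```
index=indxtst
| eval start_time=if(EVENT=="START",_time,null()), end_time=if(EVENT=="END",_time,null())
| stats min(start_time) AS start_time max(end_time) AS end_time values(source) AS source by UID
| where isnotnull(start_time) AND isnull(end_time) AND now()-start_time>300
| eval start_time=strftime(start_time,"%Y-%m-%d %H:%M:%S")
```

This returns one row per UID rather than one row per START event, which may or may not suit the alert.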
Hi @rdhdr ,
good for you, see you next time!
Ciao and happy splunking
Giuseppe
P.S.: Karma Points are appreciated 😉
Hi, I guess the question I still need an answer to is, how can I apply a time restriction to the START event, but not the END event?
Cheers,
David
Hi @rdhdr ,
sorry, but I don't understand what you mean by "time restrictions".
You have to define a time period for your check, within which both the START and END events can occur.
Obviously you could have programs that started earlier, so that the START event isn't in the time frame, but that is inherent in the Splunk approach: you must define the time period over which your searches run.
If needed, you could use a larger time period.
Ciao.
Giuseppe
Hi @rdhdr ,
is there a maximum expected time between the two events?
If yes, I'd use this:
index=indxtst earliest=-15m latest=-5m
| eval stat=case(EVENT=="START","START",EVENT=="END","END")
| stats
    dc(stat) as dc_stat
    min(eval(if(EVENT=="START",_time,null()))) AS earliest
    max(eval(if(EVENT=="END",_time,null()))) AS latest
    values(source) AS source
    values(EVENT_TYPE) AS EVENT_TYPE
    values(EVENT_SUBTYPE) AS EVENT_SUBTYPE
    values(EVENT) AS EVENT
    by UID
| where (dc_stat=1 AND isnull(latest)) OR latest-earliest>=600
| eval
    earliest=strftime(earliest,"%Y-%m-%d %H:%M:%S"),
    latest=if(isnull(latest),"No END event",strftime(latest,"%Y-%m-%d %H:%M:%S"))
| table UID source EVENT_TYPE EVENT_SUBTYPE EVENT earliest latest
Ciao.
Giuseppe
Thanks for the input, Giuseppe. I had not considered a max time between START and END events. I may need to think about that requirement.
I notice that you put the time restriction (earliest 15 minutes ago, latest 5 minutes ago) at the start of the query.
at the start of the query. It seems to me that this would check whether both START and END events are > 5 minutes old, which would be subject to the same issue I have today, in which the alert fires between START and END events.
What I think I need is to find a START event > 5 minutes old, with a corresponding END event of any age.
Cheers,
David
Hi @rdhdr ,
sorry, when I copied your conditions I forgot to use a larger time range!
Anyway, let me know if I can help you more, or please accept one of the answers for the benefit of other people in the Community.
Ciao and happy splunking
Giuseppe
P.S.: Karma Points are appreciated 😉