Creating alerts

Deepz2612
Explorer

Hello,
Please help me with the below.
My search has to find the keyword "Service.com". If it is found, the search should then look for the keyword "connection reset error" in the next few events. If that is also found and the time difference between the two events is less than or equal to 1 minute, it should fire an alert.

niketn
Legend

@Deepz2612, please try the following search (add your base search with index and sourcetype).

The following query gets the latest event containing "Service.com" and the latest event containing "connection reset error". It then calculates the duration (in minutes) between the two events.

<YourBaseSearch> "Service.com"
|  head 1
|  append [search <YourBaseSearch> "connection reset error"
    |  head 1]
|  stats count as eventCount earliest(_time) as startTime latest(_time) as endTime
|  eval durationInMin=case(eventCount=2,round((endTime-startTime)/60,2),true(),"NoAlert")
|  fieldformat startTime=strftime(startTime,"%Y/%m/%d %H:%M:%S")
|  fieldformat endTime=strftime(endTime,"%Y/%m/%d %H:%M:%S")

You can create an alert with the trigger condition durationInMin <= 1. The duration is rounded to two decimal places rather than to whole minutes, so that a gap of, say, 80 seconds (1.33 minutes) is not rounded down to 1 and reported as a match.
PS: Within the time range over which you schedule the alert (let's say the last 15 minutes), you may get only one event (either the connection reset error or the Service.com event). As per your description that is not an alert situation, so I have set durationInMin to NoAlert in that case. You can change it to any number greater than 1 so that it does not get reported as an alert.


[Update]

Adding description as requested:

  • The first two searches, each ending with | head 1, get the latest result logged in Splunk for "Service.com" and for "connection reset error".
  • stats aggregates the two results and saves the earliest and latest time as epoch time, to be used in the next step for calculating the duration.
  • If only a single event is found (eventCount=1), we need not alert. So the duration (durationInMin) is calculated only if both events are present.
  • Finally, fieldformat is used to display the epoch times as formatted time strings.

While creating the alert, you can set the alert trigger condition based on the durationInMin field.
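To try the stats and eval steps described above without any indexed data, a small self-contained sketch such as the following can be run from the search bar. The two events are generated with makeresults, and their _raw text and the 40-second gap are made-up values for illustration only. The duration is left unrounded here so the raw value is visible.

```spl
| makeresults
| eval _time=now()-40, _raw="request sent to Service.com"
| append
    [| makeresults
    | eval _raw="connection reset error"]
| stats count as eventCount earliest(_time) as startTime latest(_time) as endTime
| eval durationInMin=case(eventCount=2, (endTime-startTime)/60, true(), "NoAlert")
```

With both generated events present, eventCount should be 2 and durationInMin should come out below 1. Removing the append block leaves a single event, so durationInMin falls through to NoAlert.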

Please try it out and confirm. Let us know if you need further clarification.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Deepz2612
Explorer

Hi Nike,
Thanks for the response.
Could you please take a moment to explain the query?

niketn
Legend

@Deepz2612, I have added a description of the query.

AAF
New Member

Hi,

From the Palo Alto Networks App I created a search for "critical allowed", and I am having trouble finding the alert so that I can make some changes. Please advise.
Thanks

Deepz2612
Explorer

@niketnilay Thanks for the explanation.
I set up the alert to run every 5 minutes over a time range of the last 10 minutes, and it worked!
Thanks for your help!

elliotproebstel
Champion

@niketnilay - As I read it, this approach will only compare the most recent service.com event with the most recent connection reset error. So if the time window contains multiple events of either type (some of which would have generated an alert), but the most recent ones do not generate an alert - matches will be missed, right? To ensure you aren't missing earlier alert matches, I think you'd need a streamstats approach.
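A minimal sketch of what such a streamstats approach might look like is shown here. It is not taken from the thread. It assumes the Service.com event must come first, that the match is against the immediately preceding event among the two event types, and that the gap is measured in seconds. The field names eventType, prevTime, prevType, gapInSec and matchCount are hypothetical.

```spl
<YourBaseSearch> ("Service.com" OR "connection reset error")
| eval eventType=if(match(_raw, "(?i)connection reset error"), "reset", "service")
| sort 0 _time
| streamstats current=f window=1 last(_time) as prevTime last(eventType) as prevType
| eval gapInSec=_time-prevTime
| where eventType="reset" AND prevType="service" AND gapInSec<=60
| stats count as matchCount
```

Here sort 0 _time puts the events in chronological order, and streamstats with current=f window=1 copies the time and type of the previous event onto each event. Every "connection reset error" that directly follows a Service.com event within 60 seconds is kept, so an alert condition such as matchCount > 0 would cover all pairs in the time window rather than only the most recent one.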

@Deepz2612 - Do your requirements prescribe an ordering for these two events? I notice you said that if a "service.com" event is found, it "should search for the keyword 'connection reset error' in the next few events". Splunk returns events in reverse-chronological order by default, so looking in the "next few events" in Splunk would mean looking at the few events that occurred prior to the service.com event. Does it matter whether the service.com event happens before or after the connection reset error event? Also, if you are searching every, say, 15 minutes, do you want a total count of instances where a service.com event happened within one minute of a connection reset error event, or just one alert in total to indicate that at least once in that time window the two events occurred within one minute of each other?

niketn
Legend

@elliotproebstel, the search itself runs over a time range. For example, I would run the above search every 10 minutes over the last 30 minutes, and once the alert is triggered I can throttle it for the next 30 minutes.

Another possibility is to change the query so that, if there is a "connection reset error" event without a Service.com event, we handle that in the case() block and compare startTime with the current time, i.e. now(), to set durationInMin.
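One possible reading of that change, as a rough sketch only: each branch tags its event with a hypothetical eventType field, and the case() block gains a branch for a lone "connection reset error" event that measures its age against now(). The field names eventType and eventTypes are assumptions, and the duration is left unrounded.

```spl
<YourBaseSearch> "Service.com"
| head 1
| eval eventType="service"
| append
    [search <YourBaseSearch> "connection reset error"
    | head 1
    | eval eventType="reset"]
| stats count as eventCount earliest(_time) as startTime latest(_time) as endTime values(eventType) as eventTypes
| eval durationInMin=case(
    eventCount=2, (endTime-startTime)/60,
    eventCount=1 AND eventTypes="reset", (now()-startTime)/60,
    true(), "NoAlert")
```

When only the error event is present, startTime is that event's time, so durationInMin becomes the number of minutes since the error was logged. A lone Service.com event still falls through to NoAlert.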

However, I did not want to complicate the query, since based on the question it seemed that the "connection reset error" was supposed to be correlated with the "Service.com" event. In other words, the error will not happen until a Service request is made.
