Please help me with the below.
My search has to find the keyword "Service.com". If it is found, the search should look for the keyword "connection reset error" in the next few events. If that is also found and the time difference between the two events is less than or equal to 1 minute, an alert should fire.
@Deepz2612, please try the following search (add your base search with index and sourcetype).
The following query gets the latest events with "Service.com" and "connection reset error". It then calculates the duration (in minutes) between the two events.
<YourBaseSearch> "Service.com"
| head 1
| append [search <YourBaseSearch> "connection reset error" | head 1]
| stats count as eventCount earliest(_time) as startTime latest(_time) as endTime
| eval durationInMin=case(eventCount=2,round((endTime-startTime)/60,0),true(),"NoAlert")
| fieldformat startTime=strftime(startTime,"%Y/%m/%d %H:%M:%S %p")
| fieldformat endTime=strftime(endTime,"%Y/%m/%d %H:%M:%S %p")
You can create an alert with the trigger condition durationInMin < 2.
PS: For the time range over which you schedule the alert (let's say the last 15 minutes), if you get only one event (be it the connection reset error or the service.com event), it is not an alert situation as per your description. So I have added the NoAlert value for durationInMin. You can change it to any positive number greater than 2 so that it does not get reported as an alert.
Adding description as requested:
| head 1 gets the latest result logged in Splunk for each of "Service.com" and "connection reset error".
stats aggregates the two results and saves the earliest and latest time as epoch time, to be used in the next step for calculating the duration.
durationInMin is calculated only if both events are present.
While creating the alert, you can set the alert trigger condition based on durationInMin.
Please try it out and confirm. Let us know if you need further clarification.
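A note on the arithmetic: with round(...,0) and the condition durationInMin < 2, any gap shorter than 90 seconds rounds to 0 or 1 and still triggers (for example, 89 seconds gives round(89/60,0) = 1). If the one-minute limit has to be strict, one possible variant keeps the difference in seconds. This is only a sketch under that assumption; durationInSec is a field name introduced here for illustration and is not part of the query above.

```
<YourBaseSearch> "Service.com"
| head 1
| append [search <YourBaseSearch> "connection reset error" | head 1]
| stats count as eventCount earliest(_time) as startTime latest(_time) as endTime
| eval durationInSec=if(eventCount=2, endTime-startTime, null())
| where durationInSec<=60
```

Because the where clause drops the row when durationInSec is null or greater than 60, the alert could then trigger on "number of results is greater than 0" instead of a custom condition.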
@niketnilay - As I read it, this approach will only compare the most recent service.com event with the most recent connection reset error. So if the time window contains multiple events of either type (some of which would have generated an alert), but the most recent ones do not generate an alert, matches will be missed, right? To ensure you aren't missing earlier alert matches, I think you'd need a
@Deepz2612 - Do your requirements prescribe an ordering for these two events? I notice you said that if a "service.com" event is found, it "should search for the keyword "connection reset error" in the next few events..." Splunk returns events in reverse-chronological order by default, so looking in the "next few events" in Splunk would mean looking at the few events that occurred prior to the service.com event. Does it matter if the service.com event happens before or after the connection reset error event? Also, if you are searching every, say, 15 minutes, do you want a total count of instances where service.com events happened within one minute of a connection reset error event, or just one alert total to indicate that at least once in that time window the two events occurred within one minute of each other?
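To make the two questions above concrete, here is a rough sketch of one way to check every event in the window rather than only the most recent pair. It assumes the service.com event must come first and the connection reset error must follow within 60 seconds; that ordering is an assumption, not something confirmed in this thread. The field names eventType, serviceTime, gapInSec, matchCount and smallestGapInSec are made up for illustration.

```
<YourBaseSearch> ("Service.com" OR "connection reset error")
| eval eventType=if(like(lower(_raw), "%connection reset error%"), "reset", "service")
| sort 0 _time
| eval serviceTime=if(eventType="service", _time, null())
| filldown serviceTime
| eval gapInSec=_time-serviceTime
| where eventType="reset" AND gapInSec<=60
| stats count as matchCount min(gapInSec) as smallestGapInSec
```

The sort puts events in chronological order, filldown carries the time of the most recent service.com event forward onto each later event, and the where clause keeps only connection reset errors that follow a service.com event by at most 60 seconds. matchCount then answers the "total count" reading, while a trigger condition such as matchCount > 0 gives the "at least once" reading.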
@elliotproebstel, the search itself will run over a time range. For example, I would run the above search every 10 minutes over the last 30 minutes, and once the alert is triggered I can throttle it for the next 30 minutes.
There is one more possibility: changing the query so that if there is a "connection reset error" event without Service.com, we evaluate that in the case() block and compare startTime with the current time, i.e. now(), for setting durationInMin.
However, I did not want to complicate things since, based on the question, it seemed like "connection reset error" was supposed to be correlated to the "Service.com" event. In other words, the error will not happen until a Service request is made.
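For anyone who does want that extra branch, the idea might look roughly like the following. This is a sketch of one reading of the description above, not a tested query: resetCount is a hypothetical helper field that counts how many of the returned events contain "connection reset error", and the sketch assumes the Service.com event does not itself contain that phrase.

```
<YourBaseSearch> "Service.com"
| head 1
| append [search <YourBaseSearch> "connection reset error" | head 1]
| stats count as eventCount count(eval(like(lower(_raw), "%connection reset error%"))) as resetCount earliest(_time) as startTime latest(_time) as endTime
| eval durationInMin=case(eventCount=2, round((endTime-startTime)/60,0), eventCount=1 AND resetCount=1, round((now()-startTime)/60,0), true(), "NoAlert")
```

With only a connection reset error present, startTime is that event's time, so the second branch measures how long ago the error occurred relative to now().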
@niketnilay Thanks for the explanation.
I set up the alert to run every 5 minutes over a time range of the last 10 minutes and it worked!
Thanks for your help!
From the Palo Alto Networks App, I created a search for critical allowed and I am having trouble finding the alert so that I can make some changes. Please advise.