Alerting

How to create an alert if an event is NOT detected for 2 minutes?

apomona
Explorer

Hello all, 

I am using Splunk Cloud.

I was looking on the forum yesterday to find out how to create an alert when an event is not detected.

My idea is to send a mail when the Event 4776 is not detected.

The closest I have is this:

index ="*" | where ComputerName="ComputerName" | search EventCode=4776

This gives me every event 4776 on the device ComputerName.

I wanted to add earliest=-2m@m latest=-1m@m, as I saw in several places, but then the result count drops to 0, even though I know this event is sent many times per second (around 100 times).

 

Second question: when I save the search as an alert, I specify:

Real Time,

Trigger condition (custom): search count=0

Is this right?

I saw people suggesting results=0, but with that I get this error: Cannot parse alert condition. Unknown search command 'results'.

Thanks for the help

 

 

 

 


gcusello
SplunkTrust

Hi @apomona,

The question is: on how many servers do you want to check whether the above EventCode is missing?

Or, in other words, do you have a monitoring perimeter?

If yes, put it in a lookup (called e.g. perimeter.csv) containing at least one column (called host), and then run a search like this:

index=wineventlog sourcetype=xmlwineventlog EventCode=4776
| stats count BY host
| append [ 
    | inputlookup perimeter.csv
    | eval count=0
    | fields host count
    ]
| stats sum(count) AS total BY host
| where total=0
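
To see why the append/sum pattern works, here is a small self-contained sketch that needs no index or lookup. The host names dc01 and dc02 are made up for illustration, and the first makeresults stands in for the "stats count BY host" output of a host that did send events:

```
| makeresults
| eval host="dc01", count=5
| fields host count
| append [
    | makeresults
    | eval host=split("dc01,dc02", ",")
    | mvexpand host
    | eval count=0
    | fields host count
    ]
| stats sum(count) AS total BY host
| where total=0
```

dc01 ends up with total=5 and is filtered out. dc02 appears only through the appended rows, so its total stays 0 and it should be the only row left.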

If you don't have a perimeter, and you're sure that each host sent at least one event with this EventCode in the last hour, you could try:

index=wineventlog sourcetype=xmlwineventlog EventCode=4776 earliest=-60m@m latest=@m
| eval period=if(_time>now()-120,"Last","Previous")
| stats 
    dc(period) AS period_count 
    values(period) AS period
    latest(_time) AS _time 
    BY host
| where period_count=1 AND period="Previous"

I prefer the first solution because it gives you more control: with the second one, you check only the hosts that sent events in the last hour.

About the second question: avoid Real Time alerts, because a Real Time search takes a CPU and never releases it!

It's always better to run a scheduled search (e.g. every 5 minutes); choose the frequency that best fits your requirements.

About the condition: with my solution you can trigger the alert when you have results (results>0).
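
The error quoted in the original question ("Unknown search command 'results'") suggests that a custom trigger condition is parsed as a search run against the alert's results, which would explain why results=0 fails while search count=0 parses. A minimal sketch for the single-host case, assuming a scheduled alert and a hypothetical host name dc01 that does not come from this thread:

```
index=wineventlog sourcetype=xmlwineventlog host="dc01" EventCode=4776 earliest=-2m@m latest=@m
| stats count
```

Because stats count without a BY clause returns one row even when no events match, a custom condition of search count=0 has a row to test. With the lookup-based search, rows are returned only for silent hosts, so the built-in "number of results is greater than 0" condition is enough.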

Ciao.

Giuseppe

apomona
Explorer

Hello, 

 

Thanks for your message. 

I do have 4 devices (domain controllers) I want to check.

I created the CSV file DomainController.csv with 1 column called host.

I want to specify that I always receive this event, and I want the alert to trigger when I stop receiving it for 1 minute.

Here is the Splunk query I use. Tell me if this is right:

index="ad_windows" EventCode=4776
| stats count BY host
| append [
| inputlookup DomainController.csv
| eval count=0
| fields host count
]
| stats sum(count) AS total BY host
| where total=0

 

The result is a table with each DC in the first column and 0 in the column called total.

 

-------------------------------------------------------

However, 

I know for a fact that one of my DCs was shut down for 5 minutes this morning, which means I stopped receiving the Event 4776 for 5 minutes. But when I run your query with the time range set to the shutdown period, I still get 0.


gcusello
SplunkTrust

Hi @apomona ,

This search is an alert that triggers when the EventCode is missing: it tells you that the listed hosts didn't send any event in the last minute, but it gives you no information about when you received the last event.

If you want a report about the periods when the EventCode is missing, you should use a different search:

index="ad_windows" EventCode=4776 earliest=-60m@m latest=@m
| eval period=if(_time>now()-60,"Last","Previous")
| stats 
    count(eval(period="Last")) AS count
    latest(_time) AS _time
    BY host
| append [
| inputlookup DomainController.csv
| eval count=0
| fields host count
]
| stats sum(count) AS total BY host
| where total=0
| table host _time

In this way, you check whether all the hosts in the lookup sent events with the above EventCode and, for those that didn't, you also get their last event within the last hour.

Ciao.

Giuseppe

apomona
Explorer

Hello @gcusello , 

 

I think I am getting it. 

So right now, I changed it so I can check the last 2 minutes, and as a result I have a table with host and _time.

In _time, I have null as a result because the event is occurring in the last 2 minutes for every host.

 

When I set it up as an alert, should I say:

Alert Type: Real Time => so the alert is running continuously? I want it to run at least every minute.

Expire: 365 days => the alert will run for the next year.

Trigger condition: Per result => meaning whenever _time <> null, I will get a message?

Or should I use: Custom: _time <> null

 

 

Thanks for helping me in my new journey in Splunk 🙂


gcusello
SplunkTrust

Hi @apomona ,

Answering your questions:

In _time you have the last occurrence of the EventCode. If there isn't any occurrence in the period, you don't have any value; in this case you could show a message instead of an empty value:

index="ad_windows" EventCode=4776 earliest=-60m@m latest=@m
| eval period=if(_time>now()-60,"Last","Previous")
| stats 
    count(eval(period="Last")) AS count
    latest(_time) AS _time
    BY host
| append [
| inputlookup DomainController.csv
| eval count=0
| fields host count
]
| stats sum(count) AS total BY host
| where total=0
| eval _time=if(isnull(_time),"No events in the period",_time)
| table host _time

Avoid Real Time, because these searches are very heavy for the system: each search takes a CPU and releases it when it finishes, but RT searches never finish.

A scheduled search is better, even if it runs every minute.

Expire: in Splunk there isn't an expiration period for an alert. The expiration period that you see in the alert applies to the results (usually 1 day or 1 week); I think one year is too long and consumes too much disk space.

No: if the search doesn't meet the alert condition, you don't get a message. If you want a message, you have to use a different search, but what's the use of a message saying that all is OK? An alert should trigger only on an error condition, not on an OK condition.

Ciao.

Giuseppe


apomona
Explorer

Hi @gcusello,

The Event 4776 occurs often, but I have nothing in the table (see attached PDF).

The alert I want is a mail when the Event 4776 is not occurring on one of the domain controllers.

For example, if I don't have the event for 2 minutes, this is critical.

So in the alert, I want a mail when the last event occurred more than 2 minutes ago, for example. Or a message if I don't have the event for 2 minutes.

Thanks for the detail regarding the Alert parameters in Splunk. 

 


gcusello
SplunkTrust

Hi @apomona ,

With my search you have the list of servers where the EventCode 4776 isn't present in the last two minutes.

The longer time range is required to get (if present) the last occurrence of the event:

index="ad_windows" EventCode=4776 earliest=-60m@m
| eval period=if(_time>now()-120,"Last","Previous")
| stats 
    count(eval(period="Last")) AS count
    latest(_time) AS _time
    BY host
| append [
| inputlookup DomainController.csv
| eval count=0
| fields host count
]
| stats 
    sum(count) AS total 
    latest(_time) AS _time
    BY host
| where total=0
| eval _time=if(isnull(_time),"No events in the period",_time)
| table host _time

Ciao.

Giuseppe


ITWhisperer
SplunkTrust

How quickly are your events getting indexed? For example, if you look at the _indextime field and compare it to the _time field, you will notice a lag. If this is over a minute, it could be that the events between -2m and -1m are not indexed before -1m and would therefore not show up in your alert search.
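
The comparison described above can be sketched as a search. This is only an illustration: the index name is taken from the searches earlier in this thread, and the 15-minute window and the field names lag_seconds, avg_lag and max_lag are arbitrary choices:

```
index="ad_windows" EventCode=4776 earliest=-15m
| eval lag_seconds=_indextime-_time
| stats avg(lag_seconds) AS avg_lag max(lag_seconds) AS max_lag BY host
```

A max_lag close to or above the alert window would mean events can arrive too late to be counted. A negative or very large lag would instead point to a timestamp or time zone parsing problem, which could also make a narrow earliest/latest window come back empty.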


apomona
Explorer

Hello, 

My events are indexed pretty fast (really close to real time, maybe 1 or 2 seconds of delay at most).

 

I mean, when I run the search

index ="*" | where ComputerName="ComputerName" | search EventCode=4776

I get all events, even the ones arriving in real time.

 

But whenever I add earliest=-10m latest=-2m, for example, I don't have any results left.
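
One thing worth ruling out, although the thread does not say where the time modifiers were added: earliest and latest are meant for the search itself, so a cautious version keeps them, together with the field filters, in the base search before the first pipe. A sketch using the placeholder names from the query above:

```
index=* ComputerName="ComputerName" EventCode=4776 earliest=-10m latest=-2m
| stats count
```

If this still returns 0 while events are visibly arriving, the event timestamps themselves are the next thing to check.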
