Alerting

How to exclude previous events in alert throttling?

zapping575
Path Finder

I have a scheduled saved search that may return a result such as this:

_time, host, _raw

  • 2023-01-01, host A, <some message>
  • 2023-01-02, host A, <some message>
  • 2023-01-03, host A, <some message>

In this example, the content of <some message> causes an alert to fire, which is what I expect.

Now, assume that new events occur and the next scheduled search returns this (the last two events are new):

  • 2023-01-01, host A, <some message>
  • 2023-01-02, host A, <some message>
  • 2023-01-03, host A, <some message>
  • 2023-01-04, host A, <some message>
  • 2023-01-05, host A, <some message>

Problem: The next scheduled search will return the entire list (5 events) and thus trigger an alert containing these 5 events. However, 3 of these events were contained in a previous alert and are thus superfluous.

Desired outcome: The new alert should be triggered only by the two new events (2023-01-04 and 2023-01-05).

What I have tried: setting the trigger type to "for each event" and suppressing on the fields _time and host, on the assumption that the combination of _time and host uniquely identifies the event to suppress.

I also tried to learn about dynamic input lookups, but the documentation seems to be lost / unavailable (http://wiki.splunk.com/Dynamically_Editing_Lookup_Tables)

gcusello
SplunkTrust

Hi @zapping575,

probably the time frame you're using in your alert is greater than the scheduling period, so consecutive runs overlap and return the same events.

The only solution, to my knowledge, is to match the time frame to the scheduling period:

if your alert is scheduled every 5 minutes, you have to use five minutes as the time frame,

it's also better to snap to the start of the minute to avoid overlap: e.g. for the last 5 minutes, use earliest=-5m@m latest=@m as the time frame.

Ciao.

Giuseppe

zapping575
Path Finder

Ciao @gcusello

Your assumption about the time frame is indeed correct.

I have to search within a timeframe of the last 30 days. This is a requirement because I may only receive the data in an asynchronous manner (the most recent event in a new file might already be a day old, or even older, when I receive it)

This leads to potential problems if I make the scheduling period equal to the time frame, as you suggest.

  • If I select a short scheduling period, there is a very real chance that I will miss out on some events that are no longer included in the time frame
  • If I select a long scheduling period, there is a very real chance that I will only notice an event long after it occurred.
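One way to reconcile these two concerns, offered as a sketch rather than a tested answer: keep the 30-day event-time window, but restrict each run to events that were indexed since the previous run, using Splunk's _index_earliest and _index_latest time modifiers. The index name your_index and the 5-minute schedule are placeholders, not details from this thread.

```
index=your_index earliest=-30d@d latest=now _index_earliest=-5m@m _index_latest=@m
| table _time, host, _raw
```

With a schedule of every 5 minutes, each event should then be picked up by one run, however old its timestamp is when it arrives. A skipped or delayed scheduled run would leave a gap, so this assumes the schedule runs reliably.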

 

Regards

gcusello
SplunkTrust

Hi @zapping575,

you could try storing the results of previous alerts in a lookup or (better) in a summary index (using collect) and then filtering your results against them.

Ciao.

Giuseppe

zapping575
Path Finder

Ciao @gcusello 

Could you please direct me towards a wiki or manual article on how to achieve this?

Regards

gcusello
SplunkTrust

Hi @zapping575,

adapt my approach to your search:

if there's an event_id to identify events

<your_search> 
| stats count BY event_id
| where count>10
| search NOT [ search index=summary_alerts | fields event_id ]
| collect index=summary_alerts

in this way, you run your search and check the condition.

Then, after the check, you filter the results, discarding the event_ids that are already indexed, and you add only the new ones to the summary index.

Ciao.

Giuseppe

zapping575
Path Finder

Ciao @gcusello 
Before moving on to summary indices, I would like to implement this using lookups instead.

The logic is the same as with summary indices:

  • Run the search, check if any results are already included in the lookup file
  • If there are any, remove the duplicates
  • Count the remaining events
  • Fire Alert (if desired)
  • Write back / append the new events to lookup
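The steps above can be sketched in a single search. This is a minimal sketch under assumptions, not a tested solution: event_key is a hypothetical field introduced here to collapse the composite key (host, source_file, _time) into one string, and known-events.csv is assumed to already exist with an event_key column.

```
index=myIndex eventtype=myEventtype earliest=-30d@d
| eval event_key = host . "|" . source_file . "|" . _time
| lookup known-events.csv event_key OUTPUT event_key AS already_alerted
| where isnull(already_alerted)
| table _time, host, source_file, event_key
| outputlookup append=true known-events.csv
```

Because outputlookup passes its results through, the alert can trigger on "number of results > 0" and should contain only events that were not yet recorded. Using the lookup command instead of a NOT [ | inputlookup ... ] subsearch also avoids subsearch result limits as the file grows. Nothing here prunes the lookup; removing keys older than the 30-day window would need a separate scheduled search.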

I cannot however get the inputlookup in a subsearch to work. Here is a very basic example:

Note: source_file is a field that I extract in props.conf

index = myIndex eventtype = myEventtype
| fields _time, host, source_file
| search NOT 
    [ | inputlookup known-events.csv 
    | fields _time, host, source_file ]

I have added a single line to known-events.csv (for testing purposes), so I would expect that the number of results for "myIndex" and "myEventtype" would decrease by one, but I am getting either the same results as before or none at all. I have confirmed that the single line in known-events.csv is actually there.

gcusello
SplunkTrust

Hi @zapping575,

don't use the filter in a following statement, put it in the main search so your search will be faster:

index=myIndex eventtype=myEventtype NOT [ | inputlookup known-events.csv | fields _time, host, source_file ]

the most important thing is that the field names in the subsearch are the same as in the main search.

Ciao.

Giuseppe

zapping575
Path Finder

Ciao @gcusello 

Thank you for your continued help.

I must be doing something fundamentally wrong. If I run the search as you describe it, it returns zero results.

Current setup

  • Index = "myIndex" eventtype = "myEventtype" returns 1196 events
  • Lookup file "known-events" contains a single event, identified by the "composite primary key": _time, host, source_file
  • I would thus expect that the query you provided returns 1196 - 1 results.

To address your point and make sure that the field names are exactly the same, I tried this:

index = myIndex eventtype = myEventtype
| fields _time, host, source_file
    NOT 
    [| inputlookup known-events.csv 
    | fields _time, host, source_file ]

This gives the following error:

Error in 'fields' command: Invalid argument: 'source_file=some_file_name-[some-host_name].txt' 

I am not quite sure what to make of this.
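The error is consistent with how subsearches work: the bracketed search is replaced by a search expression built from its results, so placing it directly after | fields hands that expression to the fields command as arguments. A sketch of two things that may help, under the assumption that the lookup columns are named exactly _time, host and source_file. First, appending format to the subsearch on its own shows the expression it expands to:

```
| inputlookup known-events.csv
| fields _time, host, source_file
| format
```

Second, the filter needs to sit behind a search command (or in the base search) rather than behind fields. The variant below deliberately matches on host and source_file only, to check whether _time is what prevents the match; it is a diagnostic step, not the final filter:

```
index=myIndex eventtype=myEventtype
| search NOT [ | inputlookup known-events.csv | fields host, source_file ]
| fields _time, host, source_file
```

Whether _time survives the expansion in a usable form is something to read off the format output rather than assume.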

zapping575
Path Finder

Ciao @gcusello 

thank you very much for your suggestion.

I will post again as soon as I have managed to make it work.
