Splunk Search

How to determine if event is in list of outages?

apps_inpaytech
Explorer

I have a seemingly simple request: list the events and indicate if it occurred during an outage.

I have been trying for ages and I cannot get it to work, can anyone please help?

Base search for events: 

index=api_calls

CSV lookup to record the outage windows, called 'outages.csv' (UK style dates):

DateFrom DateTo Reason
01/09/2022 09:00:00  30/09/2022 23:00:00 Testing 1
01/10/2022 09:00:00 31/10/2022 09:00:00 Testing 2

 

This produces the correct outage row:

| inputlookup outages.csv 
| eval t=now()
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| where DateFromEpoch <= t and DateToEpoch >= t
| table Reason

Output is: Testing 2

I would have expected this to add the Reason field to the base results:

index=api_calls
| append
[ inputlookup outages.csv
| eval t=_time
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| where DateFromEpoch <= t and DateToEpoch >= t
| table Reason ]
| table _time Reason *

But for some reason I cannot get anything to add to the search, not even

index=api_calls
| append [ | makeresults
  | eval Reason="hello" | table Reason ]
| table _time Reason *

Ideally, I would like this to be as a macro so I can re-use it easily:

index=api_calls
| `is_outage(_time)`
| table _time Reason *

I'm doing something wrong, any help appreciated.

Labels (1)
0 Karma
1 Solution

johnhuang
Motivator

This is how you could configure a time based lookup which should be magnitudes faster than using append.

1. Create outages_time_lookup.csv from your original lookup:

| inputlookup outages.csv
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| eval lookup_value=1
| table lookup_value DateFromEpoch DateToEpoch Reason
| outputlookup outages_time_lookup.csv

 

2. Configure a new lookup definition (use default value for anything not defined below):

Name: outages_time_lookup
Lookup file: outages_time_lookup.csv
Name of time field: DateFromEpoch
Minimum offset: 0
Advanced Options -> Maximum matches: 1

3. Sample query using the time based lookup

index=api_calls
| eval lookup_value=1
| lookup outages_time_lookup lookup_value OUTPUT DateToEpoch Reason
| eval is_outage=IF(_time<DateToEpoch, "Y", "N")
| eval outage_reason=IF(_time<DateToEpoch, Reason, "")

 

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

It would help to see a sample expected output, but it looks like you've omitted a key step.

The append command runs a search and puts the results after the result obtained up to that point.  Think of the results as rows in a spreadsheet - append adds rows on the bottom.  So the example queries may produce many rows of events from the api_calls index followed by a single row with "Reason 2" at the bottom.

The trick is to get the Reason value at the bottom distributed to all of the rows above.  Here's one way.

index=api_calls
| append 
  [ inputlookup outages.csv 
  | eval t=_time
  | eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S") 
  | eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
  | where DateFromEpoch <= t and DateToEpoch >= t
  | eval deleteme=1
  | table Reason ] 
```Put the outage at the top```
| reverse
```Spread the reason to the other events```
| filldown Reason
```Restore the original order```
| reverse
```Remove the outage event```
| where NOT deleteme=1
| table _time Reason *

See my BSides 22 talk on append and appendcols (https://www.youtube.com/watch?v=m8YLT8eM1k4) for more information.

---
If this reply helps you, Karma would be appreciated.

apps_inpaytech
Explorer

The expected output would be something like:

DateOutage ReasonClientDuration(ms)Parameters
20/08/2022 Company A100blah
20/09/2022Testing 1Company A60000fred
20/10/2022Testing 2Company B60000foo
20/11/2022 Company A150goldfish

 

With Date, Client, Duration, Parameters coming from index api_calls

With this data I can filter and produce the reports the business needs.

Thanks

0 Karma

johnhuang
Motivator

Time based lookup may work. Few questions:

Can outages overlap, e.g. multiple outages.

Can we put a constraint around the max length of outages, e.g. all outages are resolved in x time?

0 Karma

apps_inpaytech
Explorer

I can ensure outages do not overlap if it's easier.

Max length would be unknown, could be for days or minutes.

0 Karma

johnhuang
Motivator

This is how you could configure a time based lookup which should be magnitudes faster than using append.

1. Create outages_time_lookup.csv from your original lookup:

| inputlookup outages.csv
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| eval lookup_value=1
| table lookup_value DateFromEpoch DateToEpoch Reason
| outputlookup outages_time_lookup.csv

 

2. Configure a new lookup definition (use default value for anything not defined below):

Name: outages_time_lookup
Lookup file: outages_time_lookup.csv
Name of time field: DateFromEpoch
Minimum offset: 0
Advanced Options -> Maximum matches: 1

3. Sample query using the time based lookup

index=api_calls
| eval lookup_value=1
| lookup outages_time_lookup lookup_value OUTPUT DateToEpoch Reason
| eval is_outage=IF(_time<DateToEpoch, "Y", "N")
| eval outage_reason=IF(_time<DateToEpoch, Reason, "")

 

apps_inpaytech
Explorer

Genius!!

Thankyou, that worked a treat 😀

Get Updates on the Splunk Community!

Operationalizing TDIR: Building a More Resilient, Scalable SOC

Optimizing SOC workflows with a unified, risk-based approach to Threat Detection, Investigation, and Response ...

Almost Too Eventful Assurance: Part 1

Modern IT and Network teams still struggle with too many alerts and isolating issues before they are notified. ...

Demo Day: Strengthen Your SOC with Splunk Enterprise Security 8.1

Today’s threat landscape is more complex than ever. Security operation centers (SOCs) are overwhelmed with ...