Splunk Search

How to determine if event is in list of outages?

apps_inpaytech
Explorer

I have a seemingly simple request: list the events and indicate if it occurred during an outage.

I have been trying for ages and I cannot get it to work, can anyone please help?

Base search for events: 

index=api_calls

CSV lookup to record the outage windows, called 'outages.csv' (UK style dates):

DateFrom DateTo Reason
01/09/2022 09:00:00  30/09/2022 23:00:00 Testing 1
01/10/2022 09:00:00 31/10/2022 09:00:00 Testing 2

 

This produces the correct outage row:

| inputlookup outages.csv 
| eval t=now()
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| where DateFromEpoch <= t and DateToEpoch >= t
| table Reason

Output is: Testing 2

I would have expected this to add the Reason field to the base results:

index=api_calls
| append
[ inputlookup outages.csv
| eval t=_time
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| where DateFromEpoch <= t and DateToEpoch >= t
| table Reason ]
| table _time Reason *

But for some reason I cannot get anything to add to the search, not even

index=api_calls
| append [ | makeresults
  | eval Reason="hello" | table Reason ]
| table _time Reason *

Ideally, I would like this to be as a macro so I can re-use it easily:

index=api_calls
| `is_outage(_time)`
| table _time Reason *

I'm doing something wrong, any help appreciated.

Labels (1)
0 Karma
1 Solution

johnhuang
Motivator

This is how you could configure a time based lookup which should be magnitudes faster than using append.

1. Create outages_time_lookup.csv from your original lookup:

| inputlookup outages.csv
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| eval lookup_value=1
| table lookup_value DateFromEpoch DateToEpoch Reason
| outputlookup outages_time_lookup.csv

 

2. Configure a new lookup definition (use default value for anything not defined below):

Name: outages_time_lookup
Lookup file: outages_time_lookup.csv
Name of time field: DateFromEpoch
Minimum offset: 0
Advanced Options -> Maximum matches: 1

3. Sample query using the time based lookup

index=api_calls
| eval lookup_value=1
| lookup outages_time_lookup lookup_value OUTPUT DateToEpoch Reason
| eval is_outage=IF(_time<DateToEpoch, "Y", "N")
| eval outage_reason=IF(_time<DateToEpoch, Reason, "")

 

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

It would help to see a sample expected output, but it looks like you've omitted a key step.

The append command runs a search and puts the results after the result obtained up to that point.  Think of the results as rows in a spreadsheet - append adds rows on the bottom.  So the example queries may produce many rows of events from the api_calls index followed by a single row with "Reason 2" at the bottom.

The trick is to get the Reason value at the bottom distributed to all of the rows above.  Here's one way.

index=api_calls
| append 
  [ inputlookup outages.csv 
  | eval t=_time
  | eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S") 
  | eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
  | where DateFromEpoch <= t and DateToEpoch >= t
  | eval deleteme=1
  | table Reason ] 
```Put the outage at the top```
| reverse
```Spread the reason to the other events```
| filldown Reason
```Restore the original order```
| reverse
```Remove the outage event```
| where NOT deleteme=1
| table _time Reason *

See my BSides 22 talk on append and appendcols (https://www.youtube.com/watch?v=m8YLT8eM1k4) for more information.

---
If this reply helps you, Karma would be appreciated.

apps_inpaytech
Explorer

The expected output would be something like:

DateOutage ReasonClientDuration(ms)Parameters
20/08/2022 Company A100blah
20/09/2022Testing 1Company A60000fred
20/10/2022Testing 2Company B60000foo
20/11/2022 Company A150goldfish

 

With Date, Client, Duration, Parameters coming from index api_calls

With this data I can filter and produce the reports the business needs.

Thanks

0 Karma

johnhuang
Motivator

Time based lookup may work. Few questions:

Can outages overlap, e.g. multiple outages.

Can we put a constraint around the max length of outages, e.g. all outages are resolved in x time?

0 Karma

apps_inpaytech
Explorer

I can ensure outages do not overlap if it's easier.

Max length would be unknown, could be for days or minutes.

0 Karma

johnhuang
Motivator

This is how you could configure a time based lookup which should be magnitudes faster than using append.

1. Create outages_time_lookup.csv from your original lookup:

| inputlookup outages.csv
| eval DateFromEpoch=strptime(DateFrom, "%d/%m/%Y %H:%M:%S")
| eval DateToEpoch=strptime(DateTo, "%d/%m/%Y %H:%M:%S")
| eval lookup_value=1
| table lookup_value DateFromEpoch DateToEpoch Reason
| outputlookup outages_time_lookup.csv

 

2. Configure a new lookup definition (use default value for anything not defined below):

Name: outages_time_lookup
Lookup file: outages_time_lookup.csv
Name of time field: DateFromEpoch
Minimum offset: 0
Advanced Options -> Maximum matches: 1

3. Sample query using the time based lookup

index=api_calls
| eval lookup_value=1
| lookup outages_time_lookup lookup_value OUTPUT DateToEpoch Reason
| eval is_outage=IF(_time<DateToEpoch, "Y", "N")
| eval outage_reason=IF(_time<DateToEpoch, Reason, "")

 

apps_inpaytech
Explorer

Genius!!

Thankyou, that worked a treat 😀

Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...