Multiple uses of the same lookup table - how to si...

pm771 · ‎07-13-2020

I have a stream of events that have names and each name belongs to a certain category.

For this example, it will be two category: "24x7" and "custom".

There are 2 lookup tables: NoEventDates (aka Holiday table) and ZeroEvents.

ZeroEvents table has subset of all possible event names with additional parameters:

Category	event	HourFrom	HourTo	HolidaysOff	DaysOfWeekOff
custom	Event11	11	12	Y
custom	Event12			N	0,6
custom	Event13			N
custom	Event14			Y	5.6
24X7	Event21	0	24
24X7	Event22	0	24
24X7	Event23	0	24

"24X7" events are expected within every 15-min all day long without holidays or weekends.
Custom event can have days of year (holidays) and/or days of week (such as weekend) when no events are expected.
Every day a custom event is expected it would come for sure only during the specific time range.

The task is to discover "missing" events situation as quickly as possible.

Custom events will be monitored every 15-min by sliding 2-hour window within their prescribed hours.

For "24X7" I have the following query:

index=...

[| inputlookup ZeroEvents.csv | where Category="24X7" | fields event | format]
| stats count as eventscount by event
| append [| inputlookup ZeroEvents.csv | search DeliveryMethod="24X7" | fields event | eval eventscount=0 ]
| stats sum(eventscount) as total by event
| where total < 1
| stats count as number
| eval NetcoolTitle=number + " 24X7 events with no messages"

I did not need to use Holiday table for that case.

For custom events it gets more complicated and I'm stuck trying to find a way not to repeat all conditions twice.

Here's is the structure with one part of the "append query" hard-coded:

index=...
 
	[| inputlookup ZeroEvents.csv | where DeliveryMethod="Batch"  | fields event| format]
 | eval date=strftime(_time,"%Y-%m-%d")
 | lookup NoEventDates.csv NEDate as date OUTPUT NEDate as Holiday
 | eval Holiday=if(isnull(Holiday), "N", "Y")
 | eval DOW=strftime(_time,"%w") 
 | eval currentHour=strftime(now(), "%H")
 | lookup ZeroEvents.csv event OUTPUT HolidaysOff DaysOfWeek HourFrom HourTo
 | where NOT match(DaysOfWeek, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo 
 | stats count as eventscount by topic
 | append [| inputlookup ZeroEvents.csv | eval DOW="0", Holiday="N", currentHour=1 | where DeliveryMethod="Batch" AND NOT match(DaysOfWeek, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo | eval eventscount=0 | fields topic eventscount]
 | stats sum(eventscount) as total by events
 | where total < 1

As I mentioned, `eval DOW="0", Holiday="N", currentHour=1` should be either recalculated using the same logic or I need somehow to use variables from the outer scope.

Is there a simpler way to write such lookup-based queries?

Is there a solution without a massive code duplication for "custom" events?

pm771 · ‎07-16-2020

[Answering my own question. Any ideas for improvement are welcome]

 index=...
 
	[| inputlookup ZeroEvents.csv | where Category="custom"  | fields event| format]
 | stats count as eventscount by event	
 | append [
            | inputlookup ZeroEvents.csv  
			| eval currentdate=strftime(now(),"%Y-%m-%d") 
			| lookup NoEventDates.csv NEDate as date OUTPUT NEDate as Holiday
			| eval Holiday=if(isnull(Holiday), "N", "Y"), DOW=strftime(now(), "%w"), currentHour=strftime(now(), "%H")
			| where Category="custom" AND NOT match(DaysOfWeek, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo 
            | fields event
            | eval eventscount=0 			
          ]
 | stats sum(eventscount) as total by events
 | where total < 1

The main change is that almost all filtering is done on lookup table itself.

Advantages:

DRY principle is obeyed (almost)
Main lookup logic is isolated and can be debugged on its own

In order to test it's necessary to replace 4 variables with desired hard-coded values.

Due to late filtering the query will perform extra work, but its performance is still acceptable.

Multiple uses of the same lookup table - how to simplify?

count

lookup

stats

Index This | Why did the turkey cross the road?

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Feel the Splunk Love: Real Stories from Real Customers

Are you a member of the Splunk Community?