Splunk Search

How to utilize wildcards in time fields in lookup file?

TBH0
Explorer

I have a lookup which has a field with time values (in 24 hr time; i.e. 00:30, 13:45, 23:15), which tells my dashboard the scheduled start time of jobs. I have a number of jobs which are set to run hourly, and as such need to have every hour as their start time (XX:00). I've tried adding wildcard functionality to the desired fields in the lookup definition like this: 

WILDCARD(Field_Name_1),WILDCARD(Field_Name_2)

This has unfortunately not worked as I'd hoped, though, and it does not allow the wildcards to be every number when searching for all jobs which are set to run this hour. 

 

Any ideas on how I can best implement this within the lookup? 

Labels (1)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Things are simpler if match is not involved.  Even so, an illustration of the lookup data would help others.  And I still think HH:MM may not be the best representation, especially because your actual schedule runs mixed intervals.  Let me first demonstrate a quick solution with HH:MM.  As I supposed, we'll call this field start_time, and assume that another field is called job.

 

| inputlookup dailyschedule
| rex field=start_time "(?<job_hour>[^:]+):(?<job_minute>[^:]+)"
| eval hod = mvrange(0,24)
| eval hod = mvmap(hod, if(hod < 10, "0".hod, hod))
| mvexpand hod
| eval job = if(job_hour == hod, job, null())
| stats values(job) as jobs by hod

 

This gives you something like

hodjobs
00 
01 
02 
03 
04 
05
job1
job3
06 
07 
08job2
09 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 

Is this what you expect?

Now, suppose your mixed schedule is represented in a crontab format, like

minutehourdommonthdowjob
0508***job1
1513***job2
4015***job3
00****job4
4500013,6,9,12*job5
0012**0job6
3016***job7
000015**job8

Then, you can display all jobs in a given hour no matter its interval.  For simplicity, say you still want to only display the daily jobs, you can do

 

| inputlookup samplecron
| where dow=="*" AND dom=="*" AND month=="*"
| fields hour job
| eval hod = mvrange(0,24)
| eval hod = mvmap(hod, if(hod < 10, "0".hod, hod))
| mvexpand hod
| eval job = if(hour == "*" OR hour == hod, job, null())
| stats values(job) as jobs by hod

 

The output will be similar, but will also include hourly job4, like

hodjobs
00job4
01job4
02job4
03job4
04job4
05job4
06job4
07job4
08
job1
job4
09job4
10job4
11job4
12job4
13
job2
job4
14job4
15
job3
job4
16
job4
job7
17job4
18job4
19job4
20job4
21job4
22job4
23job4

 

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

I cannot quite understand the use case.  Can you elaborate?  I can see that one field in lookup has the HH:MM format. (Let's call it start_time.)  What is the data like for the jobs?  Is the requirement to match those jobs' actual _time with that start_time field, but only match to the hour?

If this is the requirement, wildcard may not be the best solution.  Do you have flexibility in the lookup's design?  For example, can the start_time field be split into two fields, start_hour, start_minute?  If so, all you need to do is to lookup start_hour, like

| eval job_hour = strftime(_time, "%H")
| lookup mylookup start_hour AS job_hour

That will be the easiest in my opinion.

 

0 Karma

TBH0
Explorer

Sorry for the confusion, in this query I am only looking at the lookup file to read out details about a schedule onto the dashboard. The difficulty I have is that the schedule contains jobs with irregular schedules (hourly, weekly, biweekly, monthly, etc.) and those are a bit more difficult to show in a 'scheduled to run today' panel on a dashboard as they aren't just 'XX:XX' daily. So for the ones which are every hour, I need a way to present those to Splunk in such a way that shows the jobs in every hour regardless of which timeframe I select.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Things are simpler if match is not involved.  Even so, an illustration of the lookup data would help others.  And I still think HH:MM may not be the best representation, especially because your actual schedule runs mixed intervals.  Let me first demonstrate a quick solution with HH:MM.  As I supposed, we'll call this field start_time, and assume that another field is called job.

 

| inputlookup dailyschedule
| rex field=start_time "(?<job_hour>[^:]+):(?<job_minute>[^:]+)"
| eval hod = mvrange(0,24)
| eval hod = mvmap(hod, if(hod < 10, "0".hod, hod))
| mvexpand hod
| eval job = if(job_hour == hod, job, null())
| stats values(job) as jobs by hod

 

This gives you something like

hodjobs
00 
01 
02 
03 
04 
05
job1
job3
06 
07 
08job2
09 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 

Is this what you expect?

Now, suppose your mixed schedule is represented in a crontab format, like

minutehourdommonthdowjob
0508***job1
1513***job2
4015***job3
00****job4
4500013,6,9,12*job5
0012**0job6
3016***job7
000015**job8

Then, you can display all jobs in a given hour no matter its interval.  For simplicity, say you still want to only display the daily jobs, you can do

 

| inputlookup samplecron
| where dow=="*" AND dom=="*" AND month=="*"
| fields hour job
| eval hod = mvrange(0,24)
| eval hod = mvmap(hod, if(hod < 10, "0".hod, hod))
| mvexpand hod
| eval job = if(hour == "*" OR hour == hod, job, null())
| stats values(job) as jobs by hod

 

The output will be similar, but will also include hourly job4, like

hodjobs
00job4
01job4
02job4
03job4
04job4
05job4
06job4
07job4
08
job1
job4
09job4
10job4
11job4
12job4
13
job2
job4
14job4
15
job3
job4
16
job4
job7
17job4
18job4
19job4
20job4
21job4
22job4
23job4

 

0 Karma

TBH0
Explorer

The cron format is something I had not known about, but may be the more efficient way of doing what I'm doing. The Job and Start_Time columns are essentially as you have it, just with supporting details about the job which is also shown in the dashboard (and is also joined to an alert subsearch for certain panels). 

 

The way I am utilizing this currently is actually even simpler than you had predicted: I'm just comparing the current time (or time range selected) to the HH:MM presented in the lookup, then showing all jobs which are <current time, >selected time, so it's been a fairly simplistic approach to it overall.

 

I don't have any experience with making anything in the cron format, but would this support having multiple jobs in the same time slot? How would you handle that as far as a .csv goes, just multiple rows with the same cron values and different job names?

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Indeed if you only compare with the current hour, it's hugely simpler,

| inputlookup dailyschedule
| where strftime(now(), "%H") == mvindex(split(start_time, ":"), 0)

Crontab format is very flexible.  Yes, you can have two jobs at the same time.  For statistical purposes, it is easier if job names are different, but cron jobs do not even require that.

Remember it is just one possible representation.  If you have a different method to represent mixed schedule,  you can still use SPL to "translate" it for display.  If the task is to perform a match in search, on the other hand, certain representations could be easier than others.

0 Karma

TBH0
Explorer

I appreciate that insight, I had not heard of the cron format and I think that's going to make this schedule a lot easier to work with. If I'm understanding correctly, I will try to convert the scheduled and late time columns to a cron format, then translate back to a readable format when displaying in a table. 

 

I'll do more research into this, but I appreciate you!

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...