Splunk Search

Report to show missed scheduled searches due to downtime

Matthias_BY
Communicator

Hello,

I have several scheduled searches. Some run every 5 minutes, some every 15 minutes, some hourly, etc.

Some of those searches populate a summary index; a few others export a CSV regularly to feed another tool.

If there is an outage of the search head (even with two search heads or SH pooling), some jobs might be skipped or missed, as they won't be rerun. This results in an incomplete dataset.

As a first step, I need a report which shows me, based on the actual schedule, which searches were skipped.

index=_internal source=*scheduler.log | eval sched = strftime(scheduled_time, "%Y-%m-%d %H:%M:%S") | search savedsearch_name="Project Honey Pot - Threatscore AVG Last 4 Hours" NOT continued | table sched status savedsearch_name
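For completeness, scheduler.log can also report skips it knows about (e.g. when concurrency limits are hit), though it logs nothing at all while splunkd itself is down. A minimal sketch, assuming your version writes status=skipped events and a reason field (both are assumptions to verify against your own scheduler.log):

index=_internal source=*scheduler.log status=skipped
| stats count as skipped_runs, values(reason) as reasons by savedsearch_name
| sort - skipped_runs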


A report like:

User Activity Search
- Last RUN | scheduled every 5 minutes | STATUS=Completed
- Last RUN - 5 minutes | STATUS=Completed
- Last RUN - 10 minutes | STATUS=Completed
- Last RUN - 15 minutes | STATUS=Not Executed
- Last RUN - 20 minutes | STATUS=Completed

and this dynamically for each scheduled search, each with its own schedule interval.

The above example would reveal a potential restart of the SH 15 minutes ago. I could then manually investigate and re-run the export for that specific timeframe to fill the data back in... or the report could review the last 10 successful runs, subtract the times, and automatically detect that the search runs every 5 minutes (a sketch of that follows).
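A minimal sketch of that auto-detection idea, assuming status=success marks a completed run in your version of scheduler.log (older builds may use a different status value): compute the gap between consecutive runs per search, derive the typical interval, and flag any gap noticeably larger than it.

index=_internal source=*scheduler.log status=success
| sort 0 savedsearch_name _time
| streamstats current=f last(_time) as prev_time by savedsearch_name
| eval gap_min = round((_time - prev_time)/60, 2)
| eventstats median(gap_min) as typical_interval by savedsearch_name
| where gap_min > 1.5 * typical_interval
| table _time savedsearch_name gap_min typical_interval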

Thanks a lot
Matthias

1 Solution

Matthias_BY
Communicator

Hi,

I found a near-optimal solution with kmeans, which has the benefit of being independent of the schedule: first list all successful runs, then calculate the delta between consecutive runs, and then let kmeans do the statistical work of finding the outliers. Whether the system was down or a single scheduled task was missed, it shows up:

`set_internal_index` host="mmaier-mbp15.local" source=*scheduler.log savedsearch_name="BlueCoat - Stats - Collect" status!=continued | stats max(run_time) as Max, count by _time, savedsearch_name | sort -_time | delta _time as delta | where Max>0 | eval delta = round(delta/60*-1, 2) | kmeans delta | sort -_time | replace 1 with "OK", 2 with "ERROR" in CLUSTERNUM
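One caveat: kmeans runs here with its default of k=2 clusters and simply numbers them, so mapping 1 to OK and 2 to ERROR assumes the normal intervals all land in cluster 1. That held on this dataset, but it is worth checking CLUSTERNUM against the raw deltas before trusting the labels on other data.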


Even as a line chart visualization it stands out clearly.



_d_
Splunk Employee

Take a look at the SoS (Splunk on Splunk) app under Search | Scheduler Activity.


_d_
Splunk Employee

Correct, and that's because the scheduler does not keep state across restarts and naturally logs nothing while it is down. I suppose you could modify your search to check whether a shutdown event occurred in splunkd.log during the timerange in question, and add a field to indicate so. Then use that field to decide whether or not to re-run the reports.
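A rough sketch of that combination, reusing the savedsearch_name from the question and assuming the startup banner in splunkd.log contains the string "Splunkd starting" on your version (the exact message text varies, so adjust the match): append the restart events to the scheduled runs and sort them into one timeline.

index=_internal source=*scheduler.log savedsearch_name="Project Honey Pot - Threatscore AVG Last 4 Hours" status!=continued
| eval marker = "scheduled run"
| append [ search index=_internal source=*splunkd.log "Splunkd starting" | eval marker = "restart" ]
| sort 0 _time
| table _time marker savedsearch_name status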


Matthias_BY
Communicator

Thanks for the hint, but this only shows successful runs vs. errors. If I have a scheduled activity every 5 minutes, shut down my Splunk instance for an hour, and start it again, it won't display that several scheduled runs were missed...
