I'm working with a similar issue to the one in: https://answers.splunk.com/answers/512103/how-to-get-a-list-of-schedules-searches-reports-al.html
The addendum to that is that I want to find the run time of each of those searches. I'm thinking perhaps there are too many searches running at the same time and that this is causing internal Splunk connectivity issues.
It would be really nice to have each job's scheduled time alongside the amount of time it took to run the last time it executed (or the last several times).
Good morning,
I think this SPL does what you're asking to do. It's similar to searches built into the monitoring console but more specifically tailored to your requirements.
index=_internal sourcetype=scheduler
| stats values(app) AS splunk_app, values(scheduled_time) AS scheduled_time, values(dispatch_time) AS dispatch_time, values(result_count) AS result_count, values(search_type) AS search_type, values(status) AS status, values(run_time) AS run_time
by sid
| convert ctime(scheduled_time) AS scheduled_time_pretty ctime(dispatch_time) AS dispatch_time_pretty
| eval schedule_dispatch_delta = dispatch_time-scheduled_time, schedule_dispatch_delta_pretty = tostring(schedule_dispatch_delta,"duration")
| table sid, splunk_app,status,search_type,result_count,run_time,scheduled_time_pretty,dispatch_time_pretty,schedule_dispatch_delta_pretty
| sort - run_time
This search uses the _internal index and the scheduler sourcetype to pull metadata about your scheduled searches. It focuses specifically on the scheduling and run time of the searches and helps identify searches that are struggling.
I recommend using this as a starting point, then investigating further by appending | stats count by FIELDNAME for whatever field you want to investigate. For example, adding | stats count by scheduled_time_pretty will give you a count of searches based on the times they are scheduled to run, which can help you identify whether you have too many searches scheduled at the same time (see the sketch below).
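As a minimal sketch of that idea (the field names come from the search above; the concurrent_searches label is just my own), something like this should show how many scheduled searches fire at each scheduled time:

index=_internal sourcetype=scheduler
| stats values(scheduled_time) AS scheduled_time by sid
| convert ctime(scheduled_time) AS scheduled_time_pretty
| stats count AS concurrent_searches by scheduled_time_pretty
| sort - concurrent_searches

A long tail of times with only one or two searches is fine; the interesting rows are the scheduled times where many searches kick off together.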
In that question they look at the REST API. However, timings can also be found in index=_audit. Depending on what your exact criteria are, you may want to join the two approaches (a rough sketch of such a join is included after the links below). Below I demonstrate that the timings are in _audit:
index=_audit savedsearch_name=* savedsearch_name!="" timestamp=* total_run_time=*
| eval temp=strptime(timestamp,"%m-%d-%Y %H:%M:%S.%3N")
| sort - temp
| convert timeformat="%m-%d-%Y %H:%M:%S" ctime(temp)
| table timestamp total_run_time savedsearch_name
For more information, this was cobbled together from:
https://answers.splunk.com/answers/507790/index-audit-contents.html
https://answers.splunk.com/answers/39402/convert-timeformat.html
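For completeness, here is a rough, untested sketch of joining the REST view of saved searches with the _audit timings. The avg/max stats and the left join are my own choices; title, cron_schedule and is_scheduled are fields returned by the saved/searches endpoint, and the rename lines the audit field up with the REST field. Note that join subsearches are subject to the usual subsearch limits.

| rest /servicesNS/-/-/saved/searches
| search is_scheduled=1
| fields title cron_schedule
| join type=left title
    [ search index=_audit savedsearch_name=* savedsearch_name!="" total_run_time=*
      | stats avg(total_run_time) AS avg_run_time max(total_run_time) AS max_run_time count AS runs by savedsearch_name
      | rename savedsearch_name AS title ]
| sort - avg_run_time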
I'll take this answer, as it's exactly what I was looking for! I do have a follow-up, though: what is the "total_run_time" value if it's "*" in the result set?
Example: I assume the records show run time in seconds, e.g. 5, 6, 15, 400 and *. What is the value of "*" in the output result set?
The * is a wildcard saying we want something in that field, not null; it is not a value that will appear in the results. The details I found on total_run_time were for the history command: "The total time it took to run the search in seconds." Source: http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/History
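If it helps, here is a quick, untested way to see that the * is only a filter (the has_run_time label is mine, and action=search just narrows _audit to search activity):

index=_audit action=search
| eval has_run_time=if(isnotnull(total_run_time),"yes","no")
| stats count by has_run_time

Adding total_run_time=* to the base search simply keeps the has_run_time="yes" events; you will never see a literal * in the output.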
If you are trying to troubleshoot scheduled search concurrency, why not use the monitoring console? Check "Search >> Scheduler Activity: Instance". You can get a lot of information there (including the average run time for the searches).
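That dashboard is built on the same scheduler logs used in the answer above. If you want the numbers in raw SPL, a rough approximation (not the console's exact query; the field names are standard scheduler.log fields) could be:

index=_internal sourcetype=scheduler status=success
| stats count AS executions avg(run_time) AS avg_run_time max(run_time) AS max_run_time by savedsearch_name
| sort - avg_run_time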
I'll double-check, but I don't think it has the info I'm looking for. In short, we're seeing connection issues at what appear to be random times/intervals. Knowing "nothing is random", there's a pattern somewhere, so an average isn't going to give me the info I think I need. Ha, notice I said "think I need"; I'm not sure it'll answer what I'm looking for.
Other Answers posts indicate it's likely due to a query timeout in the configs. We've more than doubled the default timeouts, but I'm still thinking it's bottlenecked somewhere. We can run the same query that timed out a couple of minutes later and it's fine.
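One untested way to check whether those seemingly random times line up with scheduler congestion is to chart skipped scheduled searches over time (reason is a field scheduler.log records for skipped runs; the 5-minute span is arbitrary):

index=_internal sourcetype=scheduler status=skipped
| timechart span=5m count by reason

Spikes that coincide with the connection problems would point at concurrency limits rather than timeouts.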