Reporting

Are scheduled searches intentionally set to be owned by "nobody" when permissions are changed from private to shared?

hoiby
Explorer

I'm on 6.1.2 -- I am troubleshooting some search quota issues and I think this is the root of the problem. I have some scheduled searches that are shared with other roles which run with an owner of "nobody", unless I change the permissions to private, in which case my user name is listed as the owner. I am getting this information from the scheduler log via this search:

index=_internal sourcetype=scheduler | stats count by savedsearch_id

Running the searches as "nobody" results in some skipped searches, as "nobody" observes the default srchJobsQuota = 3 in authorize.conf.
This means that in order to ensure the searches don't get skipped AND share the results of a regularly scheduled search, I need to either increase the default srchJobsQuota (which I don't want to do), or manually assign an owner with an appropriate quota to a search in savedsearches.conf every time I create a scheduled search with these requirements. Is this the intended behavior?

Possibly related symptoms - I have also noticed that sometimes scheduled searches are automatically set with an expiration date as "saved" (even though I had not manually saved them or even viewed the search, I just let the scheduler start it, and observed it in the "Jobs" view) and other times they expire wildly early, not observing the default dispatch.ttl=2p (while other scheduled searches clearly do respect the setting). If this ttl behavior does not seem relevant, please ignore it and I will ask about it in another question.

1 Solution

hoiby
Explorer

I found the source of all the weird behavior and will document it here for reference. First off, I had assumed that the savedsearch_id in the scheduler log identified the user the search was run as (e.g. savedsearch_id="nobody;app;searchname" was run as "nobody"). But when I tried to return the artifact with the loadjob command, I had to use savedsearch="my.user:app:searchname", which led me to believe the search was actually being run by my user and was just logged oddly (there is a separate "user" field in the scheduler logs that supports this). It turns out my question was based on a wrong assumption.
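For anyone checking the same thing, the search from the question can be split on the separate "user" field so the owner prefix in savedsearch_id and the user the search actually ran as sit side by side. This is only a sketch: the "user" field is the one mentioned above, while "status" is an assumption about what your scheduler events contain.

```
index=_internal sourcetype=scheduler
| stats count by savedsearch_id, user, status
```

If the "user" column shows your own account next to a savedsearch_id that starts with "nobody", the search is running under your quota, not the default one.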

Secondly, I was using an asterisk in the "minute" field of my cron schedules (based on a reference I found online, which explained that "*" meant "every"). It turns out this made the search run every minute of the hour, which set the ttl to about 2 minutes (dispatch.ttl=2p means two scheduled periods) and logged a ton of "skipped" searches (because the previous run was still going). The scheduled searches that showed up as "saved" were ones I had forced to run while troubleshooting, after overwriting the asterisk in the Cron settings with an explicit minute. Lots of problem solving only to find out that this one was a PIBKAC.
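To make the cron difference concrete, here is a rough savedsearches.conf sketch. The stanza name and the hourly schedule are hypothetical, not my actual settings; only dispatch.ttl = 2p comes from the setup described above.

```
# savedsearches.conf -- stanza name and schedule are hypothetical
[my hourly report]
enableSched = 1

# "*" in the minute field fires every minute of every hour.
# With dispatch.ttl = 2p the artifact then lives about 2 minutes.
# cron_schedule = * * * * *

# "0" in the minute field fires once per hour, at minute 0.
# With dispatch.ttl = 2p the artifact then lives about 2 hours.
cron_schedule = 0 * * * *

# 2p = keep the artifact for two scheduled periods
dispatch.ttl = 2p
```

The minute field is the first of the five cron fields, so an explicit number there is what keeps the search from firing every minute.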

