We have several searches that we run and have a manual backend process to load that data to each endpoint (100+ endpoints). I want to be able to schedule this custom search command to run daily and be able to have an editable list of 100+ endpoints to pass in to the search. Is this possible to do within Splunk?
We would need more details before we can comment on its feasibility within Splunk. What process do you run manually now, and how does it load data to the endpoint(s)? Does the output of the search decide which endpoint to send data to, or is there a static mapping?
I need to pass these endpoints into the search itself, so something I'm considering is setting up a macro for the search that takes arguments. I can put that macro in the searches, but then my only question is how to schedule a search that uses a parameterized macro and pass in a bunch of different argument values so the search runs once per value.
What I'm running now is a script on a different server that runs these searches one by one using the logic in my Python scripts, then pushes the returned data to the endpoints I've specified in the script. My main goal is to migrate this hacky solution to live in Splunk so that we don't have to pass this massive load of data through an extra server.
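For reference, the parameterized macro being considered might look something like this in macros.conf (a sketch only; the stanza name and argument name are illustrative, not confirmed):

```
# macros.conf -- sketch; stanza and argument names are assumed
[endpoint_search(1)]
args = customerGuid
definition = search index=core "user login" \
    [| inputlookup user where customerGuid=$customerGuid$ \
    | return 100000 userGuid] \
    | stats latest(login) by userGuid host _time
```

It would then be called as `endpoint_search(xxxxxxxxx)`. Note that a scheduled saved search fires with a fixed search string, so the macro alone does not solve the problem of supplying 115+ different argument values per run; something else has to drive the iteration.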
Can those variables exist in a lookup table? If yes, then your scheduled search can loop through the lookup table rows, run a search using the map command, and then call a script or modular input as an alert action.
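A skeleton of that loop, assuming a hypothetical lookup file endpoints.csv with customerGuid and endpoint columns and a custom alert action registered as push_to_endpoint (all three names are illustrative):

```
| inputlookup endpoints.csv
| map maxsearches=150 search="search index=core customerGuid=$customerGuid$
    | stats count by userGuid
    | sendalert push_to_endpoint param.url=\"$endpoint$\""
```

map substitutes $fieldname$ tokens from each input row into the quoted search. Its maxsearches setting defaults to 10, so it must be raised above the number of rows in the lookup or the loop stops early.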
That sounds like it could work, but since these are fairly large datasets, I'm worried that map at that scale would crash our instance. Is there any way to load-balance that so it's not trying to run too much at once?
How many different searches do you run using the current script? How much data (how many records) is processed by each search? Would you mind sharing sample queries and highlighting which part is dynamically passed in by the script?
I currently run 15 searches, ranging from thousands of results per endpoint per search to a couple million results per endpoint per search, and I run those for 115+ endpoints. That needs to be able to scale relatively indefinitely.
An example would be a search like
index=core "user login"
[| inputlookup user where customerGuid=xxxxxxxxx
| return 100000 userGuid]
| stats latest(login) by userGuid host _time
And I would want to run it with 115+ different values for xxxxxxxxx and custom_endpoint
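Wired into the map command suggested earlier in the thread, that could look roughly like this, assuming a hypothetical lookup customer_endpoints with customerGuid and custom_endpoint fields (both names are illustrative):

```
| inputlookup customer_endpoints
| map maxsearches=200 search="search index=core \"user login\"
    [| inputlookup user where customerGuid=$customerGuid$
    | return 100000 userGuid]
    | stats latest(login) by userGuid host _time
    | eval endpoint=\"$custom_endpoint$\""
```

On the load concern: map runs its searches one row at a time rather than in parallel, so the work is serialized at the cost of total runtime; the maxsearches value only caps how many rows are processed, and must exceed the lookup's row count.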