I maintain an app with a data input wizard, under the hood of which is a custom controller that can list and create normal "monitor" data inputs very reliably. I now need to expand this to listing and creating "batch" data inputs aka "sinkhole" inputs, and from what I'm seeing maybe this isn't even possible.
The REST API path for normal monitor inputs is /data/inputs/monitor, and in inputs.conf these look like:
[monitor://D:\some\path\*]
index = foo
sourcetype = bar
I now need to be able to also list and create "batch" data inputs. In inputs.conf these end up in the form:
[batch://D:\some\path\*]
move_policy = sinkhole
index = foo
sourcetype = bar
I cannot use a oneshot input. In this particular use case, files are ftp'ed to the given directory every minute or so in real time and I want to delete them as they are indexed. Hence batch is perfect.
1) Listing batch inputs:
There seems to be no separate endpoint for these, but they do get listed (oddly) in the output results for /data/inputs/monitor.
So the only way I've found to list them is to request /data/inputs/monitor
and then look for entries whose "move_policy" key is set to "sinkhole".
Q1: Is there a better way?
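In the meantime, the filtering approach above can be sketched in Python. This is only a sketch: the host, port, credentials, and the availability of output_mode=json on your Splunk version are all assumptions, and the function names are mine.

```python
import json
import ssl
import base64
import urllib.request

def filter_batch_inputs(entries):
    """Keep only monitor-endpoint entries that are really batch/sinkhole inputs."""
    return [e for e in entries
            if e.get("content", {}).get("move_policy") == "sinkhole"]

def list_batch_inputs(base="https://127.0.0.1:8089",
                      user="admin", password="changeme"):
    """GET /services/data/inputs/monitor and filter for sinkhole stanzas."""
    url = base + "/services/data/inputs/monitor?output_mode=json&count=0"
    req = urllib.request.Request(url)
    creds = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + creds)
    # splunkd ships with a self-signed cert, so skip verification (curl -k)
    ctx = ssl._create_unverified_context()
    with urllib.request.urlopen(req, context=ctx) as resp:
        return filter_batch_inputs(json.loads(resp.read())["entry"])
```

It's ugly to have to over-fetch every monitor input and filter client-side, but until a dedicated endpoint exists this at least keeps the filtering logic in one place.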
2) Creating batch inputs:
I have not found any way to do this. There is no /data/inputs/batch
or /data/inputs/sinkhole
endpoint and thus no /data/inputs/batch/_new
to POST to.
Following on from the weirdness in #1 above, if I use the entity code to create a "_new" monitor input and sneak in a "move_policy" key set to "sinkhole", that doesn't work either. It just complains:
Argument "move_policy" is not supported by this handler.
Q2: Am I missing something? Is there a way to actually do what I need?
Eager for any help or any advice. If I have to conclude that such a simple data input administration task is impossible, I'll be sad and I'll have to just resort to doing filesystem operations to write the conf files directly through popen, and then prompt the user to restart the server afterwards.
UPDATED: Yes, it is possible. Use the servicesNS endpoint for conf-inputs and pass "nobody" as the user. If you're used to the old /services/data/inputs/* endpoints, or you've had services-vs-servicesNS confusion in the past, beware: you don't want "/services/configs/conf-inputs", only /servicesNS/nobody/search/configs/conf-inputs.
In brief, and thanks to Support for this example:
> curl -u admin:changeme -k https://127.0.0.1:8089/servicesNS/nobody/search/configs/conf-inputs -d 'name=batch:///home/work/305482/batch_fodder' -d 'move_policy=sinkhole'
> cat ./splunk/etc/apps/search/local/inputs.conf
[batch:///home/work/305482/batch_fodder]
move_policy = sinkhole
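For anyone driving this from a custom controller rather than curl, the same call can be sketched in Python. The host, credentials, app name, and the extra index/sourcetype keys below are placeholder assumptions; only the endpoint and the name/move_policy parameters come from the Support example above.

```python
import ssl
import base64
import urllib.parse
import urllib.request

def build_batch_stanza(path, extra=None):
    """Build the form body for creating a [batch://...] sinkhole stanza."""
    params = {"name": "batch://" + path, "move_policy": "sinkhole"}
    if extra:
        params.update(extra)
    return urllib.parse.urlencode(params)

def create_batch_input(path, app="search", base="https://127.0.0.1:8089",
                       user="admin", password="changeme"):
    """POST to servicesNS/nobody/<app>/configs/conf-inputs."""
    url = "%s/servicesNS/nobody/%s/configs/conf-inputs" % (base, app)
    body = build_batch_stanza(path, {"index": "foo", "sourcetype": "bar"})
    req = urllib.request.Request(url, data=body.encode())
    creds = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + creds)
    ctx = ssl._create_unverified_context()  # self-signed splunkd cert (curl -k)
    return urllib.request.urlopen(req, context=ctx).read()
```

Note the app name in the URL controls which app's local/inputs.conf receives the stanza, so pass your own app rather than "search" in a packaged app.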
Less Interesting Other Details.
How about just having a cron job clear up these files?
That's basically the solution I have now, but since this is an app, it gets set up in the field. When customers skip that step, or misread the underlying problem as being just about disk size, it becomes a problem.
There can very quickly be tens of thousands, hundreds of thousands, even millions of these files in these directories, and obviously Splunk doesn't react well when it's constantly checking huge numbers of files for appended changes. Indexing falls behind, and IO performance on the box becomes awful.
It's a perfect case for a sinkhole input.
I don't see a way to do batch or sinkhole when creating the input via the inputs REST endpoint.
I do however believe that you can use the config and properties endpoints to modify any config file:
http://docs.splunk.com/Documentation/Splunk/6.3.2/RESTREF/RESTconf
The namespace specified in the URL determines where the configuration file is created. For example, /services/properties creates the file in the $SPLUNK_BASE/etc/system/local directory and servicesNS/nobody/search creates the file in the $SPLUNK_BASE/etc/apps/search/local directory.
Update - this does work, with the following drawbacks.
1 - It requires that the user have the admin_all_objects capability, which in larger deployments is severely restricted and which even admin-level users may not have.
2 - When created (at least with Entity.py), it is created as a namespaced object, and splunkd puts the inputs.conf file in $SPLUNK_HOME/etc/users/<user>/<app>/local/inputs.conf
Which isn't particularly useful, because splunkd ignores these stanzas there entirely. The only way Splunk will see them is if they're created at the app level.
I think the second problem can be worked around by crufting in a few subsequent rest calls to modify the eai:acl key. =/
But since the first problem is kind of a deal breaker I haven't gotten around to this yet.
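For reference, the usual shape of such an ACL call is below. This is a sketch only (the function name is mine, and the stanza name must be URL-encoded in the path); as reported later in the thread, the follow-up call may simply 404 when splunkd is ignoring the user-level stanza.

```python
import urllib.parse

def build_acl_call(stanza, app="search"):
    """Build the path and body for re-sharing a conf-inputs stanza at app level.

    The stanza name (e.g. "batch:///some/path") contains characters that
    must be percent-encoded when embedded in the URL path.
    """
    path = "/servicesNS/nobody/%s/configs/conf-inputs/%s/acl" % (
        app, urllib.parse.quote(stanza, safe=""))
    body = urllib.parse.urlencode({"sharing": "app", "owner": "nobody"})
    return path, body
```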
Does anyone have another answer, or are there examples of using the Splunk Python SDK from within a custom python controller in a Splunk app? Maybe I'd have better luck using the (at least newer) Python SDK.
Actually, I've had no success working around the second problem. Various combinations of setting and not setting owner and namespace, or setting owner to "nobody", crossed with setting 'eai:acl' to "app", "global", "system", etc., all result in the data input conf being created at the user level, where Splunk ignores it, rather than at the app or system level where I need it. Fixing the sharing by modifying "eai:acl" in a subsequent REST call with Entity.py isn't possible either: since splunkd ignores inputs.conf stanzas there, it doesn't see the object, and the subsequent GET returns a 404.
The only recourse seems to be doing filesystem operations directly to write the raw config and prompt for a server restart after.
Did you use the /servicesNS/ endpoint or the /services/ endpoint?
It says this with regards to where the file is created:
The namespace specified in the URL determines where the configuration file is created. For example, /services/properties creates the file in the $SPLUNK_BASE/etc/system/local directory and servicesNS/nobody/search creates the file in the $SPLUNK_BASE/etc/apps/search/local directory.
Yep, I thought of that. I tried both the entity class and, later, raw splunk.rest code; it doesn't matter. Even submitting to the correct non-namespaced endpoint (/services/configs/conf-inputs), where it should definitely be putting the stanza into the app directory (or system directory), splunkd always puts the input stanza over in etc/users/admin/search/local/inputs.conf.
As I mentioned it's DOA and invisible because you can't have batch inputs in user conf and Splunk steadfastly ignores it.
I tried the Python SDK but it looks like (and I could be wrong), its implementation to wrap the conf-* endpoint seems to not match the actual behavior of that endpoint.
Can you save me some time replicating the issue and provide me the put/post you send to create a conf file?
The instructions say to use servicesNS to put it into the app dir, yet you say you're using services instead of servicesNS.
At the end of the day, I've never done this before, but I'd love to replicate the problem and find a solution.
Of course, I can also help you write Python to create the file manually too...
something like this might append [stanzaName] to inputs.conf:
conf_path = "/opt/splunk/etc/apps/appName/local/inputs.conf"

# Append the stanza to the app-level inputs.conf
with open(conf_path, "a") as conf:
    conf.write("\n[stanzaName]\n")
(adjust the stanza name and path for your app)