I maintain an app with a data input wizard, under the hood of which is a custom controller that can list and create normal "monitor" data inputs very reliably. I now need to expand this to listing and creating "batch" data inputs aka "sinkhole" inputs, and from what I'm seeing maybe this isn't even possible.
The REST API path for normal monitor inputs is /data/inputs/monitor, and in inputs.conf these look like:
[monitor://D:\some\path\*]
index = foo
sourcetype = bar
I now need to be able to also list and create "batch" data inputs. In inputs.conf these end up in the form:
[batch://D:\some\path\*]
move_policy = sinkhole
index = foo
sourcetype = bar
I cannot use a oneshot input. In this particular use case, files are ftp'ed to the given directory every minute or so in real time and I want to delete them as they are indexed. Hence batch is perfect.
1) Listing batch inputs:
There seems to be no separate endpoint for these, but they do get listed (oddly) in the output results for /data/inputs/monitor.
So the only way I've found to list them is to request /data/inputs/monitor
and then look for entries whose "move_policy" key is set to "sinkhole".
Q1: Is there a better way?
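In the meantime, the filtering approach above can be sketched in Python. This is only a sketch: the host, port, credentials, and the availability of output_mode=json on your Splunk version are all assumptions, and the function names are mine.

```python
import json
import ssl
import base64
import urllib.request

def filter_batch_inputs(entries):
    """Keep only monitor-endpoint entries that are really batch/sinkhole inputs."""
    return [e for e in entries
            if e.get("content", {}).get("move_policy") == "sinkhole"]

def list_batch_inputs(base="https://127.0.0.1:8089",
                      user="admin", password="changeme"):
    """GET /services/data/inputs/monitor and filter for sinkhole stanzas."""
    url = base + "/services/data/inputs/monitor?output_mode=json&count=0"
    req = urllib.request.Request(url)
    creds = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + creds)
    # splunkd ships with a self-signed cert, so skip verification (curl -k)
    ctx = ssl._create_unverified_context()
    with urllib.request.urlopen(req, context=ctx) as resp:
        return filter_batch_inputs(json.loads(resp.read())["entry"])
```

It's ugly to have to over-fetch every monitor input and filter client-side, but until a dedicated endpoint exists this at least keeps the filtering logic in one place.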
2) Creating batch inputs:
I have not found any way to do this. There is no /data/inputs/batch
or /data/inputs/sinkhole
endpoint and thus no /data/inputs/batch/_new
to POST to.
Following on from the weirdness in #1 above, if I use the entity code to create a "_new" monitor input and sneak in a "move_policy" key set to "sinkhole", that doesn't work either. It just complains:
Argument "move_policy" is not supported by this handler.
Q2: Am I missing something? Is there a way to actually do what I need?
Eager for any help or any advice. If I have to conclude that such a simple data input administration task is impossible, I'll be sad and I'll have to just resort to doing filesystem operations to write the conf files directly through popen, and then prompt the user to restart the server afterwards.
UPDATED: Yes, it is possible. Use the servicesNS endpoint for conf-inputs and pass "nobody" as the user. If you're used to the old /services/data/inputs/* endpoints, or you've had services-vs-servicesNS confusion in the past, beware: you don't want "/services/configs/conf-inputs", only /servicesNS/nobody/search/configs/conf-inputs.
In brief, and thanks to Support for this example:
> curl -u admin:changeme -k https://127.0.0.1:8089/servicesNS/nobody/search/configs/conf-inputs -d 'name=batch:///home/work/305482/batch_fodder' -d 'move_policy=sinkhole'
> cat ./splunk/etc/apps/search/local/inputs.conf
[batch:///home/work/305482/batch_fodder]
move_policy = sinkhole
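For anyone driving this from a custom controller rather than curl, the same call can be sketched in Python. The host, credentials, app name, and the extra index/sourcetype keys below are placeholder assumptions; only the endpoint and the name/move_policy parameters come from the Support example above.

```python
import ssl
import base64
import urllib.parse
import urllib.request

def build_batch_stanza(path, extra=None):
    """Build the form body for creating a [batch://...] sinkhole stanza."""
    params = {"name": "batch://" + path, "move_policy": "sinkhole"}
    if extra:
        params.update(extra)
    return urllib.parse.urlencode(params)

def create_batch_input(path, app="search", base="https://127.0.0.1:8089",
                       user="admin", password="changeme"):
    """POST to servicesNS/nobody/<app>/configs/conf-inputs."""
    url = "%s/servicesNS/nobody/%s/configs/conf-inputs" % (base, app)
    body = build_batch_stanza(path, {"index": "foo", "sourcetype": "bar"})
    req = urllib.request.Request(url, data=body.encode())
    creds = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + creds)
    ctx = ssl._create_unverified_context()  # self-signed splunkd cert (curl -k)
    return urllib.request.urlopen(req, context=ctx).read()
```

Note the app name in the URL controls which app's local/inputs.conf receives the stanza, so pass your own app rather than "search" in a packaged app.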
Less Interesting Other Details.
How about just having a cron job clear up these files?
That's basically the solution I have now, but since this is an app, it gets set up in the field. When customers skip that step, or misread the underlying problem as being just about disk size, it becomes a problem.
There can very quickly be tens of thousands, hundreds of thousands, even millions of these files in these directories, and obviously Splunk doesn't react well when it's constantly checking huge numbers of files for appended changes. Indexing falls behind, and IO performance on the box becomes awful.
It's a perfect case for a sinkhole input.
I don't see a way to do batch or sinkhole when creating the input via the inputs REST endpoint.
I do however believe that you can use the config and properties endpoints to modify any config file:
http://docs.splunk.com/Documentation/Splunk/6.3.2/RESTREF/RESTconf
The namespace specified in the URL determines where the configuration file is created. For example, /services/properties creates the file in the $SPLUNK_BASE/etc/system/local directory and servicesNS/nobody/search creates the file in the $SPLUNK_BASE/etc/apps/search/local directory.
Update - this does work, with the following drawbacks.
1 - It requires that the user have the admin_all_objects capability, which in larger deployments is severely restricted and which even admin-level users may not have.
2 - When created (at least with Entity.py), it is created as a namespaced object, and splunkd puts the inputs.conf file in $SPLUNK_HOME/etc/users/<user>/<app>/local/inputs.conf
Which isn't particularly useful, because splunkd ignores these stanzas there entirely. The only way Splunk will see them is if they're created at the app level.
I think the second problem can be worked around by crufting in a few subsequent rest calls to modify the eai:acl key. =/
But since the first problem is kind of a deal breaker I haven't gotten around to this yet.
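For reference, the usual shape of such an ACL call is below. This is a sketch only (the function name is mine, and the stanza name must be URL-encoded in the path); as reported later in the thread, the follow-up call may simply 404 when splunkd is ignoring the user-level stanza.

```python
import urllib.parse

def build_acl_call(stanza, app="search"):
    """Build the path and body for re-sharing a conf-inputs stanza at app level.

    The stanza name (e.g. "batch:///some/path") contains characters that
    must be percent-encoded when embedded in the URL path.
    """
    path = "/servicesNS/nobody/%s/configs/conf-inputs/%s/acl" % (
        app, urllib.parse.quote(stanza, safe=""))
    body = urllib.parse.urlencode({"sharing": "app", "owner": "nobody"})
    return path, body
```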
Does anyone have another answer, or are there examples of using the Splunk Python SDK from within a custom python controller in a Splunk app? Maybe I'd have better luck using the (at least newer) Python SDK.
Actually, I've had no success working around the second problem. Various combinations of setting and not setting owner and namespace, or setting owner to "nobody", crossed with setting 'eai:acl' to "app", "global", "system", etc., all result in the data input conf being created at the user level, where Splunk ignores it, rather than at the app or system level where I need it. Fixing the sharing by modifying "eai:acl" in a subsequent REST call with Entity.py isn't possible either: since splunkd ignores inputs.conf stanzas there, it doesn't see the object, and the subsequent GET returns a 404.
The only recourse seems to be doing filesystem operations directly to write the raw config and prompt for a server restart after.
Did you use the /servicesNS/ endpoint or the /services/ endpoint?
It says this with regards to where the file is created:
The namespace specified in the URL determines where the configuration file is created. For example, /services/properties creates the file in the $SPLUNK_BASE/etc/system/local directory and servicesNS/nobody/search creates the file in the $SPLUNK_BASE/etc/apps/search/local directory.
Yep, I thought of that. I tried both the entity class and, later, raw splunk.rest code; it doesn't matter. Even submitting to the correct non-namespaced endpoint (/services/configs/conf-inputs), where it should definitely be putting the stanza into the app directory (or system directory), splunkd always puts the input stanza over in etc/users/admin/search/local/inputs.conf.
As I mentioned it's DOA and invisible because you can't have batch inputs in user conf and Splunk steadfastly ignores it.
I tried the Python SDK but it looks like (and I could be wrong), its implementation to wrap the conf-* endpoint seems to not match the actual behavior of that endpoint.
Can you save me some time replicating the issue and provide me the put/post you send to create a conf file?
The instructions say to use servicesNS to put it into the app dir, yet you say you're using services instead of servicesNS.
At the end of the day, I've never done this before, but I'd love to replicate the problem and find a solution.
Of course, I can also help you write Python to create the file manually too...
something like this might append [stanzaName] to inputs.conf:
conf_path = "/opt/splunk/etc/apps/appName/local/inputs.conf"

# Append the stanza to the app-level inputs.conf
with open(conf_path, "a") as conf:
    conf.write("\n[stanzaName]\n")
(adjust the stanza name and path for your app)