I'm trying to determine the architecture options for automatically ingesting data into Splunk, i.e. I place data in a folder -> run a script -> supply an index name -> supply a sourcetype -> the script somehow gets the data into a new index.
Now, you may be asking why I don't just create a monitoring directory and have a script dump data there. That's because each set of data will have a different sourcetype and needs a new index. Basically, think of taking in random, one-off sets of data that each need a different index for analysis, NOT a log source from one machine constantly being pumped in.
So, with that said I see my options as:
- Script the creation of a monitoring directory for each new index, along with the REST API calls to create the inputs.conf stanzas, while scp'ing the data into the monitoring directory.
- Use the API to directly send data from anywhere
- TCP input???
- ???
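Option 1 above could be scripted roughly as follows. This is a hedged sketch, not a tested implementation: the management URL, credentials, directory layout, and index/sourcetype names are all placeholder assumptions, and the actual POSTs (to the documented `data/indexes` and `data/inputs/monitor` REST endpoints) are left commented out since they require a live Splunk instance.

```python
# Sketch of option 1: for each new one-off data set, create the index and
# register a monitor input via the Splunk REST API. All names are assumptions.
from urllib.parse import urlencode

SPLUNK_MGMT = "https://localhost:8089"  # management port (placeholder)

def create_index_payload(index: str) -> str:
    """Form body for POST /services/data/indexes."""
    return urlencode({"name": index})

def monitor_input_payload(directory: str, index: str, sourcetype: str) -> str:
    """Form body for POST /services/data/inputs/monitor."""
    return urlencode({
        "name": directory,        # path Splunk should monitor
        "index": index,           # destination index (created above)
        "sourcetype": sourcetype,
    })

if __name__ == "__main__":
    print(create_index_payload("case42"))
    print(monitor_input_payload("/data/incoming/case42", "case42", "case42_csv"))
    # To actually apply (needs a live instance and real credentials):
    # requests.post(f"{SPLUNK_MGMT}/services/data/indexes",
    #               auth=("admin", "changeme"),
    #               data=create_index_payload("case42"), verify=False)
    # requests.post(f"{SPLUNK_MGMT}/services/data/inputs/monitor",
    #               auth=("admin", "changeme"),
    #               data=monitor_input_payload("/data/incoming/case42",
    #                                          "case42", "case42_csv"),
    #               verify=False)
```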
For #2, I don't know if this works for our instance because in the past I've seen it send to only a single indexer instead of load balancing. Are there any other issues ingesting large amounts of data with the API?
Any other possibilities here?
@thisissplunk there are multiple options.
1) Route and Filter Data using transforms.conf:
Contrary to your assumption, Splunk can monitor a single folder and apply transforms defined on your sourcetype to re-route data to a specific index with a new sourcetype. Refer to the following documentation:
http://docs.splunk.com/Documentation/Splunk/latest/Data/Advancedsourcetypeoverrides
http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad
PS: You would definitely need to test re-routing in a non-prod environment first. This should be quite easy for Splunk admins to implement.
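As a rough illustration of that routing approach (stanza names, the sourcetype, and the target index are made-up examples; the `DEST_KEY = _MetaData:Index` mechanism is the one described in the routing docs linked above):

```
# props.conf -- applied to the generic sourcetype of the drop folder
[dropfolder_generic]
TRANSFORMS-route = route_to_case_index

# transforms.conf -- rewrite the index key so matching events land elsewhere
[route_to_case_index]
REGEX = .
DEST_KEY = _MetaData:Index
FORMAT = case42
```

Note that `FORMAT` is a fixed value per transform, so fully dynamic per-data-set indexes would still need a transform (or a scripted config update) per index.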
2) Scripted Input to Splunk
Refer to the following resources to set up a scripted input to Splunk.
https://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/ScriptSetup
https://sublimerobots.com/2017/01/simple-splunk-scripted-input-example/
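A Splunk scripted input is, at its core, just a program that writes events to stdout, which Splunk then indexes on the interval configured in inputs.conf. A minimal sketch (the CSV layout and key=value event format here are illustrative assumptions):

```python
# Minimal scripted-input sketch: read CSV text and emit one key="value"
# event line per row to stdout for Splunk to index. Sample data is made up.
import csv
import io
import sys

def rows_to_events(csv_text: str) -> list:
    """Turn each CSV row into a single key="value" event line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [" ".join('{}="{}"'.format(k, v) for k, v in row.items())
            for row in reader]

if __name__ == "__main__":
    sample = "user,action\nalice,login\nbob,logout\n"
    for event in rows_to_events(sample):
        sys.stdout.write(event + "\n")
```

The script would then be referenced from an inputs.conf `[script://...]` stanza, where the `index` and `sourcetype` settings can point at the target index.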
3) HTTP Event Collector
The HTTP Event Collector can be configured so that your application(s) send data directly to Splunk without first writing a log file or creating a scripted input.
http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC
http://dev.splunk.com/view/event-collector/SP-CAAAE6M
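For the one-index-per-data-set requirement, HEC is convenient because each event's JSON envelope can carry its own `index` and `sourcetype`. A hedged sketch (the host, port, and token are placeholders; the envelope fields follow the HEC event format in the docs above):

```python
# Sketch of option 3: build an HEC event payload with per-event index and
# sourcetype. Host and token in the commented POST are placeholders.
import json

def hec_payload(event, index, sourcetype):
    """Build the JSON body for POST /services/collector/event."""
    return json.dumps({
        "event": event,            # the actual data
        "index": index,            # per-event index override
        "sourcetype": sourcetype,
    })

if __name__ == "__main__":
    body = hec_payload({"user": "alice", "action": "login"},
                       "case42", "case42_json")
    print(body)
    # With a live HEC endpoint (placeholder host and token):
    # requests.post("https://splunk.example.com:8088/services/collector/event",
    #               headers={"Authorization": "Splunk <hec-token>"}, data=body)
```

Note that the target index must exist and be listed among the indexes the HEC token is allowed to write to.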
| makeresults | eval message= "Happy Splunking!!!"
Btw, I don't see the REST API listed here. Why is that? Is the HTTP Event Collector a better option? Can you even use the REST API to send data to multiple indexers?
@thisissplunk, sorry I missed the REST API, but that can be implemented as well: https://www.splunk.com/blog/2016/05/11/splunking-continuous-rest-data.html
I would recommend the HTTP Event Collector; however, you might have to review the implementation requirements as well.
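On the multiple-indexers question: a plain REST POST goes to whichever single instance the URL targets, so load balancing across indexers would have to happen on your side (or via a forwarder/HEC behind a load balancer). A hedged sketch of raw-event submission via the REST `receivers/simple` endpoint, which accepts raw event text in the body and index/sourcetype as query parameters (host, credentials, and names are placeholders):

```python
# Sketch of raw-event submission over the REST API. The endpoint URL is
# built here; the actual POST is commented out (needs a live instance).
from urllib.parse import urlencode

def receivers_simple_url(base, index, sourcetype):
    """Build the URL for POST /services/receivers/simple."""
    qs = urlencode({"index": index, "sourcetype": sourcetype})
    return "{}/services/receivers/simple?{}".format(base, qs)

if __name__ == "__main__":
    url = receivers_simple_url("https://localhost:8089", "case42", "case42_raw")
    print(url)
    # requests.post(url, auth=("admin", "changeme"),
    #               data="2024-01-01 user=alice action=login", verify=False)
```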
| makeresults | eval message= "Happy Splunking!!!"
Thank you, I'll review soon.
