Does useACK=true in inputs.conf [batch://] stanza ensure that the file will be indexed BEFORE being deleted?

Contributor

I have an application which writes .json files into a directory. I would like to monitor the directory and forward all files to the indexers. The files are written once and never updated, so I don't need to monitor them for changes; I just need to make sure that any new files added to the directory are forwarded for indexing. File sizes vary from about 20 KB to several MB.

I know that I can use the [monitor://] input to do this, but it will not clean up the files. I see that the [batch://] input will clean up the files, but I'm unclear whether batch also monitors the directory for new files. If it does, will adding useACK=true to the stanza guarantee that the file will not be deleted until the ACK is received from the indexer? My stanza in inputs.conf would look like this (I think):

[batch:///ingest/data]
index=foo
sourcetype=foo_json
useACK=true
move_policy=sinkhole


SplunkTrust

Hello lyndac,

Right, to answer:

useACK (indexer acknowledgment)

Check this:
http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Protectagainstlossofin-flightdata

Acknowledgment is enabled on the forwarder's output, not on a specific file monitor input, so useACK belongs in outputs.conf, not inputs.conf.
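As a rough sketch of where the setting goes, an outputs.conf on the forwarder could look like the following. The group name my_indexers and the server addresses are placeholders I made up for illustration, not values from this thread:

```ini
# outputs.conf on the forwarder
# (group name and server list are hypothetical placeholders)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# Ask the indexers to acknowledge data sent through this output group
useACK = true
```

The forwarder needs a restart after the change for it to take effect.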

batch mode and recursive scan

Batch mode picks up files the same way the standard monitor does, so yes, it watches for new files and can do so recursively. (It will not delete directories, though, only the files it has read.)
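For the directory from the question, a batch stanza without useACK might look like the sketch below. The whitelist line is an optional addition of mine, assuming only .json files should be picked up:

```ini
# inputs.conf on the forwarder
[batch:///ingest/data]
# Required for batch inputs: the file is deleted after it has been read
move_policy = sinkhole
index = foo
sourcetype = foo_json
# Optional (my assumption): only pick up files ending in .json
whitelist = \.json$
```

Since batch deletes whatever it reads, it is worth testing against a copy of the data first.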

File deletion

As far as I know, the Splunk instance deletes the file once it has been read entirely into the queues or forwarded. Deletion is not directly tied to whether acknowledgment is enabled.

Using acknowledgment then ensures that each piece of data is successfully received and indexed by the remote indexers, a bit like TCP versus UDP at the transport layer.

Guilhem


Contributor

So when I set useACK=true, Splunk will keep a copy of the data in its wait queue when it sends it. It will then delete the original file (because move_policy=sinkhole).

Then, when it receives the ACK from the indexer, it deletes the copy from the wait queue and life goes on. If it doesn't receive the ACK, will it try to resend the copy from the wait queue?

Is that correct? I just want to make sure I'm understanding the flow...


Contributor

Hi @lyndac,

I am in exactly the same situation. Were you able to get this working with the batch input? I want to make sure each file is completely indexed before it is deleted. New files keep being created in the directory as the application handles requests, so they need to be monitored, forwarded, and deleted.

Thanks
