Getting Data In

How to query a universal forwarder about the transfer status of monitored files?

cwacha
Path Finder

We monitor many files with the UF using [monitor] stanzas. For housekeeping reasons we need to delete the files every now and then. Deleting a file that has not yet been fully transferred to the indexer would lead to data loss.

How can we find out which files have already been transferred by the UF?

Of course, the UF itself does not know whether a file is complete or closed. It only sees that no more data has been added for some time. So what we need is a list that shows each monitored filename together with the file offset up to which data has already been transferred.

Example output:

/var/adm/messages 1233122
/var/adm/authlog 1233

Based on that, our cleanup mechanism could compare the actual size of a closed file with the size reported in the output above. If they are equal, it knows that the UF has transferred everything and we can delete the file.
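To make the idea concrete, here is a minimal sketch of that comparison in Python. It assumes the hypothetical two-column "path offset" listing from the example above (this is the format wished for in the question, not actual UF output), and the function name `fully_transferred` is made up for illustration.

```python
import os


def fully_transferred(report_lines):
    """Return the paths whose reported offset equals the current on-disk size.

    Each input line is assumed to look like "<path> <offset>", as in the
    hypothetical example output above.
    """
    done = []
    for line in report_lines:
        # Split from the right so that paths containing spaces still parse.
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        path, offset_text = parts
        try:
            offset = int(offset_text)
            size = os.path.getsize(path)
        except (ValueError, OSError):
            continue  # non-numeric offset, or file missing/unreadable
        if offset == size:
            done.append(path)
    return done
```

A cleanup job would only delete the paths this returns, and only for files it already knows are closed, since a file that is still being written can briefly match and then grow again.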

An alternative would be an "auto-close" function in the UF. This function would trigger the execution of a shell command (specified in inputs.conf) as soon as the file's size has not changed for "auto-close-seconds" (also specified in inputs.conf).
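The UF feature described here is only a proposal, but the idle check itself can be approximated outside the forwarder. The sketch below is an assumption-laden illustration: it uses the file's modification time as a stand-in for "size has not changed", and the names `is_idle` and `run_if_idle` are hypothetical.

```python
import os
import subprocess
import time


def is_idle(path, idle_seconds, now=None):
    """Return True if the file was last modified at least idle_seconds ago.

    Modification time is used as an approximation of "size unchanged".
    """
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime >= idle_seconds


def run_if_idle(path, idle_seconds, command):
    """Run command (an argv list) with path appended if the file is idle.

    Returns True if the command was run, False otherwise.
    """
    if not is_idle(path, idle_seconds):
        return False
    subprocess.run(command + [path], check=True)
    return True
```

Idleness alone does not show that the forwarder has finished reading the file, so a check like this would have to be combined with a transfer-status comparison before anything is deleted.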


ahattrell_splun
Splunk Employee

The following command is very useful:

./splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus

It lists all monitored files (including those rejected for whatever reason), their size, and how far Splunk has read through each file so far.
