Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

polfer
Explorer

I'm using Splunk 6.3.2 with a simple monitor stanza in inputs.conf that watches all the *.txt files in a particular directory.

The files in this directory are loaded using a cron job that runs every 10 minutes and uses LFTP to do a mirror from the remote server's log directory. The most recent log file grows throughout the current hour, and may be pulled several times before it stabilizes and the remote server moves to a new log file.

This sets things up for a failure, because it appears that LFTP truncates the current hour's file before repulling (it doesn't seem to pull to temp and then move to destination). We're getting duplicate log lines and re-indexing along with a WatchedFile warning that states:

"File too small to check seekcrc, probably truncated.  Will re-read entire file" followed by "Will begin reading at offset=0".

I'm hoping somebody knows of a quick trick here in either Splunk (a way to delay before doing the CRC check) or LFTP (some way to force a smarter mirror, or to use tmp/mv instead of truncate) that will spare me a more custom solution. The closest thing I've seen in Answers is folks pulling to another directory and then moving the files. I can certainly do that, but it complicates things a bit given the desire to do a mirror (I may just have to burn 2x storage and keep two copies).
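
For reference, the pull-then-move workaround described above could look roughly like the sketch below. It assumes the mirror lands in a staging directory and that Splunk monitors a separate directory on the same host. The function name publish_dir, the .partial suffix, and both directory arguments are illustrative and do not come from the thread. Each changed file is copied under a hidden temporary name that does not end in .txt and is then renamed into place, so the monitored file is swapped in whole rather than truncated and rebuilt.

```shell
#!/bin/sh
# publish_dir STAGING_DIR WATCHED_DIR
# Hypothetical helper: copy changed *.txt files from the staging mirror into
# the directory Splunk monitors, using copy-to-temp followed by rename.
publish_dir() {
    staging=$1
    watched=$2
    for src in "$staging"/*.txt; do
        [ -f "$src" ] || continue            # glob matched nothing
        name=$(basename "$src")
        dest="$watched/$name"
        # Leave files alone when their content is unchanged since the last run.
        if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
            continue
        fi
        tmp="$watched/.$name.partial"        # hidden, and not matched by *.txt
        # The rename stays inside one directory, so the old file is replaced
        # in a single step instead of being emptied and refilled.
        cp -p "$src" "$tmp" && mv -f "$tmp" "$dest"
    done
}
```

A cron job would run the mirror into the staging directory first and then call something like publish_dir /data/staging /data/watched (both paths are placeholders). The cost is the second copy of the data noted above.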

Looking forward to some been-there-done-that experience that resulted in a clean/efficient solution. Thank you all!

1 Solution

jtacy
Builder

Perhaps this LFTP option will do what you need:

xfer:use-temp-file (boolean)
when true, a file will be transferred to a temporary file in the same directory and then renamed.

Source: http://lftp.yar.ru/lftp-man.html

To use it, you would use this lftp command before starting the transfer:

set xfer:use-temp-file on
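
As a rough illustration of where that setting could sit in the 10-minute cron job, here is a minimal sketch. The host, the remote and local paths, the lftp_commands function name, and the RUN_MIRROR switch are all placeholders rather than details from the thread, and credentials are left out (they would need to come from something like ~/.netrc or an lftp bookmark). The only point is that the set command is issued before mirror.

```shell
#!/bin/sh
# Placeholder values, not taken from the thread.
REMOTE_URL="ftp://logs.example.com"
REMOTE_DIR="/logs"
LOCAL_DIR="/data/ftp-logs"

# Print the lftp commands for one mirror run. The "set" line comes first so
# the setting is already active when "mirror" starts transferring files.
lftp_commands() {
    printf 'set xfer:use-temp-file on\n'
    printf 'open %s\n' "$REMOTE_URL"
    printf 'mirror %s %s\n' "$REMOTE_DIR" "$LOCAL_DIR"
    printf 'quit\n'
}

# Only contact the server when explicitly asked, so the command list can be
# inspected without a network connection.
if [ "${RUN_MIRROR:-0}" = "1" ]; then
    lftp_commands | lftp
fi
```

Running the script with RUN_MIRROR=1 from cron would perform the mirror; without it, calling lftp_commands simply prints what would be sent to lftp.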

Good luck!

polfer
Explorer

I'm testing this answer now - have a local build of LFTP to test this out (Ubuntu 14.04 LTS has an older version by default). At first blush it seems that mirror still starts by "Removing old file" and then retransferring. Regardless, this was precisely the option I was originally looking for, and I really appreciate the tip. Will accept the answer as soon as I can verify it eliminates the duplicates. Thank you!

polfer
Explorer

Looks good. Can see the temp file getting staged and see it replacing the original once complete. Without the flag, can see it removing the old file and watch it rebuild in place. Very promising! Thank you!

jplumsdaine22
Influencer

Just in case you didn't already think of it, any reason why you can't put a universal forwarder on the remote server?

polfer
Explorer

Awesome suggestion, but these are, unfortunately, logs from a 3rd party service. I cannot change the remote access protocol (stuck with ye olde FTP), the filenames, or any of the software on the machine itself. Sorry.
