
Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Explorer

I'm using Splunk 6.3.2 with a simple monitor stanza in inputs.conf that watches all the *.txt files in a particular directory.

The files in this directory are loaded using a cron job that runs every 10 minutes and uses LFTP to do a mirror from the remote server's log directory. The most recent log file grows throughout the current hour, and may be pulled several times before it stabilizes and the remote server moves to a new log file.

This sets things up for failure, because LFTP appears to truncate the current hour's file before re-pulling it (it doesn't seem to pull to a temp file and then move it to the destination). We're getting duplicate log lines and re-indexing, along with a WatchedFile warning that states:

"File too small to check seekcrc, probably truncated.  Will re-read entire file" followed by "Will begin reading at offset=0".

I'm hoping somebody knows of a quick trick in either Splunk (a way to delay before doing the CRC check) or LFTP (some way to force a smarter mirror, or to use tmp/mv instead of truncate) that would avoid the need for a more custom solution. The closest thing I've seen in Answers is folks pulling to another directory and then moving the files. I can certainly do that, but it complicates things given that I want to keep using mirror (I may just have to burn 2x storage and keep two copies).
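
For reference, a minimal sketch of that pull-elsewhere-then-move workaround. It assumes the mirror lands in a separate staging directory, and that a small shell function then publishes changed files into the monitored directory through a temp name and a rename, so the monitored copy is never truncated in place. The function name publish_staged, the directory arguments, and the .tmp naming are all hypothetical; none of them come from Splunk or LFTP.

```shell
#!/bin/sh
# publish_staged STAGING_DIR MONITORED_DIR   (hypothetical helper)
# Copies each changed *.txt file from the staging mirror into the monitored
# directory under a hidden temp name, then renames it over the old copy.
publish_staged() {
    staging=$1
    monitored=$2
    for src in "$staging"/*.txt; do
        [ -f "$src" ] || continue            # glob matched nothing
        name=$(basename "$src")
        # Skip files whose monitored copy is already identical.
        cmp -s "$src" "$monitored/$name" 2>/dev/null && continue
        tmp="$monitored/.$name.tmp"          # assumed naming; does not match *.txt
        # A rename within one directory replaces the file in a single step.
        cp -p "$src" "$tmp" && mv -f "$tmp" "$monitored/$name"
    done
}
```

It would be called after each mirror run, for example publish_staged /data/ftp_staging /data/ftp_mirror (both paths are placeholders). This keeps two copies on disk, which is the 2x storage cost mentioned above.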

Looking forward to some been-there-done-that experience that resulted in a clean/efficient solution. Thank you all!

Re: Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Influencer

Just in case you didn't already think of it, any reason why you can't put a universal forwarder on the remote server?

Re: Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Explorer

Awesome suggestion, but these are, unfortunately, logs from a 3rd party service. I cannot change the remote access protocol (stuck with ye olde FTP), the filenames, or any of the software on the machine itself. Sorry.

Re: Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Builder

Perhaps this LFTP option will do what you need:

xfer:use-temp-file (boolean)
when true, a file will be transferred to a temporary file in the same directory and then renamed.

Source: http://lftp.yar.ru/lftp-man.html

To use it, run this lftp command before starting the transfer:

set xfer:use-temp-file on
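
As a rough sketch of how that setting might sit in a cron-driven mirror job: the wrapper below emits the lftp commands with the setting first, and pipes them to lftp only when RUN_MIRROR=1 is set. The host, remote path, local path, the RUN_MIRROR switch, and the lftp_commands function name are all placeholders of mine, and credentials are left out (they could come from ~/.netrc or open -u). An lftp build recent enough to know xfer:use-temp-file is assumed.

```shell
#!/bin/sh
# Hypothetical values; replace with the real server and directories.
REMOTE_URL="ftp://ftp.example.com"
REMOTE_DIR="/logs"
LOCAL_DIR="/data/ftp_mirror"

# Print the lftp command sequence: enable temp-file transfers, then mirror.
lftp_commands() {
    printf 'set xfer:use-temp-file on\n'
    printf 'open %s\n' "$REMOTE_URL"
    printf 'mirror %s %s\n' "$REMOTE_DIR" "$LOCAL_DIR"
    printf 'bye\n'
}

# Contact the server only when explicitly requested (e.g. from the cron job).
if [ "${RUN_MIRROR:-0}" = "1" ]; then
    lftp_commands | lftp
fi
```

Running the script without RUN_MIRROR set does nothing visible; calling lftp_commands on its own prints the commands so they can be checked before the job touches the monitored directory.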

Good luck!

Re: Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Explorer

I'm testing this answer now - have a local build of LFTP to test this out (Ubuntu 14.04 LTS has an older version by default). At first blush it seems that mirror still starts by "Removing old file" and then retransferring. Regardless, this was precisely the option I was originally looking for, and I really appreciate the tip. Will accept the answer as soon as I can verify it eliminates the duplicates. Thank you!

Re: Monitoring directory loaded with LFTP Mirror causes reindexing. How should I solve this?

Explorer

Looks good. I can see the temp file getting staged and then replacing the original once the transfer completes. Without the flag, I can see lftp removing the old file and rebuilding it in place. Very promising! Thank you!
