Getting Data In

Monitoring file is not indexing anymore - WatchedFile - File too small to check seekcrc, probably truncated

Communicator

Hi All,

We started ingesting into Splunk data generated by a custom UNIX script that runs every 5 minutes.
The output file of the script is overwritten on every run, and its size is around 20 KB.

In Splunk we set up a monitor on the output file, and for approximately half a day everything worked properly.
The day after, we saw that nothing was being indexed anymore, so we checked the _internal index and splunkd.log and found the following:

03-02-2018 10:03:35.168 +0000 INFO  TailingProcessor - Parsing configuration stanza: monitor:///test/test_file.fwd.
03-02-2018 10:05:33.995 +0000 INFO  WatchedFile - Will begin reading at offset=0 for file='/test/test_file.fwd'.
03-02-2018 10:10:12.045 +0000 INFO  WatchedFile - File too small to check seekcrc, probably truncated.  Will re-read entire file='/test/test_file.fwd'.
03-02-2018 10:15:11.472 +0000 INFO  WatchedFile - File too small to check seekcrc, probably truncated.  Will re-read entire file='/test/test_file.fwd'.
03-02-2018 10:20:11.968 +0000 INFO  WatchedFile - File too small to check seekcrc, probably truncated.  Will re-read entire file='/test/test_file.fwd'.

Do you know why Splunk is no longer indexing that file?

Thanks a lot,
Edoardo

1 Solution

Communicator

Ciao All,

We realized that the issue was that Splunk could not read the timestamps in the file correctly: sometimes it interpreted them as dd/mm/yyyy and sometimes as mm/dd/yyyy, so it was ingesting the data but placing the events in the past (which is why a search over the last 24 hours showed nothing). For example, data referring to 1st April 2018 was being indexed as 4th January 2018. To fix this, we reviewed the timestamp format in Settings >> Source types to help Splunk recognize the timestamps; here is the reference documentation:

https://docs.splunk.com/Documentation/SplunkCloud/7.0.0/Data/Configuretimestamprecognition#Enhanced_...
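For reference, such a timestamp fix can also be applied directly in props.conf on the indexer or heavy forwarder. A minimal sketch, where the sourcetype name and the exact format string are assumptions (the original events are not shown):

```ini
# props.conf -- sketch; the sourcetype name "custom_unix_script" is an assumption
[custom_unix_script]
# Anchor timestamp extraction at the start of the event
TIME_PREFIX = ^
# Force day-first parsing so 01/04/2018 is read as 1 April, not 4 January
TIME_FORMAT = %d/%m/%Y %H:%M:%S
# Don't scan past the timestamp itself (19 characters for this format)
MAX_TIMESTAMP_LOOKAHEAD = 19
```

An explicit TIME_FORMAT removes the ambiguity entirely, since Splunk no longer has to guess between day-first and month-first dates.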

Hope this can help.

Best Regards,
Edoardo


Path Finder

Awesome, thanks so much for this reply (and for coming back to post the solution after the fix).

I was having this exact same issue with a Splunk UF monitoring some log files (and to debug it I was searching the last 4 hours on my indexer to see when/if I had fixed inputs.conf on the UF; I was also monitoring splunkd.log on the UF, but I couldn't fix it!)

This was the issue: the Splunk UF couldn't properly read the timestamps of the log files it was monitoring, so the files were being sent, but dated in the past!

(So my fix was the same as yours, in that I made a custom sourcetype with timestamp = "current time" on the main Splunk indexer (web GUI), and then in inputs.conf on the UF set the monitor://c:/blah/file.log.* stanza to use that custom sourcetype.)

Thanks again!
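The fix described above can be sketched as two config fragments; the sourcetype name is an assumption, and the monitored path is the one from the post:

```ini
# props.conf on the indexer -- sourcetype name "my_current_time_logs" is an assumption
[my_current_time_logs]
# Skip timestamp extraction and stamp events with the current index time
DATETIME_CONFIG = CURRENT
```

```ini
# inputs.conf on the universal forwarder
[monitor://c:/blah/file.log.*]
sourcetype = my_current_time_logs
```

Note that DATETIME_CONFIG = CURRENT trades timestamp accuracy for reliability: events get the time they were indexed, not the time they occurred.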

Motivator

Thanks. I resolved the issue by adding the config below to props.conf on the indexers (followed by an indexer restart) to parse the timestamp in the log file (dns.log) properly.

DATETIME_CONFIG =
TIME_FORMAT = %d/%m/%Y %I:%M:%S %p
TIME_PREFIX = ^
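In context, these settings would sit under a sourcetype stanza in props.conf; a sketch, with the stanza name assumed (the poster's actual sourcetype for dns.log is not given):

```ini
# props.conf on the indexers -- stanza name "dns_log" is an assumption
[dns_log]
# Empty value clears any inherited DATETIME_CONFIG override
DATETIME_CONFIG =
# 12-hour clock with AM/PM marker, day-first date
TIME_FORMAT = %d/%m/%Y %I:%M:%S %p
# Timestamp sits at the start of each event
TIME_PREFIX = ^
```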

Ultra Champion

-- The output file of the script is overwritten every run, ...
That's not best practice - is it possible to append to the file instead of overwriting it?


Communicator

Ciao,

Thanks for your answer.
Currently we generate a different file on every run, appending a timestamp to the file name.
P.S. We found out what the issue was; see my answer.
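For that one-file-per-run approach, a wildcard monitor stanza covers all the generated files; a sketch, where the file-name pattern and sourcetype are assumptions:

```ini
# inputs.conf -- the timestamped file pattern is an assumption
[monitor:///test/test_file_*.fwd]
sourcetype = custom_unix_script
# Optionally skip files older than a week to limit re-scanning
ignoreOlderThan = 7d
```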

Best Regards,
Edoardo


Contributor

The seekcrc check needs only 256 bytes by default, and your file is around 20 KB, so that should be fine.

I recommend using crcSalt = <SOURCE> in inputs.conf.
Be careful not to ingest your file twice.

If it does not work, please describe how you write to the file and how you do log rotation.
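A sketch of the suggested inputs.conf change, using the monitored path from the original question:

```ini
# inputs.conf -- salts the CRC with the full source path, so small or
# truncated files with identical beginnings are still told apart.
# Caution: renaming or rotating a file changes its salted CRC, which can
# cause the same content to be ingested twice.
[monitor:///test/test_file.fwd]
crcSalt = <SOURCE>
```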


Explorer

I just ran into this issue and solved it by adding crcSalt = <SOURCE> to inputs.conf on the forwarder server. Thank you very much.
