I currently write status files and collect them into Splunk using monitor statements and Universal Forwarders. Pretty vanilla stuff. We are now looking to rewrite our status scripts, and I've been asked whether we can simply rewrite the status information into the same file every 3 minutes and save it. The file size would not change (much); just the contents would change. My first reaction was a faint yes. I thought the forwarder would notice a CRC change and reindex from the beginning. This doesn't seem to be happening. If I restart the forwarder, it throws an error and reindexes, and once in a while it seems to catch a change, but not dependably. I assumed placing a timestamp at the head of the status information would ensure a CRC change, but sadly we are not getting the desired results.
Ignoring the advisability of this method, should it work? Is there a timeout or polling interval that applies here, after which the forwarder will recheck the CRC?
Log from the forwarder on restart:
2/19/15
8:29:37.599 AM
02-19-2015 09:29:37.599 -0600 INFO WatchedFile - Checksum for seekptr didn't match, will re-read entire file='D:\SplunkData\Test.txt
Here are the contents of my test file. The only required change every 3 minutes is the timestamp; the rest of the text could be identical.
[02/19/15 10:25:21 -0600] Node(xxxxxxxx) DataCenter(YYYY) Volume(/aaaa/bbbb) Size(32768000) Time(0.13843) Throughput(236.712) NewField(two)
Option 1:
Put a timestamp in the filename:
filename-01-19-2015-13-03-00.txt
Be sure to script a cleanup of the directory every week or so.
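As a rough illustration of Option 1, here is a minimal Python sketch. The function names `write_status` and `clean_up` are made up for this example, the `filename-` prefix and date layout simply copy the sample filename above, and the one-week retention is an assumption based on "every week or so". Adjust all of these to your environment.

```python
import os
import time
from datetime import datetime

def write_status(directory, status_line, now=None):
    """Write one status line to a new file whose name carries a timestamp."""
    now = now or datetime.now()
    # Same layout as the sample name: filename-MM-DD-YYYY-HH-MM-SS.txt
    name = now.strftime("filename-%m-%d-%Y-%H-%M-%S.txt")
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(status_line + "\n")
    return path

def clean_up(directory, max_age_seconds=7 * 24 * 3600, now=None):
    """Delete status files older than max_age_seconds; return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(directory):
        # Only touch files that follow the naming scheme used by write_status.
        if not (name.startswith("filename-") and name.endswith(".txt")):
            continue
        path = os.path.join(directory, name)
        if now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(path)
    return removed
```

The status script would call `write_status` every 3 minutes, and a separate scheduled job would call `clean_up` on the monitored directory. Because each run produces a brand-new file, the monitor input sees a new file rather than a changed one.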
Option 2:
Use a scripted input. Have the script write the file's contents to stdout so that Splunk indexes them, then delete the file from the file system.
Scripted inputs are covered here:
http://docs.splunk.com/Documentation/Splunk/6.2.1/AdvancedDev/ScriptedInputsIntro
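For Option 2, a scripted input could look roughly like the sketch below. It assumes the status file lives at the path shown in the forwarder log above, and the function name `emit_and_delete` is hypothetical. Splunk indexes whatever the script prints to stdout, so the script only has to print the file and remove it.

```python
import os
import sys

# Assumed location of the status file (taken from the log line in the question).
STATUS_FILE = r"D:\SplunkData\Test.txt"

def emit_and_delete(path, out=sys.stdout):
    """Copy the file's contents to `out`, then delete the file.

    Returns the number of characters written, or 0 if the file is absent.
    """
    if not os.path.exists(path):
        # Nothing new since the last run, so emit nothing.
        return 0
    with open(path, "r") as handle:
        data = handle.read()
    out.write(data)
    out.flush()
    # Remove the file only after its contents have been handed to Splunk.
    os.remove(path)
    return len(data)

if __name__ == "__main__":
    emit_and_delete(STATUS_FILE)
```

The script would be registered in inputs.conf under a `[script://<path to script>]` stanza with `interval` set to match the 3-minute update cycle (180 seconds). If the status writer can be running at the same moment, having it write to a temporary name and rename the file into place should keep this script from reading a half-written file.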