Version 4.0.11
I have a number of .CSV files in my log folder on a light forwarder. Unfortunately at least one of them has not been indexed/forwarded since September 1st. Now that I am about to do a major demo I discover this fact.
How do I get Splunk to forward/index this file? It changes every morning. There is no header in the file. Today's file looks like this:
AE-327RA-MIB-000.pdf A1-L18AC-747-200.pdf AE-325RA-MIB-000.pdf A1-E6AAB-MRC-100.pdf A1-AM2BB-SRM-200.pdf 17-600-220-6-2.pdf 17-15R-1.pdf 17-600-220-6-1.pdf 16-35ON655-1.pdf 01-75GAJ-23FI-20-1.pdf 13-1-6-3-1.pdf 01-75GAJ-6.pdf 01-75GAJ-4-79-1.pdf
This list of files is passed into a form where the user can select a file and see when it was downloaded. Since the file is not being indexed, the user has nothing to select from.
By default, Splunk is going to look at the CRC values of the beginning and end of the file. If your file does not change much, this may not be enough.
Assuming the filename is always the same, the simplest solution is to look solely at the timestamp instead. In props.conf:
[source::/path/to/yoursourcefile]
CHECK_METHOD = modtime
This will cause Splunk to re-index the file every time the modification timestamp changes, rather than looking at the content.
If the filename changes, the crcSalt option in inputs.conf is another possibility.
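For the changing-filename case, a minimal sketch of the crcSalt approach in inputs.conf might look like the following. The monitored path is a placeholder, not taken from the original setup. Setting crcSalt to the literal string <SOURCE> mixes the full source path into the CRC, so a file with a new name is treated as new even if its first bytes match a file Splunk has already seen.

```
# inputs.conf on the forwarder (path is a hypothetical example)
[monitor:///path/to/logs/*.csv]
# <SOURCE> is typed literally; Splunk substitutes the file's full path
crcSalt = <SOURCE>
```

Note that with this setting a renamed or rotated file will be indexed again under its new name, so it suits files whose names are unique per day rather than logs that get rolled.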
You can also look at this link if you want to dig deeper into the root cause - there are some debugging options that can be turned on to provide more information:
http://www.splunk.com/wiki/Community:Troubleshooting_Monitor_Inputs
It still didn't work for all the files. In checking the software that creates the files, I found a bug that prevented all but one file from being updated. I patched the software and now it works just fine. I still like setting CHECK_METHOD = modtime just to ensure the files will be indexed.
Thanks. That was just what I needed!