Getting Data In

File has not been indexed since 9/1/2010

Builder

Version 4.0.11

I have a number of .CSV files in my log folder on a light forwarder. Unfortunately, at least one of them has not been indexed or forwarded since September 1st, and I only discovered this now that I am about to do a major demo.

How do I get Splunk to forward and index this file? It changes every morning. There is no header on the file. Today's file looks like this:

AE-327RA-MIB-000.pdf
A1-L18AC-747-200.pdf
AE-325RA-MIB-000.pdf
A1-E6AAB-MRC-100.pdf
A1-AM2BB-SRM-200.pdf
17-600-220-6-2.pdf
17-15R-1.pdf
17-600-220-6-1.pdf
16-35ON655-1.pdf
01-75GAJ-23FI-20-1.pdf
13-1-6-3-1.pdf
01-75GAJ-6.pdf
01-75GAJ-4-79-1.pdf

This list of files is passed into a form where the user can select a file and see when it was downloaded. Since the list is not being indexed, the user has nothing to select from.

Motivator

By default, Splunk decides whether it has already seen a file by looking at CRC values computed over the beginning and end of the file. If your file does not change much from day to day, this may not be enough for Splunk to recognize it as new.

Assuming the filename is always the same, the simplest solution is to look solely at the timestamp instead. In props.conf:

[source::/path/to/yoursourcefile]
CHECK_METHOD = modtime

This will cause Splunk to re-index the file every time the modification timestamp changes, rather than looking at the content.

If the filename changes, the crcSalt option in inputs.conf is another possibility.
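
As a rough sketch of the crcSalt approach, assuming the files are picked up by a monitor input on the forwarder and live under a hypothetical /path/to/logfolder, the stanza in inputs.conf might look something like this:

# Hypothetical path; replace with the folder you actually monitor.
[monitor:///path/to/logfolder/*.csv]
crcSalt = <SOURCE>

The literal string <SOURCE> tells Splunk to mix the full source path into the CRC, so a file with a new name is treated as new even if its contents match a file that was already seen. Keep in mind that with this setting a renamed or rotated file can end up being indexed again.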

You can also look at this link if you want to dig deeper into the root cause - there are some debugging options that can be turned on to provide more information:
     http://www.splunk.com/wiki/Community:Troubleshooting_Monitor_Inputs

Builder

It still didn't work for all the files. While checking the software that creates the files, I found a bug that prevented all but one file from being updated. I patched the software, and now it works just fine. I still like setting CHECK_METHOD = modtime just to ensure the files will be indexed.

Builder

Thanks. That was just what I needed!
