Hi all,
I have a batch job that monitors my infrastructure health (basically running "resource cluster" to check resource statuses). The job runs every 10 minutes.
When there are no issues with my infrastructure, the output is identical from run to run, so Splunk does not index the file again.
I have tried adding the following lines to props.conf (on the server where my forwarder is installed) under $SPLUNK_HOME\etc\system\local:
[source::D:\Program Files\Splunk.....\scripts\text.txt]
CHECK_METHOD = entire_md5
My $SPLUNK_HOME\etc\apps\cluster\local\inputs.conf has a
[monitor://D:\Program Files\Splunk.....\scripts\text.txt]
stanza with the correct settings, such as sourcetype.
Splunk monitors the file, but I can't get it to index the file again when the contents are identical and only the timestamp differs.
Am I missing something anywhere?
Add the following to the monitor stanza in $SPLUNK_HOME\etc\apps\cluster\local\inputs.conf:
crcSalt = <SOURCE>
Normally, Splunk computes a checksum (CRC) over the beginning of the file (the first 256 bytes by default) to determine whether it has already indexed the file. As you noticed, Splunk does not index a file again if that checksum matches one it has already seen. The crcSalt attribute above adds the full path of the source file to this equation, so if two files have the same initial contents but different names, Splunk will still index both.
Now, your script needs to generate a unique file name for each run. This is easily done by adding the timestamp to the file name in the script. The monitor stanza then has to match the new names, for example by monitoring the directory or using a wildcard. If you want the data to be indexed with the same source name, you can set source=text.txt in inputs.conf to override the default source.
AFAIK, there is no way for Splunk to use the file's timestamp to discriminate between files on input.
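For illustration, the resulting monitor stanza might look something like the sketch below. The wildcard pattern text_*.txt and the sourcetype value are placeholders I made up, and the path is abbreviated the same way as in the question; crcSalt and source are the settings described above.

```ini
# inputs.conf (sketch; the file pattern and sourcetype are hypothetical placeholders)
[monitor://D:\Program Files\Splunk.....\scripts\text_*.txt]
sourcetype = <your_sourcetype>
# Mix the full source path into the CRC so files with identical
# beginnings but different names are treated as distinct files
crcSalt = <SOURCE>
# Report every timestamped file under one source name
source = text.txt
```

With this approach a new file is created on every run, so the script (or a separate cleanup job) would presumably also need to remove old files.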
Hi Iguinn,
Yes, thanks, your method works.
What I did was the following, according to what you have mentioned:
However, from the props.conf documentation:
CHECK_METHOD = [endpoint_md5|entire_md5|modtime]
* Set CHECK_METHOD = endpoint_md5 to have Splunk checksum of the first and last 256 bytes of a
file. When it finds matches, Splunk lists the file as already indexed and indexes only new
data, or ignores it if there is no new data.
* Set CHECK_METHOD = entire_md5 to use the checksum of the entire file.
* Set CHECK_METHOD = modtime to check only the modification time of the file.
Shouldn't CHECK_METHOD = modtime suffice in my case, instead of adding crcSalt and putting a timestamp in the file name?
Regards,
Leon
I haven't ever tried that option, but yes, I think that CHECK_METHOD = modtime should work in your case.
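A sketch of what that would look like, reusing the props.conf stanza from the question with only the value changed. This is untested, as noted above, and the path is abbreviated exactly as in the original post.

```ini
# props.conf on the forwarder (untested sketch)
[source::D:\Program Files\Splunk.....\scripts\text.txt]
# Decide whether the file has changed by its modification time only,
# rather than by a checksum of its contents
CHECK_METHOD = modtime
```

If this works, the file name can stay as text.txt and neither crcSalt nor timestamped file names should be needed. The forwarder will most likely need a restart to pick up the change.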