Hello, I'm trying to resolve an issue with monitoring the .csv files in a specific directory. There are several files, each prefixed with a different date, e.g. 2023-11-16_filename.csv or 2023-11-20_filename.csv.
No two files share the same date prefix. I'm able to sync most of the files to the server, but not all of them. For example, my indexing started on 02.10.23, and all files dated on or after that day are available as a source, but files dated before it, e.g. 2023-09-15_filename.csv, are not.
What could cause this behaviour, and is there a way to make files available as a source even if they are dated before 02.10.2023? Thanks
Hello @Stives,
What does your input stanza look like?
If no crcSalt is specified in the stanza, Splunk looks at the first bytes of a file (256, I think) and decides based on those whether it already knows the file.
If the first bytes of the CSV files are always the same, you could change your input stanza and add
crcSalt = <SOURCE>
The docs for the monitor stanza, for a deeper look into crcSalt:
https://docs.splunk.com/Documentation/Splunk/9.1.2/Admin/Inputsconf#MONITOR:
But be cautious: this tells Splunk to include the full path when determining whether a file has already been indexed, so there is a possibility that you index the same file twice, especially in directories with rolling log files.
Another possibility is that the dates are outside the retention period, i.e. the files were indexed once but the data was removed again because of the retention time once its bucket was no longer hot.
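To illustrate the check described above, here is a minimal Python sketch. It is not Splunk's actual algorithm: zlib.crc32 only stands in for Splunk's internal checksum, and the header, rows and paths are made up for the example. It just shows why files with an identical header longer than 256 bytes look the same, and how a longer initial length or a per-path salt tells them apart.

```python
import zlib


def initial_crc(data: bytes, init_len: int = 256, salt: str = "") -> int:
    """Checksum over an optional salt plus the first init_len bytes.

    Stand-in for illustration only; Splunk's real hashing is not zlib.crc32.
    """
    return zlib.crc32(salt.encode("utf-8") + data[:init_len])


# Hypothetical files: identical 324-byte header, different first data row.
header = b"timestamp,machine,count\n" + b"#" * 300
file_a = header + b"2023-09-15,m1,10\n"
file_b = header + b"2023-09-16,m1,10\n"

# Default length: only the shared header is hashed, so the files look identical.
same_with_default = initial_crc(file_a) == initial_crc(file_b)

# Longer length (like initCrcLength = 1024): the differing rows are included.
same_with_1024 = initial_crc(file_a, init_len=1024) == initial_crc(file_b, init_len=1024)

# Path used as salt (like crcSalt = <SOURCE>): the differing paths are included.
same_with_salt = (
    initial_crc(file_a, salt="/data/2023-09-15_filename.csv")
    == initial_crc(file_b, salt="/data/2023-09-16_filename.csv")
)

print(same_with_default, same_with_1024, same_with_salt)
```

In this sketch the first comparison matches and the other two do not, which mirrors why either option can make Splunk treat such files as distinct.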
crcSalt is actually very rarely the proper option to set. It's often better to raise the initCrcLength to a higher value in case the file has a pretty constant header.
Hello, I see. You mean something like this?
initCrcLength = <256>
Close, but without the <> part (only the <SOURCE> value of crcSalt must be written literally like that, if you use that option). And you'd typically want a higher value if you have a constant header.
Something like
initCrcLength = 1024
for example.
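One way to pick a value is to measure how many leading bytes your files actually share and set initCrcLength above that. Here is a minimal Python sketch, assuming the files are readable on the machine where you run it. The function names and the 256-byte margin are my own choices for illustration, and the 256 to 1048576 range is what I recall from inputs.conf.spec, so please check it for your version.

```python
def shared_prefix_length(paths, limit=1048576):
    """Return how many leading bytes are identical across all files (capped at limit)."""
    heads = []
    for path in paths:
        with open(path, "rb") as handle:
            heads.append(handle.read(limit))
    if not heads:
        return 0
    shortest = min(len(head) for head in heads)
    first = heads[0]
    for i in range(shortest):
        # Stop at the first byte position where any file differs from the first one.
        if any(head[i] != first[i] for head in heads):
            return i
    return shortest


def suggested_init_crc_length(prefix_len, margin=256, lowest=256, highest=1048576):
    """Shared prefix plus a margin, clamped to the assumed allowed range."""
    return max(lowest, min(prefix_len + margin, highest))
```

You would call shared_prefix_length with the list of your CSV paths (for example from glob.glob) and pass the result to suggested_init_crc_length. What matters is the length of the common header, not the total size of the files.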
One more thing. I checked, and one of the files is larger than 124 000 bytes. What value should I define for initCrcLength?
Hi
Are you sure that you haven't set this?
ignoreOlderThan
Can you post your inputs.conf for this source, so we can check if there is something else which can cause this behaviour?
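Since ignoreOlderThan works on the file's modification time rather than the date in its name, it can help to see which files would fall outside a given window. Here is a minimal Python sketch; the function name and the day-based threshold are hypothetical choices for illustration, not anything Splunk provides.

```python
import time
from pathlib import Path


def files_older_than(directory, pattern, max_age_days, now=None):
    """Return matching files whose modification time is more than max_age_days before now."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        path for path in Path(directory).glob(pattern)
        if path.stat().st_mtime < cutoff
    )
```

For example, files_older_than(your_directory, "*production_statistics.csv", 7) lists the files that a 7-day ignoreOlderThan setting would likely skip.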
r. Ismo
Hello Ismo,
inputs.conf definition looks like this:
[monitor:///home/sicpa_operator/deploy/PROD/machine/monitoring/*production_statistics.csv]
index = sts
disabled = false
sourcetype = STSLOGMPPS
crcSalt = <SOURCE>
With *production_statistics.csv I make sure that all the files are matched; they differ only in the date at the beginning of each file name. It seems I'm only able to sync files from the deployment date onward: files from the date the UF was deployed are synced, but everything before is not.
BR
Thanks.
What about the output of
splunk list inputstatus
as @PickleRick asked? That command shows which files it has read and how far it has got in each.
Also you could try
splunk btool inputs list monitor:///home/sicpa_operator/deploy/PROD/machine/monitoring/ --debug
to see whether some weird defaults are defined somewhere for your inputs.
Thanks for the reply. I've tried the option:
initCrcLength = 1024
But still not all of the files have been synced; some are still pending.
You can check the status of your inputs with
splunk list monitor
and
splunk list inputstatus
Hello, thanks for the reply.
crcSalt = <SOURCE>
I've added crcSalt to my stanza, but still not all of the files have been synced.