Monitoring of files

Stives
Explorer

Hello, I'm trying to resolve an issue with monitoring the .csv files in a specific directory. The file names start with different dates, e.g. 2023-11-16_filename.csv or 2023-11-20_filename.csv, so no two files share the same date prefix.
I'm able to sync most of the files with the server, but not all of them. For example, my indexing started on 02.10.23, and all files dated on or after that day are available as a source. Files dated earlier, e.g. 2023-09-15_filename.csv, are not.
What could cause this behaviour, and is there a way to make the files available as a source even though they are marked with a date before 02.10.2023? Thanks


TheEggi98
Path Finder

Hello @Stives,
What does your input stanza look like?
If no crcSalt is specified in the stanza, Splunk reads the first bytes of a file (256, I think) and decides from those whether it already knows the file.
If the first bytes of your CSV files are always the same, you could change your input stanza and add

crcSalt = <SOURCE>

See the monitor stanza docs for a deeper look at crcSalt:
https://docs.splunk.com/Documentation/Splunk/9.1.2/Admin/Inputsconf#MONITOR:

But be cautious: this tells Splunk to take the full path into account when deciding whether a file has already been indexed, so you may end up indexing the same file twice, especially in directories with rolling log files.

Another possibility is that the dates fall outside the retention period (the files were indexed once, but the events were removed again by retention once their bucket was no longer hot).
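
To make the CRC check described above more concrete, here is a minimal Python sketch of the idea. It is only an illustration: zlib.crc32, the 300-byte header, the sample rows, and the /data/ paths are assumptions, not Splunk's actual algorithm or the real files. It shows why two CSV files with a long identical header can look like the same file when only the first 256 bytes are checked, and why a longer window or a path-based salt separates them.

```python
import zlib

def initial_crc(data: bytes, length: int = 256, salt: str = "") -> int:
    """CRC over an optional salt plus the first `length` bytes of a file.

    zlib.crc32 is a stand-in; Splunk's real checksum is internal.
    """
    return zlib.crc32(salt.encode() + data[:length])

# Hypothetical 300-byte CSV header shared by every file.
header = (",".join(f"column_{i:02d}" for i in range(30)) + "\n").encode()

# Two hypothetical files that differ only in their data rows.
file_a = header + b"2023-09-15,100\n"
file_b = header + b"2023-09-16,100\n"

# Only the shared header falls inside the window, so the files look identical.
same_at_256 = initial_crc(file_a) == initial_crc(file_b)

# A longer window reaches the first data row and tells them apart.
same_at_1024 = initial_crc(file_a, 1024) == initial_crc(file_b, 1024)

# Mixing in the (hypothetical) path also separates them.
same_with_salt = (
    initial_crc(file_a, salt="/data/2023-09-15_filename.csv")
    == initial_crc(file_b, salt="/data/2023-09-16_filename.csv")
)

print(same_at_256, same_at_1024, same_with_salt)
```

In this sketch only the first comparison reports a match. The salted variant also illustrates the caveat about duplicates: a file that is renamed or rolled gets a new path, so it no longer matches its earlier checksum.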


PickleRick
SplunkTrust

crcSalt is actually very rarely the proper option to set. It's often better to raise initCrcLength to a higher value when the files share a fairly constant header.


Stives
Explorer

Hello, I see. You mean something like this?

initCrcLength = <256>

 


PickleRick
SplunkTrust

Close, but without the <> around the value (for crcSalt, by contrast, <SOURCE> must be written literally, angle brackets included, if you use that option). And you'd typically want a higher value if you have a constant header.

Something like

initCrcLength = 1024

for example.
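
One way to pick the value is to measure how many leading bytes the files actually share and set initCrcLength past that point. The Python sketch below is a hypothetical helper, not a Splunk tool: the function names and the sample data are made up, and the floor of 256 simply mirrors the default mentioned earlier in the thread.

```python
from typing import List

def common_prefix_length(a: bytes, b: bytes) -> int:
    """Number of leading bytes that are identical in both inputs."""
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    return limit

def suggested_init_crc_length(samples: List[bytes], floor: int = 256) -> int:
    """One byte past the longest prefix shared by any two samples, never below `floor`."""
    longest = 0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            longest = max(longest, common_prefix_length(samples[i], samples[j]))
    return max(floor, longest + 1)

# Hypothetical samples: a 400-byte constant header followed by differing rows.
header = b"h" * 400
samples = [header + b"2023-09-15,1\n", header + b"2023-11-16,7\n"]
print(suggested_init_crc_length(samples))
```

In practice you would read the first few kilobytes of two or three real files (for example with open(path, "rb").read(4096)) and pass those in. The total file size should not matter for this, only the length of the identical leading part. Check the allowed range for initCrcLength in the inputs.conf spec for your version before setting a large value.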


Stives
Explorer

One more question. I checked, and one of the files is larger than 124,000 bytes. What value should I define for initCrcLength?


isoutamo
SplunkTrust

Hi

Are you sure that you haven't set this?

ignoreOlderThan

Can you post your inputs.conf for this source, so we can check whether something else could be causing this behaviour?

r. Ismo
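
Since ignoreOlderThan works on a file's modification time, it can help to list which files in the monitored directory are older than a given age. A minimal Python sketch, assuming a hypothetical helper name and arguments; it only inspects the filesystem and does not talk to Splunk.

```python
import time
from pathlib import Path
from typing import List, Optional

def files_older_than(directory: str, pattern: str, max_age_seconds: float,
                     now: Optional[float] = None) -> List[str]:
    """Names of files matching `pattern` whose modification time is before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

For example, files_older_than("/home/sicpa_operator/deploy/PROD/machine/monitoring", "*production_statistics.csv", 30 * 86400) would list the matching files last modified more than 30 days ago. If the skipped files all show up there, an ignoreOlderThan setting is worth ruling out.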


Stives
Explorer

Hello Ismo,

inputs.conf definition looks like this:

[monitor:///home/sicpa_operator/deploy/PROD/machine/monitoring/*production_statistics.csv]
index = sts
disabled = false
sourcetype = STSLOGMPPS
crcSalt = <SOURCE>

With *production_statistics.csv I make sure all the files get synced; their names differ only in the date at the beginning. It seems I'm only able to sync files from the deployment date onward: files dated from the day the UF was deployed are synced, but everything before that is not.
BR
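
To double-check that the older file names really match the wildcard, the pattern can be tested outside Splunk. The Python sketch below approximates how monitor wildcards behave (* stays within one path segment, ... crosses directories). The function name and the sample file names are hypothetical, and Splunk's own translation may differ in details.

```python
import re

def monitor_path_to_regex(path: str) -> "re.Pattern[str]":
    """Approximate a monitor path with wildcards as an anchored regex."""
    parts = []
    i = 0
    while i < len(path):
        if path.startswith("...", i):
            parts.append(".*")      # '...' may span any number of directories
            i += 3
        elif path[i] == "*":
            parts.append("[^/]*")   # '*' stays inside a single path segment
            i += 1
        else:
            parts.append(re.escape(path[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")

pattern = monitor_path_to_regex(
    "/home/sicpa_operator/deploy/PROD/machine/monitoring/*production_statistics.csv"
)
base = "/home/sicpa_operator/deploy/PROD/machine/monitoring/"
# Hypothetical file names: an old file, a new file, and one in a subdirectory.
for name in ("2023-09-15_production_statistics.csv",
             "2023-11-16_production_statistics.csv",
             "archive/2023-09-15_production_statistics.csv"):
    print(name, bool(pattern.match(base + name)))
```

If old and new names match equally, as they do in this sketch, the wildcard itself is unlikely to be what keeps the older files out.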


isoutamo
SplunkTrust

Thanks. 

How about the output of

splunk list inputstatus

as @PickleRick asked? That command shows which files Splunk has read and how much of each it has processed.

You could also try

splunk btool inputs list monitor:///home/sicpa_operator/deploy/PROD/machine/monitoring/ --debug

to see whether some odd defaults are defined somewhere for your inputs.


Stives
Explorer

Thanks for the reply. I've tried the option:

initCrcLength = 1024

But still not all the files have been synced; more are still pending.


PickleRick
SplunkTrust

You can check the status of your inputs with

splunk list monitor

and

splunk list inputstatus

Stives
Explorer

Hello, thanks for the reply.

crcSalt = <SOURCE>

I've added crcSalt to my stanza, but still not all the files have been synced.
