Getting Data In

Issue with CSV file monitoring

uagraw01
Motivator

Dear Splunkers!!

I am facing an issue with Splunk file monitoring configuration. When I define the complete absolute path in the inputs.conf file, Splunk successfully monitors the files. Below are two examples of working stanza configurations:

Working Configurations:

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\0000000002783979-2025-03-27T07-39-33-128Z-SZC.VIT.BaptoEvents.50301.csv]

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\0000000002783446-2025-03-27T05-09-20-566Z-SZC.VIT.BaptoEvents.50296.csv]

However, since more than 200 files are generated, specifying an absolute path for each file is not feasible. To automate this, I attempted to use a wildcard pattern in the stanza, as shown below:

Non-Working Configuration:

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\*.csv]

Unfortunately, this approach does not ingest any files into Splunk. I would appreciate your guidance on resolving this issue.

Looking forward to your insights.


livehybrid
Super Champion

Hi @uagraw01 

There is some useful information at https://docs.splunk.com/Documentation/Splunk/latest/data/Specifyinputpathswithwildcards, if you haven't already seen it.

On Windows, if you specify the [monitor://C:\Windows\foo\bar*.log] stanza in the inputs.conf file, Splunk Enterprise translates the path into this:

[monitor://C:\Windows\foo\]
whitelist = bar[^\\]*\.log$

In Windows, allow list and deny list rules don't support regular expressions that include backslashes. Use two backslashes (\\) to escape wildcards.

This means

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\*.csv]

becomes

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\]
whitelist = [^\\]*\.csv$

I'm wondering whether this whitelist is being overridden somehow. Have you specified a whitelist anywhere else?

It might be worth trying the following input to see if it works. It explicitly sets the whitelist to what Splunk would otherwise generate from the wildcard.

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\]
whitelist = [^\\]*\.csv$
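
As a quick sanity check of that whitelist, here is a minimal Python sketch that applies the same regular expression to the two file names from the original post. It assumes Splunk applies the whitelist as an unanchored regex search against the full path, and it uses Python's `re` module as a stand-in for Splunk's regex engine; `notes.txt` is a made-up file name added only to show a non-match.

```python
import re

# Whitelist from the stanza above. Assumption: Splunk applies it as an
# unanchored search against the full path of each candidate file.
WHITELIST = re.compile(r"[^\\]*\.csv$")

MONITORED_DIR = "E:\\var\\log\\Bapto\\BaptoEventsLog\\SZC\\"

candidates = [
    # The two file names from the original post.
    MONITORED_DIR + "0000000002783979-2025-03-27T07-39-33-128Z-SZC.VIT.BaptoEvents.50301.csv",
    MONITORED_DIR + "0000000002783446-2025-03-27T05-09-20-566Z-SZC.VIT.BaptoEvents.50296.csv",
    # Hypothetical non-CSV file, only here to show a path that is rejected.
    MONITORED_DIR + "notes.txt",
]


def is_whitelisted(path: str) -> bool:
    """Return True if the whitelist regex matches somewhere in the path."""
    return WHITELIST.search(path) is not None


results = {path: is_whitelisted(path) for path in candidates}
for path, ok in results.items():
    print("match   " if ok else "no match", path)
```

If both CSV paths match here, the pattern itself is probably not the reason the files are skipped, and the cause is more likely elsewhere (for example, another setting overriding the stanza).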

Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards

Will

 

PickleRick
SplunkTrust
splunk list inputstatus

splunk list monitor

What do these two have to say?

Since you're ingesting CSV files which have fixed headers, there's a good chance the CRCs match and the files are not ingested because they are treated as already seen. You might want to increase initCrcLength (or fiddle with crcSalt, but that's the last resort).
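
To illustrate the idea, here is a rough Python sketch of why a shared header can make two different files look identical when only their first bytes are checksummed. It is not Splunk's actual algorithm: `zlib.crc32` is a stand-in, the two in-memory "files" are hypothetical and built from the header and two data rows of the sample posted in this thread, and the line endings are assumed to be CRLF. The 256 used for the longer window is the documented default of initCrcLength.

```python
import zlib

# Header line shared by every file (taken from the sample in this thread).
# CRLF line endings are an assumption.
HEADER = b"Timestamp;AlarmId;SenderType;SenderId;Severity;CreationTime;ComplexEventType;ExtraInfo\r\n"

# Two hypothetical files: same header, different first data row.
file_a = HEADER + b"2025-03-27 12:40:14.074;0;Shuttle;Shuttle_027;Unknown;2025-03-27 12:40:14.074;NoError;\r\n"
file_b = HEADER + b"2025-03-27 12:40:16.056;0;Shuttle;Shuttle_051;Unknown;2025-03-27 12:40:16.056;NoError;\r\n"


def head_checksum(content: bytes, length: int) -> int:
    """Checksum of the first `length` bytes (stand-in for the initial CRC)."""
    return zlib.crc32(content[:length])


# A window that covers only the header cannot tell the files apart.
header_only = len(HEADER)
same_with_header_window = head_checksum(file_a, header_only) == head_checksum(file_b, header_only)

# A window that reaches into the data rows can.
same_with_256_window = head_checksum(file_a, 256) == head_checksum(file_b, 256)

print("header-only window collides:", same_with_header_window)
print("256-byte window collides:   ", same_with_256_window)
```

The practical point is that the checksum window has to extend past whatever part of the file is identical across files; how far that is depends on the real data.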

uagraw01
Motivator


@PickleRick I have executed the commands, but nothing relevant to my stanza is visible in the output.

uagraw01_0-1743075189033.png

For your information, here are my current inputs.conf settings:

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\000000000*-*-SZC.VIT.BaptoEvents.*]
whitelist = \.csv$
disabled = false
index = Bapto
initCrcLength = 256
sourcetype = SZC_BaptoEvent

props.conf:

[SZC_BaptoEvent]
SHOULD_LINEMERGE = false
#CHARSET = ISO-8859-1
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 23
TRANSFORMS-drop_header = remove_csv_header
TZ = UTC

transforms.conf:

[remove_csv_header]
REGEX = ^Timestamp;AlarmId;SenderType;SenderId;Severity;CreationTime;ComplexEventType;ExtraInfo
DEST_KEY = queue
FORMAT = nullQueue

Sample of the CSV files to be monitored:

Timestamp;AlarmId;SenderType;SenderId;Severity;CreationTime;ComplexEventType;ExtraInfo
2025-03-27 12:40:12.152;1526;Mpg;Shuttle_115;Information;2025-03-27 12:40:12.152;TetrisPlanningDelay;TetrisId: TetrisReservation_16_260544_bqixLeVr,ShuttleId: Shuttle_115,FirstDelaySection: A24.16,FirstSection: A8.16,LastSection: A24.16
2025-03-27 12:40:12.152;1526;Mpg;Shuttle_115;Unknown;2025-03-27 12:40:12.152;TetrisPlanningDelay;
2025-03-27 12:40:14.074;0;Shuttle;Shuttle_027;Unknown;2025-03-27 12:40:14.074;NoError;
2025-03-27 12:40:16.056;0;Shuttle;Shuttle_051;Unknown;2025-03-27 12:40:16.056;NoError;
2025-03-27 12:40:30.076;0;Shuttle;Shuttle_119;Unknown;2025-03-27 12:40:30.076;NoError;
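
For anyone wanting to check the props.conf and transforms.conf settings above against this sample, here is a small Python sketch. It assumes the header regex is applied as a search against each raw line and that only the first MAX_TIMESTAMP_LOOKAHEAD characters are considered for the timestamp. Python's `%f` is used as a stand-in for Splunk's `%3N`, and `classify` is a hypothetical helper, not anything Splunk provides.

```python
import re
from datetime import datetime

# REGEX from the [remove_csv_header] transform above.
HEADER_RE = re.compile(
    r"^Timestamp;AlarmId;SenderType;SenderId;Severity;CreationTime;ComplexEventType;ExtraInfo"
)

# MAX_TIMESTAMP_LOOKAHEAD from props.conf above.
LOOKAHEAD = 23

# Python equivalent of TIME_FORMAT; %f stands in for Splunk's %3N.
PY_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Header and two data rows copied from the sample above.
sample_lines = [
    "Timestamp;AlarmId;SenderType;SenderId;Severity;CreationTime;ComplexEventType;ExtraInfo",
    "2025-03-27 12:40:14.074;0;Shuttle;Shuttle_027;Unknown;2025-03-27 12:40:14.074;NoError;",
    "2025-03-27 12:40:16.056;0;Shuttle;Shuttle_051;Unknown;2025-03-27 12:40:16.056;NoError;",
]


def classify(line: str):
    """Return None for a header line (it would be dropped), else the parsed timestamp."""
    if HEADER_RE.search(line):
        return None
    return datetime.strptime(line[:LOOKAHEAD], PY_TIME_FORMAT)


parsed = [classify(line) for line in sample_lines]
for line, stamp in zip(sample_lines, parsed):
    print("dropped (header)" if stamp is None else stamp.isoformat(), "|", line[:40])
```

With this sample the header line is the only one matched by the transform, and the 23-character lookahead covers the whole leading timestamp, so these settings by themselves look consistent with the data.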


PickleRick
SplunkTrust

OK. There should be entries higher up in the output for your wildcarded inputs.

They will be shown under Monitored directories.

And inputstatus should show you the files with their status (how far the input has read, or why the files are not being ingested).

On Linux you could just pipe the output through | grep -C 10 BaptoEvents to limit the dump to the relevant entries, but since you're on Windows, you will have to use your PowerShell-fu or cmd-fu.
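
If Python happens to be available on the Windows host, the same kind of context filter can be sketched in a few lines (PowerShell's Select-String with its -Context parameter is the more native route). The function name `grep_context` and the demo text are made up for illustration; the demo lines are placeholders, not real Splunk output.

```python
from typing import Iterable, List


def grep_context(lines: Iterable[str], needle: str, context: int = 10) -> List[str]:
    """Return the lines containing `needle` plus `context` lines on either side."""
    rows = list(lines)
    keep = set()
    for i, row in enumerate(rows):
        if needle in row:
            start = max(0, i - context)
            end = min(len(rows), i + context + 1)
            keep.update(range(start, end))
    # Emit the selected lines in their original order, without duplicates.
    return [rows[i] for i in sorted(keep)]


# Placeholder input standing in for a long command output.
demo = [f"placeholder line {n}" for n in range(30)]
demo[15] = "placeholder line 15 mentioning BaptoEvents"

selected = grep_context(demo, "BaptoEvents", context=2)
print("\n".join(selected))
```

To use it on real output, the `demo` list would be replaced by the lines of the saved or piped `splunk list inputstatus` output (for example, read from `sys.stdin`).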


vsommer
Explorer

Hi @uagraw01,

you can also change your stanza to this:

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\]
whitelist = \.csv$

 

Hope this helps you.

uagraw01
Motivator

Hi @vsommer, I have tried the stanza you suggested, but still no luck.

[monitor://E:\var\log\Bapto\BaptoEventsLog\SZC\]
whitelist = \.csv$

 


gcusello
SplunkTrust

Hi @uagraw01 ,

what about using only * instead of *.csv?

Also, did you try the whitelist option instead of putting the file pattern in the input stanza?

Ciao.

Giuseppe

uagraw01
Motivator

@gcusello Yes, I have tried both, but nothing works.
