Getting Data In

Whitelisting and wildcards at the monitor input

glpadilla_sol
Path Finder

Hello everyone,

I am trying to ingest data into Splunk. The data is in some .tgz files, and those archives contain many folders and several levels of directories. I only want to read one type of file from inside them, and I cannot give an absolute path for it: the path is relative, it can change, and the file can be in any directory.

So inputs.conf was set up with something like this:

[monitor:///dir1/dir2/Spk/Test/*.tgz]

whitelist=my.log

 

But this is not working, and I think it is because of this note in the documentation: When you configure wildcards in a file input path, Splunk Enterprise creates an implicit allow list for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions.

https://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards?_gl=1*srk1nm...
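To make that quoted behaviour concrete, here is a rough Python sketch of the translation. It is not Splunk code: `implicit_allow_list` is a hypothetical helper, and it assumes the mapping given on the wildcard documentation page, where `...` corresponds to `.*` and `*` corresponds to `[^/]*`.

```python
import re

def implicit_allow_list(monitor_path):
    """Split a wildcard monitor path into (base_dir, regex).

    Hypothetical helper for illustration only. Assumes '...' maps to
    '.*' and '*' maps to '[^/]*', as the wildcard documentation describes.
    """
    segments = monitor_path.split("/")
    # Index of the first path segment that contains a wildcard.
    first_wild = next(
        (i for i, s in enumerate(segments) if "*" in s or "..." in s),
        None,
    )
    if first_wild is None:
        return monitor_path, None  # no wildcard, so no implicit allow list
    # The longest wildcard-free prefix is what actually gets monitored.
    base = "/".join(segments[:first_wild]) or "/"
    # Translate the wildcards and escape every literal piece.
    tokens = re.split(r"(\.\.\.|\*)", monitor_path)
    translate = {"...": ".*", "*": "[^/]*"}
    pattern = "".join(translate.get(t, re.escape(t)) for t in tokens)
    return base, "^" + pattern + "$"

base, pattern = implicit_allow_list("/dir1/dir2/Spk/Test/*.tgz")
print(base)     # monitored directory
print(pattern)  # regex standing in for the implicit allow list
```

Under these assumptions the stanza above monitors `/dir1/dir2/Spk/Test` and the derived pattern is compared with the path of each .tgz file on disk, so a name such as my.log never takes part in that comparison.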

 

So I am looking for a way to filter those logs using whitelisting. Should I use regular expressions to filter the logs?

 

Thank you in advance.


PickleRick
Ultra Champion

If you want to only read a limited subset of a tgz archive, I'm afraid it won't work this way.

For compressed files, Splunk unpacks them into a temporary directory and ingests the files from that directory. I don't know of any mechanism that can affect which of those unpacked files are ingested.


gcusello
Legend

Hi @glpadilla_sol,

Just one question: is *.tgz part of the path, or is it the name of the files that you want to ingest?

If it's part of the path, you could also try adding the filename to the monitor stanza instead of using whitelist:

[monitor:///dir1/dir2/Spk/Test/*.tgz/my.log]

If instead *.tgz is the name of the files to ingest, you don't need whitelist and you could use the monitor stanza as is.

If you want to read the *.tgz files from many nested folders, you could try "...":

[monitor:///.../*.tgz]

or something similar.

Ciao.

Giuseppe


glpadilla_sol
Path Finder

Hi @gcusello thank you so much for the suggestions.

I am trying to ingest just a subset of the files inside the .tgz file. The issue is that the .tgz has a lot of files and I don't want to ingest all of them.

And I cannot define a specific path in the monitor input because the files are in different folders.

I just want to know if there is a way to whitelist the files that I read from the .tgz.

 

Kind Regards,


PickleRick
Ultra Champion

As I said before, Splunk unpacks the archive file and ingests all unpacked files. That's how it works. The assumption is that you have your logs ready, just packed.

The whitelist/blacklist logic works at the level of choosing which file to unpack, not which unpacked file from within the archive to ingest.
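If that is a blocker, one possible workaround outside Splunk is to pre-filter the archives yourself and monitor only the result. The sketch below uses Python's standard `tarfile` module to copy just the members named my.log into a staging directory. The function name, the staging layout and the `__` naming scheme are all assumptions made for illustration, not anything Splunk provides.

```python
import os
import shutil
import tarfile

def extract_matching(archive_path, dest_dir, wanted_name="my.log"):
    """Copy only members whose base name equals wanted_name into dest_dir.

    Hypothetical helper: returns the list of files it wrote.
    """
    written = []
    prefix = os.path.basename(archive_path)
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Skip directories, links and anything with another name.
            if not member.isfile():
                continue
            if os.path.basename(member.name) != wanted_name:
                continue
            # Flatten the member path into a single file name so copies
            # from different folders or archives do not overwrite each
            # other and nothing can be written outside dest_dir.
            flat = prefix + "__" + member.name.strip("/").replace("/", "__")
            target = os.path.join(dest_dir, flat)
            src = tar.extractfile(member)
            with src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(target)
    return written
```

A script like this could run from cron or a similar scheduler against each new .tgz, and an ordinary monitor stanza would then point at the staging directory instead of at the archives.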
