Whitelisting and wildcards at the monitor input

glpadilla_sol
Path Finder

Hello everyone,

I am trying to ingest data into Splunk. The data is in some .tgz files, and inside those files there are many folders and several levels of directories. I want to read just one type of file from those directories, but I cannot give an absolute path for it: the path is relative, it can change, and the file can be in any directory.

So inputs.conf was set up with something like this:

[monitor:///dir1/dir2/Spk/Test/*.tgz]
whitelist=my.log

But this is not working, and I think the reason is this passage from the documentation: "When you configure wildcards in a file input path, Splunk Enterprise creates an implicit allow list for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions."

https://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards?_gl=1*srk1nm...
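To make the quoted passage concrete, here is a rough Python sketch of that translation. It is only an approximation for illustration, not Splunk's actual code: the function name split_monitor_path is made up, and the rules it applies ("..." becomes ".*", "*" becomes "[^/]*") are my reading of the linked documentation page.

```python
import re


def split_monitor_path(path):
    """Approximate the split of a wildcard monitor path (hypothetical helper).

    Returns the longest wildcard-free directory and a regex that stands in
    for the implicit allow list built from the wildcards.
    """
    # Collect leading path segments until the first one containing a wildcard.
    base = []
    for segment in path.split("/"):
        if "*" in segment or "..." in segment:
            break
        base.append(segment)
    base_dir = "/".join(base) or "/"

    # Translate wildcards: "..." crosses directories, "*" stays in one segment.
    pieces = re.split(r"(\.\.\.|\*)", path)
    regex = "".join(
        ".*" if piece == "..." else "[^/]*" if piece == "*" else re.escape(piece)
        for piece in pieces
    )
    return base_dir, "^" + regex + "$"


base_dir, allow = split_monitor_path("/dir1/dir2/Spk/Test/*.tgz")
print(base_dir)
print(allow)
# The regex is compared against paths on disk, so it can match an archive
# but it never describes a file stored inside that archive.
print(bool(re.match(allow, "/dir1/dir2/Spk/Test/a.tgz")))
print(bool(re.match(allow, "/dir1/dir2/Spk/Test/a.tgz/my.log")))
```

Under these assumptions the monitored directory is /dir1/dir2/Spk/Test and the pattern only decides which .tgz files on disk are picked up.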

 

So I am looking for a way to filter those logs using whitelisting. Should I use regular expressions to filter the logs?

Thank you in advance.

PickleRick
SplunkTrust

If you want to only read a limited subset of a tgz archive, I'm afraid it won't work this way.

For compressed files, Splunk unpacks them into a temporary directory and ingests the files from that directory. I don't know of any mechanism that can control which of those unpacked files are ingested.

gcusello
SplunkTrust

Hi @glpadilla_sol,

Just one question: is *.tgz part of the path, or is it the name of the files that you want to ingest?

If it's part of the path, you could also try adding the filename to the monitor stanza instead of using whitelist:

[monitor:///dir1/dir2/Spk/Test/*.tgz/my.log]

If instead *.tgz is the name of the files to ingest, you don't need whitelist and you can use the monitor stanza as is.

If you want to read *.tgz files spread across many nested folders, you could try the "..." wildcard:

[monitor:///.../*.tgz]

or something similar.

Ciao.

Giuseppe

glpadilla_sol
Path Finder

Hi @gcusello, thank you so much for the suggestions.

I am trying to ingest just a subset of the files inside the .tgz file. The issue is that the .tgz contains a lot of files and I don't want to ingest all of them.

And I cannot define a specific path in the monitor input because the files are in different folders.

I just want to know whether there is a way to whitelist the files that are read from inside the .tgz.

Kind regards,

PickleRick
SplunkTrust

As I said before, Splunk unpacks the archive file and ingests all of the unpacked files. That's how it works. The assumption is that you have your logs ready, just packed.

The whitelist/blacklist logic works at the level of choosing which file to unpack, not which unpacked file from within the archive to ingest.
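Since the filtering cannot happen inside the archive, one possible workaround is to do the selection before Splunk ever sees the data: pull only the wanted files out of each .tgz into a staging directory and monitor that directory instead. Below is a minimal Python sketch of that idea. It is not a Splunk feature; the function name extract_matching and the "__" naming scheme for the extracted copies are assumptions made for illustration.

```python
import os
import shutil
import tarfile


def extract_matching(archive_path, wanted_name, dest_dir):
    """Copy only the members named wanted_name out of a .tgz (hypothetical helper).

    Members can sit at any depth inside the archive. Each copy is written
    flat into dest_dir, and the list of written paths is returned.
    """
    os.makedirs(dest_dir, exist_ok=True)
    archive_base = os.path.basename(archive_path)
    written = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Skip directories, links and anything that is not the wanted file.
            if not member.isfile():
                continue
            if os.path.basename(member.name) != wanted_name:
                continue
            # Flatten the inner path into one file name so copies from
            # different folders (or archives) do not overwrite each other.
            flat = archive_base + "__" + member.name.strip("/").replace("/", "__")
            target = os.path.join(dest_dir, flat)
            source = tar.extractfile(member)
            with source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return written
```

A scheduled job could call, for example, extract_matching("/dir1/dir2/Spk/Test/some.tgz", "my.log", "/some/staging/dir") for every new archive, with an ordinary monitor stanza pointed at the staging directory. Both paths here are placeholders.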
