Getting Data In

Whitelisting and wildcards at the monitor input

glpadilla_sol
Path Finder

Hello everyone,

I am trying to ingest data into Splunk and the data is into some .tgz files, but within those files are a lot of different folders and levels of directories, the thing is that I want to read just one type of file that is into those directories and is not an absolute path is a relative the path can change and can be into any directory.

So the inputs .conf was set up with something like this:

[monitor:///dir1/dir2/Spk/Test/*.tgz]

whitelist=my.log

 

But this is not working because of this: When you configure wildcards in a file input path, Splunk Enterprise creates an implicit allow list for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions. 

https://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards?_gl=1*srk1nm...

 

So I am looking the way to filter those logs using whitelisting, should I use regular expressions to filter the logs?

 

Thank you in advance.

Labels (1)
0 Karma

PickleRick
Ultra Champion

If you want to only read a limited subset of a tgz archive, I'm afraid it won't work this way.

For compressed files splunk unpacks them into a temporary directory and ingests files from that directory. I have no knowledge of any mechanism able to affect which of those unpacked files are ingested.

0 Karma

gcusello
Esteemed Legend

Hi @glpadilla_sol,

only one question: *.tgz is a part of the path or is the name of the files that you want to ingest?

if it's a part of the path, you could also try to add the filename in the monitor stanza instead of whitelist

[monitor:///dir1/dir2/Spk/Test/*.tgz/my.log]

if instead *.tgz is the name of the files to ingest, you don't need whitelist and you could use the monitor stanza as is.

If you want to read the *.tgz files in many and structured folders, you could try "..."

[monitor:///.../*.tgz]

or something similar.

Ciao.

Giuseppe

0 Karma

glpadilla_sol
Path Finder

Hi @gcusello thank you so much for the suggestions.

I am trying to ingest just a subset of files into the .tgz file, the issue is that the .tgz has a lot of files and I don't want to ingest all of them.

And I cannot defined an specific path at the monitor input because the files are at different folders.

I just want to know if there is a way to whitelist the files that I read from the .tgz.

 

Kind Regards,

0 Karma

PickleRick
Ultra Champion

As I said before, splunk unpacks the archive file and ingests all unpacked files. That's how it works. The assumption is that you have your logs ready, just packed.

The whitelist/blacklist logic works at the level of choosing which file to unpack, not which unpacked file from within the archive to ingest.

Get Updates on the Splunk Community!

Splunk Education - Fast Start Program!

Welcome to Splunk Education! Splunk training programs are designed to enable you to get started quickly and ...

Five Subtly Different Ways of Adding Manual Instrumentation in Java

You can find the code of this example on GitHub here. Please feel free to star the repository to keep in ...

New Splunk APM Enhancements Help Troubleshoot Your MySQL and NoSQL Databases Faster

Splunk Observability has two new enhancements to make it quicker and easier to troubleshoot slow or frequently ...