Whitelisting and wildcards at the monitor input

glpadilla_sol
Path Finder

Hello everyone,

I am trying to ingest data into Splunk. The data is in some .tgz files, and inside those files there are many different folders and directory levels. I want to read just one type of file from those directories, but I cannot use an absolute path: the path is relative, it can change, and the file can be in any directory.

So inputs.conf was set up with something like this:

[monitor:///dir1/dir2/Spk/Test/*.tgz]
whitelist=my.log

 

But this is not working, because of the following behavior described in the documentation: When you configure wildcards in a file input path, Splunk Enterprise creates an implicit allow list for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions.

https://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards?_gl=1*srk1nm...
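
As a rough illustration of the translation quoted above, here is a minimal Python sketch. It assumes the wildcard mapping given in the Splunk wildcard documentation (`...` becomes `.*`, `*` becomes `[^/]*`) and follows the shape of the documentation's `/foo/bar*.log` example. The function name `implicit_whitelist` is hypothetical, and the exact regex Splunk builds internally may differ.

```python
import re
from typing import Optional, Tuple


def implicit_whitelist(monitor_path: str) -> Tuple[str, Optional[str]]:
    """Split a monitor path into (wildcard-free base, allow-list regex).

    Assumed mapping: '...' -> '.*' and '*' -> '[^/]*'.
    Returns (path, None) when the path contains no wildcard.
    """
    hits = [p for p in (monitor_path.find("*"), monitor_path.find("...")) if p != -1]
    if not hits:
        return monitor_path, None

    # The base is everything up to the last '/' before the first wildcard.
    cut = monitor_path.rfind("/", 0, min(hits)) + 1
    base, rest = monitor_path[:cut], monitor_path[cut:]

    # Translate the remainder character by character.
    parts = []
    i = 0
    while i < len(rest):
        if rest.startswith("...", i):
            parts.append(".*")
            i += 3
        elif rest[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(rest[i]))
            i += 1
    return base, "".join(parts) + "$"


base, regex = implicit_whitelist("/dir1/dir2/Spk/Test/*.tgz")
print(base)   # /dir1/dir2/Spk/Test/
print(regex)  # [^/]*\.tgz$
```

Under that assumption, the stanza above behaves like a monitor on /dir1/dir2/Spk/Test/ whose allow list selects archive names ending in .tgz.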

 

So I am looking for a way to filter those logs using whitelisting. Should I use regular expressions to filter the logs?

 

Thank you in advance.


PickleRick
SplunkTrust

If you want to only read a limited subset of a tgz archive, I'm afraid it won't work this way.

For compressed files, Splunk unpacks them into a temporary directory and ingests the files from that directory. I don't know of any mechanism that can control which of those unpacked files are ingested.


gcusello
SplunkTrust

Hi @glpadilla_sol,

Just one question: is *.tgz part of the path, or is it the name of the files that you want to ingest?

If it's part of the path, you could also try adding the filename to the monitor stanza instead of using whitelist:

[monitor:///dir1/dir2/Spk/Test/*.tgz/my.log]

If instead *.tgz is the name of the files to ingest, you don't need the whitelist and you could use the monitor stanza as is.

If you want to read the *.tgz files from many nested folders, you could try the "..." wildcard:

[monitor:///.../*.tgz]

or something similar.

Ciao.

Giuseppe


glpadilla_sol
Path Finder

Hi @gcusello, thank you so much for the suggestions.

I am trying to ingest just a subset of the files inside the .tgz file. The issue is that the .tgz contains a lot of files and I don't want to ingest all of them.

And I cannot define a specific path in the monitor input because the files are in different folders.

I just want to know if there is a way to whitelist the files that are read from the .tgz.

 

Kind Regards,


PickleRick
SplunkTrust

As I said before, Splunk unpacks the archive file and ingests all of the unpacked files. That's how it works. The assumption is that you have your logs ready, just packed.

The whitelist/blacklist logic works at the level of choosing which archive file to unpack, not which unpacked file from within the archive to ingest.
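
If the filtering has to happen before ingestion, one possible workaround (not something Splunk does for you, and not suggested elsewhere in this thread) is to pre-extract only the wanted files into a staging directory outside of Splunk and monitor that directory instead. A minimal Python sketch using the standard-library `tarfile` module; the function name `extract_matching`, the staging directory, and the regex are all hypothetical:

```python
import re
import shutil
import tarfile
from pathlib import Path
from typing import List


def extract_matching(archive: str, dest_dir: str, pattern: str) -> List[Path]:
    """Copy only the regular files whose in-archive path matches `pattern`.

    Each extracted file gets a flattened name built from the archive name
    and the member path, so same-named files from different folders do not
    overwrite each other and nothing is written outside `dest_dir`.
    """
    regex = re.compile(pattern)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            if not member.isfile() or not regex.search(member.name):
                continue
            flat = member.name.strip("/").replace("/", "__")
            target = dest / f"{Path(archive).stem}__{flat}"
            source = tar.extractfile(member)
            with source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return written


# Hypothetical usage: keep only files named my.log, at any depth.
# extract_matching("/dir1/dir2/Spk/Test/bundle.tgz", "/opt/staging", r"(^|/)my\.log$")
```

The staging directory would then be covered by an ordinary monitor stanza with no wildcard inside an archive. Whether that fits depends on how the .tgz files arrive and on who is allowed to run a job next to the forwarder.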
