Hello everyone,
I am trying to ingest data into Splunk. The data is in .tgz files, and each archive contains many folders and nested directory levels. I want to read just one type of file from those directories, and I cannot use an absolute path: the file's relative path can change, so it can be in any directory.
So inputs.conf was set up with something like this:
[monitor:///dir1/dir2/Spk/Test/*.tgz]
whitelist=my.log
But this is not working because of this: When you configure wildcards in a file input path, Splunk Enterprise creates an implicit allow list for that stanza. The longest wildcard-free path becomes the monitor stanza, and Splunk Enterprise translates the wildcards into regular expressions.
So I am looking for a way to filter those logs using whitelisting. Should I use regular expressions to filter the logs?
Thank you in advance.
If you want to only read a limited subset of a tgz archive, I'm afraid it won't work this way.
For compressed files, Splunk unpacks them into a temporary directory and ingests the files from that directory. I have no knowledge of any mechanism able to affect which of those unpacked files are ingested.
Hi @glpadilla_sol,
only one question: is *.tgz part of the path, or is it the name of the files that you want to ingest?
If it's part of the path, you could also try adding the filename to the monitor stanza instead of using whitelist:
[monitor:///dir1/dir2/Spk/Test/*.tgz/my.log]
If instead *.tgz is the name of the files to ingest, you don't need whitelist and can use the monitor stanza as is.
If the *.tgz files are spread across many nested folders, you could try the "..." wildcard:
[monitor:///.../*.tgz]
or something similar.
Ciao.
Giuseppe
Hi @gcusello thank you so much for the suggestions.
I am trying to ingest just a subset of the files inside the .tgz file. The issue is that the .tgz has a lot of files and I don't want to ingest all of them.
And I cannot define a specific path in the monitor input because the files are in different folders.
I just want to know if there is a way to whitelist the files that I read from the .tgz.
Kind Regards,
As I said before, Splunk unpacks the archive file and ingests all unpacked files. That's how it works. The assumption is that you have your logs ready, just packed.
The whitelist/blacklist logic works at the level of choosing which file to unpack, not which unpacked file from within the archive to ingest.
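If you control the host where the archives land, one possible workaround is to do the filtering yourself before Splunk sees anything: extract only the wanted files from each .tgz into a staging directory and monitor that directory instead of the archives. This is a pre-processing step outside Splunk, not a Splunk feature. Below is a minimal Python sketch of that idea using only the standard library. The function name extract_matching, the flattened output naming scheme, and any staging path you pass in are my own assumptions for illustration; my.log is the file name from your example.

```python
import shutil
import tarfile
from pathlib import Path, PurePosixPath


def extract_matching(archive: Path, staging_dir: Path, wanted_name: str = "my.log") -> list[Path]:
    """Copy every regular file named `wanted_name` from `archive` into `staging_dir`.

    Hypothetical helper: the archive's internal directory layout is flattened
    into the output file name so that same-named files from different folders
    do not overwrite each other.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []

    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            # Skip directories, symlinks, devices, and so on.
            if not member.isfile():
                continue

            member_path = PurePosixPath(member.name)
            if member_path.name != wanted_name:
                continue

            # Build a flat file name from the member's path. Dropping "/" and
            # ".." means nothing can be written outside staging_dir.
            safe_parts = [p for p in member_path.parts if p not in ("/", "..", ".")]
            target = staging_dir / f"{archive.stem}__{'__'.join(safe_parts)}"

            # Stream the file contents out of the archive.
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)

    return written
```

You would run this (for example from cron) over each new archive in /dir1/dir2/Spk/Test, then point a plain monitor stanza at the staging directory you chose, with no wildcard for the archives. Matching on the file name alone is what makes the location inside the archive irrelevant; if you need a looser rule, the name comparison is the one line to change.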