Getting Data In

How to Drop All Logs Under a Specific Directory and Its Subdirectories Using props.conf & transforms.conf on a Heavy Forwarder

ParsaIsHash
Loves-to-Learn Lots

Description:

I am using a Splunk Heavy Forwarder (HF) to forward logs to an indexer cluster. I need to configure props.conf and transforms.conf on the HF to drop all logs that originate from a specific directory and any of its subdirectories, without modifying the configuration each time a new subdirectory is created.

Scenario:

The logs I want to discard are located under /var/log/apple/. This directory contains dynamically created subdirectories, such as:

/var/log/apple/nginx/

/var/log/apple/db/intro/

/var/log/apple/some/other/depth/ 

New subdirectories are added frequently, and I cannot manually update the configuration every time.

Attempted Solution: I configured props.conf as follows:

[source::/var/log/apple(/.*)?]

TRANSFORMS-null=discard_apple_logs 

And in transforms.conf:

[discard_apple_logs]

REGEX = .

DEST_KEY = queue

FORMAT = nullQueue 

However, this does not seem to work, as logs from the subdirectories are still being forwarded to the indexers.

Question: What is the correct way to configure props.conf and transforms.conf to drop all logs under /var/log/apple/, including those from any newly created subdirectories? How can I ensure that this rule applies recursively without explicitly listing multiple wildcard patterns? Any guidance would be greatly appreciated!

0 Karma

PickleRick
SplunkTrust

1. I suppose the easiest solution would be to just blacklist the directory within a specific inputs.conf stanza (as others have already pointed out).

2. Do your events come from monitor inputs on this HF or are they forwarded from other hosts? From HFs or UFs?

3. Ingest actions?

0 Karma

ParsaIsHash
Loves-to-Learn Lots

Yes, you're right, and I described why I can't do that in my reply to another answer.

0 Karma

PickleRick
SplunkTrust

Yes, that was my suspicion.

Your general idea seems OK (provided that your transform definition actually contains separate lines that just got squished into one on copy-paste).

Additional question: aren't you by any chance using indexed extractions?

If you are, the data is sent already parsed and is not processed by transforms further down the pipeline.
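For reference, indexed extractions would be something enabled per sourcetype in props.conf on the forwarder, roughly like this (the sourcetype name here is purely hypothetical):

[my_structured_sourcetype]
INDEXED_EXTRACTIONS = JSON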

0 Karma

ParsaIsHash
Loves-to-Learn Lots

The only thing happening on this Heavy Forwarder is collecting logs, assigning an index based on the source using transforms.conf and props.conf, and then forwarding them to the indexer cluster.

0 Karma

PickleRick
SplunkTrust

OK. So your transform assigning index based on source does work for the same data?

0 Karma

ParsaIsHash
Loves-to-Learn Lots

Yes, it's working correctly. For example, I am reindexing /var/log/syslog to index=os_logs, and it applies as expected.
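For context, that kind of source-based index routing is usually done along these lines (the stanza and transform names below are illustrative, not necessarily the ones actually in use):

props.conf:

[source::/var/log/syslog]
TRANSFORMS-route_os = route_to_os_logs

transforms.conf:

[route_to_os_logs]
REGEX = .
DEST_KEY = _MetaData:Index
FORMAT = os_logs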

0 Karma

PickleRick
SplunkTrust

Well... then it should work.

One thing you could change in your spec is dropping the conditional part at the end (you should never have the directory itself specified as the source, just files from below this directory), but that's not the issue here.

I noticed one thing though, a similar case to one we had not long ago in another thread: your transform class is named "null". That is a fairly common name, so it might be getting overridden somewhere else in your configs. Check the btool output to verify that it isn't.
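For example, something along these lines on the HF (assuming the default $SPLUNK_HOME) would list every file that sets a TRANSFORMS-null class, so you can see whether another config layer wins:

$SPLUNK_HOME/bin/splunk btool props list --debug | grep -i "TRANSFORMS-null"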

0 Karma

ParsaIsHash
Loves-to-Learn Lots

I have already checked, and the transform configuration is correct with no conflicts in other Splunk settings.

Currently, to filter out sources properly, I have to explicitly define each depth of subdirectories using patterns like:

[source::/var/log/apple/*]

TRANSFORMS-null=discard_apple_logs

[source::/var/log/apple/*/*]

TRANSFORMS-null=discard_apple_logs

This ensures that logs from different levels of subdirectories are included in the filtering process.

It's quite strange that Splunk can't handle this scenario, if that's the case. Use cases like mine should be fairly common, so I would expect a more straightforward way to handle this.

0 Karma

PickleRick
SplunkTrust

Well, to be quite precise, it's not a raw regex. The docs say:

Match expressions must match the entire name, not just a substring. Match
expressions are based on a full implementation of Perl-compatible regular
expressions (PCRE) with the translation of "...", "*", and "." Thus, "."
matches a period, "*" matches non-directory separators, and "..." matches
any number of any characters.

So in case of wildcards it can get tricky.

I'd try

[source::/var/log/apple/...]
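Putting it together, a sketch of the whole thing might look like this (it keeps the transform from the question; the class name "discard_apple" is just an example of something less generic than "null", as suggested above). Since "..." matches any number of any characters, this covers files at any subdirectory depth:

props.conf:

[source::/var/log/apple/...]
TRANSFORMS-discard_apple = discard_apple_logs

transforms.conf:

[discard_apple_logs]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue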
0 Karma

kiran_panchavat
SplunkTrust

@ParsaIsHash You can use inputs.conf with a blacklist to prevent unwanted files from being forwarded at the source level (on the Heavy Forwarder). This approach stops the logs from even being read, which is more efficient than filtering them with props.conf and transforms.conf.

blacklist = <regular expression>
* If set, files from this input are NOT monitored if their path matches the
  specified regex.
* Takes precedence over the deprecated '_blacklist' setting, which functions
  the same way.
* If a file matches the regexes in both the deny list and allow list settings,
  the file is NOT monitored. Deny lists take precedence over allow lists.
* No default.
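For example, if the monitor input covered /var/log (the stanza path here is illustrative), the blacklist could look like this in inputs.conf:

[monitor:///var/log]
blacklist = ^/var/log/apple/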
0 Karma

ParsaIsHash
Loves-to-Learn Lots

I described why I cannot access the inputs.conf file on the UF: we do not have permission to access the host.

0 Karma

isoutamo
SplunkTrust
Any reason why you don’t use inputs.conf with a blacklist on the source side?
0 Karma

ParsaIsHash
Loves-to-Learn Lots

The reason I’m not using inputs.conf with a blacklist is that the hosts sending these logs are managed by another company. They control the Universal Forwarders (UF) and their input configurations, meaning we don’t have access to modify them. However, we still need to mask and drop these logs at our end.

0 Karma