Getting Data In

Yet another filtering problem. Many transforms in transforms.conf not filtering

neusse
Path Finder

I am trying to filter events with many transform stanzas. I believe everything is configured correctly, but ALL events get indexed; none go to the nullQueue. Please help! It seems easy enough, but I am not getting it.

As I understand it:

In my props.conf I have two sections. One is [mod_security], which does a few things for parsing the event and collects some fields.

The other is [source::/some/path/to/directory/]. It holds all the transform statements that check whether an event should be sent to the nullQueue.

What happens is that all events are indexed; none are filtered. I have also tried adding all the transform statements to the [mod_security] section, but the result is the same: everything is indexed.

Lastly, after filtering I want to check that a couple of strings exist in the event. If they do, index the event; if not, send it to the nullQueue.

Here are my props.conf and transforms.conf. Any help would be appreciated.


props.conf

[mod_security]
TRUNCATE = 0 
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = (--[a-z0-9]+-A--)
REPORT-get = get
REPORT-post = post
REPORT-severity = severity


[source::/nas/log/cache/httpd-www-80/]
TRANSFORMS-f1  = f1
TRANSFORMS-f2  = f2
TRANSFORMS-f3  = f3
TRANSFORMS-f4  = f4
TRANSFORMS-f5  = f5
TRANSFORMS-f6  = f6
TRANSFORMS-ok = null,ok

=========================================
transforms.conf


[get]
REGEX = (GET.+?)$
FORMAT = AA_get::$1

[post]
REGEX = (POST.+?)$
FORMAT = AA_post::$1

[severity]
REGEX = (severity.+?)\]
FORMAT = AA_severity::$1

[f1]
REGEX = (99\.99\.99\.38)
DEST_KEY = queue
FORMAT = nullQueue

[f2]
REGEX = .*404\sNot\sFound.*
DEST_KEY = queue
FORMAT = nullQueue

[f3]
REGEX = .*401\sUnauthorized.*
DEST_KEY = queue
FORMAT = nullQueue

[f4]
REGEX = .*500\sInternal\sServer\sError.*
DEST_KEY = queue
FORMAT = nullQueue

[f5]
REGEX = .*403\sForbidden.*
DEST_KEY = queue
FORMAT = nullQueue

[f6]
REGEX = FILTERED\sTO\s
DEST_KEY = queue
FORMAT = nullQueue

[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ok]
REGEX = (Pattern\smatch)|(Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue
 

neusse
Path Finder

Yes, there is a solution. I found nothing describing this anywhere on splunk.com.

My problem was that the index-time regex could not reach deep enough into the event. It was solved by setting LOOKAHEAD in transforms.conf. It turns out Splunk does not look very far into an event when applying REGEX at index time (the LOOKAHEAD default is 4096 characters). It seems to prioritize finding the end of the transaction.

Here are my working props.conf and transforms.conf:


transforms.conf

[nomore]
LOOKAHEAD = 100000
REGEX=(?m)(404\sNot\sFound)
DEST_KEY=queue
FORMAT=nullQueue


props.conf

[mod_security]
SHOULD_LINEMERGE = true
MUST_NOT_BREAK_AFTER = (--[a-z0-9]+-A--)
MUST_BREAK_AFTER = (--[a-z0-9]+-Z--)
TRUNCATE = 0
TRANSFORMS-notfounderror = nomore

At least it is working now!
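
For the original goal (discard everything except events containing certain strings), the same LOOKAHEAD setting can presumably be applied to the null/ok pair from the question. This is an untested sketch, not a confirmed configuration: it assumes the transforms are attached to the [mod_security] sourcetype as in the working config above, and that 100000 characters is enough to cover the longest event.

```
# transforms.conf (sketch)
# Send every event to the nullQueue first.
[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# Then route events containing either string back to the indexQueue.
# The LOOKAHEAD value is an assumption; size it to the longest expected event.
[ok]
LOOKAHEAD = 100000
REGEX = (?m)(Pattern\smatch|Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue

# props.conf (sketch)
[mod_security]
TRANSFORMS-ok = null,ok
```

Transforms listed in a single TRANSFORMS- class run left to right, so ok has to come after null for the matching events to be kept.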

gkanapathy
Splunk Employee

You probably need:

[source::/nas/log/cache/httpd-www-80/*]

instead of

[source::/nas/log/cache/httpd-www-80/]
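
If the log files sit in subdirectories below that path, the "..." wildcard may be a better fit than "*": in source:: stanzas, "*" does not cross directory separators while "..." does. A sketch, assuming the same directory and showing only one of the transforms from the question:

```
# props.conf (sketch)
# "..." matches any number of characters, including "/"
[source::/nas/log/cache/httpd-www-80/...]
TRANSFORMS-f1 = f1
```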

neusse
Path Finder

I made the change and it makes no difference. I still get ALL events indexed. This seems to be a widespread problem for folks. It would be nice if Splunk had a better method for filtering, since there is a lot of noise in many logs.

I am at an impasse with this. For the volume of data, we need to filter to make better use of analysis time and system resources.
