Yet another filtering problem. Many transforms in transforms.conf not filtering

neusse
Path Finder

I am trying to filter events with several transform statements. I believe everything is configured correctly, but ALL events are being indexed; none are going to the nullQueue. Please help! It seems easy enough, but I am not getting this.

As I understand it:

In my props.conf I have two stanzas. One is [mod_security], which handles line breaking for the event and extracts some fields.

Also in my props.conf I have a [source::/some/path/to/directory/] stanza. It holds all the transform statements that decide whether an event should be sent to the nullQueue.

What happens is that all events are indexed; none are filtered. I have also tried adding all the transform statements to the [mod_security] stanza, but the result is the same: everything is indexed.

Finally, I want to check that a couple of strings exist in the event after filtering. If they do, index the event; if not, send it to the nullQueue.

Here are my props.conf and transforms.conf. Any help would be appreciated.


props.conf

[mod_security]
TRUNCATE = 0 
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = (--[a-z0-9]+-A--)
REPORT-get = get
REPORT-post = post
REPORT-severity = severity


[source::/nas/log/cache/httpd-www-80/]
TRANSFORMS-f1  = f1
TRANSFORMS-f2  = f2
TRANSFORMS-f3  = f3
TRANSFORMS-f4  = f4
TRANSFORMS-f5  = f5
TRANSFORMS-f6  = f6
TRANSFORMS-ok = null,ok

=========================================
transforms.conf


[get]
REGEX = (GET.+?)$
FORMAT = AA_get::$1

[post]
REGEX = (POST.+?)$
FORMAT = AA_post::$1

[severity]
REGEX = (severity.+?)\]
FORMAT = AA_severity::$1

[f1]
REGEX = (99\.99\.99\.38)
DEST_KEY = queue
FORMAT = nullQueue

[f2]
REGEX = .*404\sNot\sFound.*
DEST_KEY = queue
FORMAT = nullQueue

[f3]
REGEX = .*401\sUnauthorized.*
DEST_KEY = queue
FORMAT = nullQueue

[f4]
REGEX = .*500\sInternal\sServer\sError.*
DEST_KEY = queue
FORMAT = nullQueue

[f5]
REGEX = .*403\sForbidden.*
DEST_KEY = queue
FORMAT = nullQueue

[f6]
REGEX = FILTERED\sTO\s
DEST_KEY = queue
FORMAT = nullQueue

[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ok]
REGEX = (Pattern\smatch)|(Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue
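As a sanity check on the intended logic of the null/ok pair above, here is a small Python sketch (illustrative only, not Splunk internals): transforms listed together run in order, and the last matching transform's FORMAT wins for queue routing, so every event is first sent to the nullQueue and then events matching the "keep" patterns are rescued back to the indexQueue.

```python
import re

# Stand-in for the [null] and [ok] transforms: (REGEX, queue destination),
# applied in the order given by "TRANSFORMS-ok = null,ok".
TRANSFORMS = [
    (re.compile(r"."), "nullQueue"),                                      # [null]: matches everything
    (re.compile(r"(Pattern\smatch)|(Matched\ssignature)"), "indexQueue"),  # [ok]: rescue keepers
]

def route(event: str) -> str:
    queue = "indexQueue"  # events are index-bound unless a transform redirects them
    for regex, destination in TRANSFORMS:
        if regex.search(event):
            queue = destination  # the last matching transform wins
    return queue

print(route("mod_security: Pattern match at ARGS"))  # indexQueue
print(route("mod_security: nothing interesting"))    # nullQueue
```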
 

neusse
Path Finder

Yes, there is a solution. I found nothing describing this anywhere on splunk.com.

My issue of not being able to search deep enough into the event at index time was solved with the simple LOOKAHEAD setting in transforms.conf. It turns out Splunk only applies the transform REGEX to the beginning of each event when indexing (the default LOOKAHEAD is 4096 characters); it seems to prioritize finding the end of the transaction over scanning the whole event.

Here are my working props.conf and transforms.conf:


transforms.conf

[nomore]
LOOKAHEAD = 100000
REGEX=(?m)(404\sNot\sFound)
DEST_KEY=queue
FORMAT=nullQueue


props.conf

[mod_security]
SHOULD_LINEMERGE = true
MUST_NOT_BREAK_AFTER = (--[a-z0-9]+-A--)
MUST_BREAK_AFTER = (--[a-z0-9]+-Z--)
TRUNCATE = 0
TRANSFORMS-notfounderror = nomore

At least it is working now!
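The effect of LOOKAHEAD can be illustrated outside Splunk. By default the transform REGEX only sees roughly the first 4096 characters of each event, so a match sitting deeper inside a long merged mod_security event is never even tested. A rough Python sketch of that behavior (illustrative only, not Splunk internals):

```python
import re

def transform_matches(event: str, pattern: str, lookahead: int = 4096) -> bool:
    # Mimics the transforms.conf behavior: the REGEX is only applied
    # to the first `lookahead` characters of the event.
    return re.search(pattern, event[:lookahead]) is not None

# A long merged mod_security event where the status text sits past 4096 chars.
event = "--abc123-A--\n" + "X" * 5000 + "\n404 Not Found\n--abc123-Z--"

print(transform_matches(event, r"404\sNot\sFound"))                    # False: beyond the default window
print(transform_matches(event, r"404\sNot\sFound", lookahead=100000))  # True: larger LOOKAHEAD covers it
```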


gkanapathy
Splunk Employee

You probably need:

[source::/nas/log/cache/httpd-www-80/*]

instead of

[source::/nas/log/cache/httpd-www-80/]
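One way to see why the trailing wildcard matters: source:: stanzas are matched against the full source path of each file, so a stanza ending in a bare `/` matches only that exact string, never a file inside the directory. A rough illustration using shell-style glob matching (an approximation of Splunk's stanza matching, not its actual implementation):

```python
from fnmatch import fnmatch

# Full source path of a monitored log file inside the directory.
source = "/nas/log/cache/httpd-www-80/access.log"

print(fnmatch(source, "/nas/log/cache/httpd-www-80/"))   # False: no file matches a bare directory path
print(fnmatch(source, "/nas/log/cache/httpd-www-80/*"))  # True: the wildcard matches files inside
```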

neusse
Path Finder

I made the change and it makes no difference. I still get ALL events indexed. This seems to be a widespread problem for folks; it would be nice if Splunk had a better method for filtering, since there is a lot of noise in many logs.

I am at an impasse with this. Given our data volume, we need to filter to make better use of analysis time and system resources.
