Yet another filtering problem. Many transforms in transforms.conf not filtering

Path Finder

I am trying to filter with many transform statements. I believe everything is configured correctly. But I get ALL events indexed. None are going to the nullQueue. Please help! It seems easy enough but I am not getting this.

As I understand it:

In my props.conf I have two sections. One is [mod_security]; it does a few things for parsing the event and collects some fields.

Also in my props.conf I have a [source::/some/path/to/directory/] section. This has all the transform statements that check whether an event should be sent to the nullQueue.

What happens is that all events are indexed. None are filtered. I have also tried adding all the transform statements to the [mod_security] section, but I get the same result: everything is indexed.

Last, I want to check that a couple of strings exist in the event after filtering. If they do, index the event. If not, send the event to the nullQueue.

Here are my props.conf and transforms.conf. Any help would be appreciated.


props.conf

[mod_security]
TRUNCATE = 0 
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = (--[a-z0-9]+-A--)
REPORT-get = get
REPORT-post = post
REPORT-severity = severity


[source::/nas/log/cache/httpd-www-80/]
TRANSFORMS-f1  = f1
TRANSFORMS-f2  = f2
TRANSFORMS-f3  = f3
TRANSFORMS-f4  = f4
TRANSFORMS-f5  = f5
TRANSFORMS-f6  = f6
TRANSFORMS-ok = null,ok

=========================================
transforms.conf


[get]
REGEX = (GET.+?)$
FORMAT = AA_get::$1

[post]
REGEX = (POST.+?)$
FORMAT = AA_post::$1

[severity]
REGEX = (severity.+?)\]
FORMAT = AA_severity::$1

[f1]
REGEX = (99\.99\.99\.38)
DEST_KEY = queue
FORMAT = nullQueue

[f2]
REGEX = .*404\sNot\sFound.*
DEST_KEY = queue
FORMAT = nullQueue

[f3]
REGEX = .*401\sUnauthorized.*
DEST_KEY = queue
FORMAT = nullQueue

[f4]
REGEX = .*500\sInternal\sServer\sError.*
DEST_KEY = queue
FORMAT = nullQueue

[f5]
REGEX = .*403\sForbidden.*
DEST_KEY = queue
FORMAT = nullQueue

[f6]
REGEX = FILTERED\sTO\s
DEST_KEY = queue
FORMAT = nullQueue

[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ok]
REGEX = (Pattern\smatch)|(Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue
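
The intent of the null,ok pair above can be illustrated with a small Python model. This is only a sketch, assuming that transforms listed on one TRANSFORMS- line run left to right and that each matching transform overwrites the queue key. The names route and TRANSFORMS are hypothetical and are not Splunk APIs.

```python
import re

# Simplified, hypothetical model of index-time queue routing (not Splunk code).
# Each entry: (stanza name, REGEX, FORMAT written to DEST_KEY = queue).
TRANSFORMS = [
    ("null", re.compile(r"."), "nullQueue"),
    ("ok", re.compile(r"(Pattern\smatch)|(Matched\ssignature)"), "indexQueue"),
]


def route(event, transforms=TRANSFORMS, default="indexQueue"):
    """Apply transforms in list order; every match overwrites the queue."""
    queue = default
    for _name, regex, dest in transforms:
        if regex.search(event):
            queue = dest
    return queue


# Made-up sample events for illustration only.
kept = route("Message: Pattern match \"select\" at ARGS:id")
dropped = route("GET /index.html HTTP/1.1")
```

In this model, reversing the list to ok,null would send every event to the nullQueue, because the catch-all "." would be the last transform to match.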
 
1 Solution

Path Finder

Yes, there is a solution. I found nothing anywhere describing this on splunk.com.

My problem was that the index-time REGEX was not searching deep enough into the event. It was solved by setting LOOKAHEAD in transforms.conf. It turns out Splunk does not look very far into an event when applying a REGEX at index time (LOOKAHEAD defaults to 4096 characters), so a match deep inside a long multi-line event is never seen. It seems to only be looking for the end of the transaction as a priority.
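
The effect of LOOKAHEAD can be sketched in Python. This is a simplified model, not Splunk's implementation: it assumes the REGEX is searched only within the first LOOKAHEAD characters of the event. The function name and the sample event are made up for illustration.

```python
import re

# Default LOOKAHEAD value from transforms.conf.
DEFAULT_LOOKAHEAD = 4096


def matches_within_lookahead(event, pattern, lookahead=DEFAULT_LOOKAHEAD):
    """Search only the first `lookahead` characters of the event."""
    return re.search(pattern, event[:lookahead]) is not None


# Hypothetical multi-line audit event: the status line sits about 7 KB
# after the start of the event, well beyond the default window.
event = (
    "--abc123-A--\n"
    + "X-Filler: padding\n" * 400
    + "HTTP/1.1 404 Not Found\n"
    + "--abc123-Z--\n"
)

default_hit = matches_within_lookahead(event, r"404\sNot\sFound")
raised_hit = matches_within_lookahead(event, r"404\sNot\sFound", lookahead=100000)
```

With the default window the pattern is not found, so the event would never be routed to the nullQueue; with the larger window it is found.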

Here are my working props.conf and transforms.conf:


transforms.conf

[nomore]
LOOKAHEAD = 100000
REGEX=(?m)(404\sNot\sFound)
DEST_KEY=queue
FORMAT=nullQueue


props.conf

[mod_security]
SHOULD_LINEMERGE = true
MUST_NOT_BREAK_AFTER = (--[a-z0-9]+-A--)
MUST_BREAK_AFTER = (--[a-z0-9]+-Z--)
TRUNCATE = 0
TRANSFORMS-notfounderror = nomore

At least it is working now!

Splunk Employee

You probably need:

[source::/nas/log/cache/httpd-www-80/*]

instead of

[source::/nas/log/cache/httpd-www-80/]
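
Why the trailing wildcard matters can be sketched with a rough Python model of source:: matching. It assumes the pattern must match the entire source path, that * matches anything except a path separator, and that ... matches any characters including separators. The helper names and the file name modsec_audit.log are hypothetical.

```python
import re


def source_pattern_to_regex(pattern):
    """Translate a source:: pattern into a regex (simplified model)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")      # "..." crosses directory boundaries
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")   # "*" stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def source_matches(pattern, source):
    """True only if the pattern matches the whole source path."""
    return source_pattern_to_regex(pattern).fullmatch(source) is not None


# Hypothetical log file inside the monitored directory.
log_file = "/nas/log/cache/httpd-www-80/modsec_audit.log"

without_wildcard = source_matches("/nas/log/cache/httpd-www-80/", log_file)
with_wildcard = source_matches("/nas/log/cache/httpd-www-80/*", log_file)
```

Under these assumptions the directory-only pattern never matches a file inside the directory, while the pattern ending in * does.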

Path Finder

I made the change and it makes no difference. I still get ALL events indexed. It seems this is a widespread problem for folks. It would be nice if Splunk had a better method for filtering, since there is a lot of noise in many logs.

I am at an impasse with this. For the volume of data, we need to filter to make better use of analysis time and system resources.
