How to configure props.conf and transforms.conf to filter out events from web logs before getting indexed?

I am attempting to filter health checks out of our web logs. I am using the props.conf / transforms.conf method on the indexers to filter them out. I have added all of the necessary parameters and restarted the system, but the events are still appearing. This environment is a cluster, so I am pushing the configurations from the cluster master. See the props / transforms entries below.

props.conf:

[www_stg] 
TRANSFORMS-hck = webHealthCheckFilterStg 

transforms.conf:

[webHealthCheckFilterStg] 
    REGEX = 10.207.*.* 
    REGEX = 10.207.*.* 
    REGEX = 10.207.*.*
    REGEX = 10.207.*.* 
    REGEX = myHealtchCheckUser
    REGEX = (h|H)ealth(c|C)heck) 
    DEST_KEY = queue 
    FORMAT = nullQueue

I don't see the events getting filtered, so I'm assuming I have a syntax error somewhere. Am I doing something wrong?

Splunk Employee

First, I have three remarks on the regexes:

  • You cannot put more than one REGEX in a single transforms.conf stanza.
  • Your regexes are not correct: you need to escape the dots, and you may want to specify that you expect digits:
    REGEX=10\.207\.\d+\.\d+

  • You should group all of your regexes into a single one with OR conditions (|). This can be faster than creating a new transform per regex.
    Example: REGEX=(myHealtchCheckUser|10\.207\.\d+\.\d+)
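
Putting these remarks together, the transform from the question might be consolidated along the following lines. This is a sketch, not a tested configuration. It reuses the stanza names and the user name from the question (spelled exactly as posted), rewrites the health-check alternative with character classes, and drops the stray closing parenthesis from the original (h|H)ealth(c|C)heck) line, which would otherwise leave that expression unbalanced. Lines starting with # are comments and are kept on their own lines.

```ini
# props.conf
[www_stg]
TRANSFORMS-hck = webHealthCheckFilterStg

# transforms.conf
[webHealthCheckFilterStg]
# A single REGEX per stanza; the alternatives are joined with |
# Dots are escaped so they match a literal "." and \d+ matches each octet
REGEX = (myHealtchCheckUser|[hH]ealth[cC]heck|10\.207\.\d+\.\d+)
DEST_KEY = queue
FORMAT = nullQueue
```

Because the pattern is unanchored, it sends to the nullQueue any event whose raw text contains one of the alternatives anywhere, so it is worth checking against a sample of real events before deploying.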

And finally, double-check that:

  • the events are not sent from a heavy forwarder that has already parsed them
  • the sourcetype does not have any INDEXED_EXTRACTIONS rules, which would cause the events to be parsed on the forwarders

In both cases, try putting the props/transforms on the forwarder.
