
Filter events on UF

chrisyounger
SplunkTrust

I have a data source of significant size, and I want to filter out a large percentage of the data on the UF so it isn't sent to the Splunk indexers. How can this be done?


chrisyounger
SplunkTrust

Yes, this is possible by setting force_local_processing = true in props.conf on the UF. The relevant entry in the props.conf spec file reads:

 

force_local_processing = <boolean>
* Forces a universal forwarder to process all data tagged with this sourcetype
locally before forwarding it to the indexers.
* Data with this sourcetype is processed by the linebreaker,
aggregator, and the regexreplacement processors in addition to the existing
utf8 processor.
* Note that switching this property potentially increases the cpu
and memory consumption of the forwarder.
* Applicable only on a universal forwarder.
* Default: false


You should carefully consider if this option is right for you before deploying it. Read and understand the warning in the spec file (above). By parsing on a UF you are creating a "special snowflake" in your environment where data is parsed somewhere unusual.


props.conf

[my_sourcetype]
# Use with caution. In most cases it's best to let the parsing occur on a Splunk Enterprise server
force_local_processing = true
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = ...
TIME_FORMAT = ...
TIME_PREFIX = ^
TRANSFORMS = my_sourcetype_dump_extra_events
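
To get a feel for what the LINE_BREAKER setting above does, here is a rough Python sketch. It is only an approximation under my own assumptions, not how Splunk implements it: the function name break_events and the sample text are made up for illustration. The point it shows is that the text matched by the first capture group is treated as the delimiter and is not part of either event.

```python
import re

# Same pattern as LINE_BREAKER in the props.conf stanza above.
LINE_BREAKER = re.compile(r"([\r\n]+)")

def break_events(raw):
    """Approximate event breaking: split on the capture group and drop it.

    Hypothetical helper for illustration only; the real linebreaker
    processor in Splunk has more behaviour than this.
    """
    # re.split keeps captured delimiters at the odd indexes, so every
    # second element (even indexes) is event text.
    parts = LINE_BREAKER.split(raw)
    return [p for p in parts[::2] if p]

# Made-up sample input with mixed line endings.
sample = "event one\r\nevent two\n\nevent three\n"
events = break_events(sample)
print(events)
```

With SHOULD_LINEMERGE = false, each piece produced this way is treated as one event, which is why the regexes in transforms.conf only ever see a single line at a time.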


transforms.conf

[my_sourcetype_dump_extra_events]
REGEX = discard_events_that_match_this_regex
DEST_KEY = queue
FORMAT = nullQueue

Note that if you want to discard (nullQueue) all events EXCEPT those that match a regular expression, the usual documented method won't work, as far as my testing has revealed: https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Filter_event_data...

You will instead need to use a negative lookahead assertion in REGEX, like so:

[my_sourcetype_dump_extra_events]
REGEX = ^((?!keep_events_that_match_this_regex).)*$
DEST_KEY = queue
FORMAT = nullQueue
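
If you want to sanity-check the negative lookahead before deploying it, a quick sketch in Python can help. It is not Splunk itself: Splunk uses PCRE, and I am assuming Python's re module behaves the same for this simple pattern. The keep pattern "ERROR" and the sample events are made up, and is_discarded is a hypothetical helper. It also assumes single-line events, as produced by the props.conf settings above.

```python
import re

# Hypothetical stand-in for keep_events_that_match_this_regex.
KEEP_PATTERN = r"ERROR"

# Same shape as the REGEX in the transforms.conf stanza above: it matches
# only when the keep pattern appears nowhere in the line.
DISCARD_REGEX = re.compile(r"^((?!" + KEEP_PATTERN + r").)*$")

def is_discarded(event):
    """True if the transform's REGEX matches, i.e. the event would go to nullQueue."""
    return DISCARD_REGEX.search(event) is not None

# Made-up sample events.
events = [
    "2024-01-01 12:00:00 INFO heartbeat ok",
    "2024-01-01 12:00:01 ERROR disk full",
    "2024-01-01 12:00:02 DEBUG cache miss",
]

kept = [e for e in events if not is_discarded(e)]
print(kept)
```

Only the event containing the keep pattern survives; everything else matches the REGEX and would be sent to the nullQueue.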

In my testing, discarding events on UFs using force_local_processing and a negative lookahead caused no measurable increase in CPU, memory, disk I/O, or network traffic. I used the query below to check how much data was being sent from the UF to the indexers, and it showed a huge reduction:

| mstats sum(spl.mlog.tcpin_connections.kb) as kb where index=_metrics group="tcpin_connections" fwdType="uf" hostname=UF_NAME span=5m | timechart span=5m sum(kb)

 

