Hi There,
I have ingested a CSV file via a Splunk UF and I want to remove certain events in which two fields contain the same value. For example, field1 = xyz, abc, pqr, ... and field2 = xyz.
I want to send an event to the null queue if field1 = xyz and field2 = xyz.
This is my props.conf :
[<sourcetype>]
CHARSET = UTF-8
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
INDEXED_EXTRACTIONS = csv
KV_MODE = none
category = Structured
disabled = false
pulldown_type = true
Any help would be appreciated. Thanks
Removing events must be done by an indexer or heavy forwarder. It can't be done by a UF.
You'll need a transform that tells Splunk which events to discard. See https://docs.splunk.com/Documentation/Splunk/8.2.1/Forwarding/Routeandfilterdatad#Filter_event_data_... and https://community.splunk.com/t5/Getting-Data-In/Filtering-events-using-NullQueue/m-p/66392 for how to do that.
richgalloway, I am able to send the empty records to the null queue, but I also want to send a few more there (as described in the question above). I ingested the data using the UF itself.
Data that is already ingested can't be removed until it expires. You can hide events with the delete command, if you have that capability.
Since you already have the ability to send data to nullQueue, sending more there is just a matter of tweaking your regex or adding another transform.
@richgalloway I cannot use a simple regex here because the condition is that the field1 and field2 values are the same. For example:
field1 | field2 |
abc | abc |
xyz | abc |
pqr | abc |
qwe | abc |
I want to send the events where field1 = field2 to the null queue. Can you please suggest how?
Depending on the props.conf settings, the transform likely is applied before field extractions so the regex should be based on the raw data rather than on individual fields.
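To make the raw-data point concrete, here is a small Python sketch (not Splunk configuration). It assumes the raw event is a CSV row whose first two columns are field1 and field2; the sample row and both patterns are made up for illustration.

```python
import re

# Hypothetical raw event as it would appear in the CSV file: field1,field2,other
raw_event = "xyz,xyz,42"

# A pattern written against field names finds nothing, because the
# names are not part of the raw text of a CSV row.
by_field_name = re.compile(r"field1=xyz")

# A pattern written against column positions matches the raw row.
by_position = re.compile(r"^xyz,xyz(?:,|$)")

print(bool(by_field_name.search(raw_event)))  # False
print(bool(by_position.search(raw_event)))    # True
```

The same reasoning would apply to whatever goes into REGEX in transforms.conf: it has to describe the text of the row, not the extracted field names.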
This is my props.conf :
[<sourcetype>]
CHARSET = UTF-8
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
INDEXED_EXTRACTIONS = csv
KV_MODE = none
category = Structured
disabled = false
pulldown_type = true
TRANSFORMS-set = eliminate-null_data
description=Comma-separated value format. Set header and other settings in "Delimited Settings"
transforms.conf:
[eliminate-null_data]
REGEX=<my regex>
DEST_KEY = queue
FORMAT = nullQueue
Note: the data flow can be stopped if necessary in order to send the desired data to the null queue.
I'm not sure how well Splunk supports it, but you can do matching in regex by using back references. This expression will match if field1 and field2 have the same value.
field1=(\w+), field2=\1
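As a rough check of the back-reference idea, the sketch below tries it in Python's re module. This only tests the pattern logic; it does not confirm how Splunk's index-time REGEX behaves. Two things here are my assumptions, not part of the answer above: the trailing \b on the key=value pattern (so that "abcd" is not treated as equal to "abc"), and the CSV variant, which assumes field1 and field2 are the first two columns and that values contain no commas or quotes.

```python
import re

# Key=value form, as in the expression above, plus \b so that
# "field2=abcd" does not count as equal to "field1=abc".
kv_same = re.compile(r"field1=(\w+), field2=\1\b")

# CSV form: assumes field1 and field2 are the first two columns and
# that values contain no commas or quotes.
csv_same = re.compile(r"^([^,]+),\1(?:,|$)")

# Sample rows taken from the table in the question.
rows = ["abc,abc", "xyz,abc", "pqr,abc", "qwe,abc"]
dropped = [row for row in rows if csv_same.search(row)]
print(dropped)  # ['abc,abc']

print(bool(kv_same.search("field1=abc, field2=abc")))   # True
print(bool(kv_same.search("field1=abc, field2=abcd")))  # False
```

If the two fields are not adjacent in the real file, the CSV pattern would need extra `[^,]*,` groups to skip the columns in between.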