Re: remove specific data from csv data

chchanda · ‎07-16-2021

Hi There,

I have ingested a CSV file via a Splunk UF, and I want to remove certain events in which two fields contain the same value. For example, field1 = xyz, abc, pqr, ... and field2 = xyz.

I want to send an event to the null queue if field1 = xyz and field2 = xyz.

This is my props.conf :

[<sourcetype>]
CHARSET = UTF-8
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
LINE_BREAKER = ([\r\n]+)
INDEXED_EXTRACTIONS = csv
KV_MODE = none
category = Structured
disabled = false
pulldown_type = true

Any help would be appreciated. Thanks

richgalloway · ‎07-16-2021

Removing events must be done by an indexer or heavy forwarder. It can't be done by a UF.

You'll need a transform that tells Splunk which events to discard. See https://docs.splunk.com/Documentation/Splunk/8.2.1/Forwarding/Routeandfilterdatad#Filter_event_data_... and https://community.splunk.com/t5/Getting-Data-In/Filtering-events-using-NullQueue/m-p/66392 for how to do that.

---
If this reply helps you, Karma would be appreciated.

chchanda · ‎07-20-2021

@richgalloway, I am able to send the empty records to the null queue, but I also want to send a few more there (as described in my question above). I ingested the data using the UF itself.

richgalloway · ‎07-20-2021

Data that is already ingested can't be removed until it expires. You can hide events with the delete command, if you have that capability.

Since you already have the ability to send data to nullQueue, sending more there is just a matter of tweaking your regex or adding another transform.


chchanda · ‎07-20-2021

@richgalloway Here I cannot use a regex because the field1 and field2 values are the same. For example:

field1	field2
abc	abc
xyz	abc
pqr	abc
qwe	abc

I want to send the events where field1 = field2 to the null queue. Can you please suggest how?

richgalloway · ‎07-20-2021

Depending on the props.conf settings, the transform is likely applied before field extraction, so the regex should be based on the raw data rather than on individual fields.
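
As a rough illustration of what matching on raw data means here: for a CSV event, the raw text is the row itself (for example `xyz,xyz,...`), not `field1=xyz`. The sketch below covers the case from the original question (field1 = xyz and field2 = xyz). It assumes that field1 and field2 are the first two columns of each row and that values are not quoted. The stanza name is hypothetical.

```ini
# transforms.conf
# Hypothetical stanza name. Assumes field1 and field2 are the first two
# CSV columns and that the values are not quoted.
[eliminate-xyz_pairs]
# SOURCE_KEY defaults to _raw; it is spelled out here only for clarity.
SOURCE_KEY = _raw
# Match rows that begin with "xyz,xyz" followed by a comma or end of line.
REGEX = ^xyz,xyz(?:,|$)
DEST_KEY = queue
FORMAT = nullQueue
```

If the two fields sit elsewhere in the row, the pattern would need to skip the preceding columns first.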


chchanda · ‎07-20-2021

@richgalloway

This is my props.conf :

[<sourcetype>]
CHARSET = UTF-8
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
LINE_BREAKER = ([\r\n]+)
INDEXED_EXTRACTIONS = csv
KV_MODE = none
category = Structured
disabled = false
pulldown_type = true
TRANSFORMS-set = eliminate-null_data
description=Comma-separated value format. Set header and other settings in "Delimited Settings"

transforms.conf:

[eliminate-null_data]
REGEX=<my regex>
DEST_KEY = queue
FORMAT = nullQueue

Note: the data flow can be stopped if that is needed to send the desired data to the null queue.

richgalloway · ‎07-20-2021

I'm not sure how well Splunk supports it, but you can do matching in regex by using back references. This expression will match if field1 and field2 have the same value.

field1=(\w+), field2=\1
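
The expression above assumes raw text of the form `field1=abc, field2=abc`. A raw CSV row looks like `abc,abc,...` instead, so the same backreference idea might be adapted as in the sketch below. It assumes that field1 and field2 are the first two columns and that values are not quoted. The stanza name `eliminate-same_value` is made up for illustration.

```ini
# transforms.conf
# Hypothetical stanza. Assumes field1 and field2 are the first two CSV
# columns and that the values are not quoted.
[eliminate-same_value]
# Capture the first column, then require the second column to repeat it.
# The trailing (?:,|$) keeps a row such as "abc,abcd" from matching.
REGEX = ^([^,]+),\1(?:,|$)
DEST_KEY = queue
FORMAT = nullQueue

# props.conf, under the existing sourcetype stanza.
# Transforms in one class are listed comma-separated.
TRANSFORMS-set = eliminate-null_data, eliminate-same_value
```

With the sample rows from earlier in the thread, only `abc,abc` would be sent to the null queue, while `xyz,abc`, `pqr,abc` and `qwe,abc` would be kept.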

