Getting Data In

Mask a particular field in CSV data at index time

divyanshukakwan
Explorer

I have CSV data that contains sensitive information such as client IP addresses. Here is what one row of the data looks like:

David, London,...several more columns...,192.168.0.1

What I want is to mask the IP replacing it with the string "XXXXXXX" so that it produces, for the above row:

David, London, ...several more columns..., XXXXXXX

Also, this operation needs to be performed at index-time.

I have tried setting up a transform in props.conf and transforms.conf:

[source::data.csv]
TRANSFORMS-masking = pii-mask

[pii-mask]
REGEX = .*
FORMAT = ClientIP::XXXXXX
SOURCE_KEY = ClientIP
DEST_KEY = ClientIP

However, even after doing this, the IP still shows up in the indexed events. Can anybody tell me how to fix this?

It seems to me that the fields have not been extracted when the transforms are run. If this is the case, how should I get extraction done before transformation?

Edit:
One of the columns in the data is an address. This field can contain an arbitrary number of commas, for example: "#221, Baker Street, London, England". So I can't use a simple comma-counting regular expression in sed. What I want to know is how to run a transform on an extracted field rather than on the _raw field.


493669
Super Champion

Hey,
If you want to mask the extracted field at search time (when results are displayed), you can use the SPL query below:

<base search>| replace * WITH XXXXXX IN ClientIP

divyanshukakwan
Explorer

I want to do the masking at index time, not at search time.


somesoni2
Revered Legend

Try this. It needs the position at which the field appears in the CSV row; I'm assuming the IP is the 15th column, so adjust the count of 14 preceding columns accordingly.

[source::data.csv]
SEDCMD-masking = s/^(([^\,]+,){14})(\d+\.\d+\.\d+\.\d+)/\1XX.XX.XX.XX/
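
As a quick way to sanity-check the expression above outside Splunk, here is a minimal Python sketch of the same substitution. The function name `mask_nth_field_ip`, the sample rows, and the shortened column count are hypothetical and only for illustration; Python's `re` module is not the engine Splunk uses, so treat this as an approximation of the behaviour rather than a guarantee.

```python
import re

def mask_nth_field_ip(row: str, fields_before_ip: int = 14) -> str:
    """Mask an IPv4 address that follows a fixed number of comma-terminated fields.

    Mirrors s/^(([^,]+,){14})(\\d+\\.\\d+\\.\\d+\\.\\d+)/\\1XX.XX.XX.XX/ with the
    column count made configurable (hypothetical helper, not a Splunk API).
    """
    pattern = r"^((?:[^,]+,){%d})(\d+\.\d+\.\d+\.\d+)" % fields_before_ip
    return re.sub(pattern, r"\1XX.XX.XX.XX", row)

# Hypothetical three-column row: two fields precede the IP.
simple_row = "David,London,192.168.0.1"
masked = mask_nth_field_ip(simple_row, fields_before_ip=2)

# Hypothetical row whose quoted address contains commas: counting commas
# no longer lines up with the columns, so the pattern does not match and
# the row is returned unchanged.
quoted_row = 'David,"#221, Baker Street, London, England",192.168.0.1'
unmasked = mask_nth_field_ip(quoted_row, fields_before_ip=2)
```

The second case illustrates why a comma-counting pattern is fragile when an earlier column can itself contain commas.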

divyanshukakwan
Explorer

Yes, this will work. However, I want to operate on the extracted field rather than on the _raw key. Is there any way to use an extracted field in the SOURCE_KEY attribute?


493669
Super Champion

Try adapting the anonymization examples from the docs here: https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...

transforms.conf -

[ClientIP-anonymizer]
REGEX = (?m)^(.*)ClientIP=\d+\.\d+\.\d+\.\d+(.*)$
FORMAT = $1ClientIP=########$2
DEST_KEY = _raw

props.conf -

[source::data.csv]
TRANSFORMS-anonymize = ClientIP-anonymizer

Hope this helps!
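
One thing worth checking before deploying the transform above: its REGEX looks for the literal text `ClientIP=` in the raw event, which a plain CSV row would not contain. The Python sketch below illustrates this, and also tries a hypothetical alternative pattern that anchors on the end of the line. That alternative assumes the IP is always the last column, as in the sample row from the question, and the names `TRAILING_IP` and `mask_trailing_ip` are made up for this example. Python's regex engine is only a stand-in for Splunk's here.

```python
import re

# The REGEX from the transform above, unchanged.
ANSWER_REGEX = re.compile(r"(?m)^(.*)ClientIP=\d+\.\d+\.\d+\.\d+(.*)$")

# Hypothetical alternative: match an IPv4 address in the final column only.
# Assumes the IP is always the last field of the row.
TRAILING_IP = re.compile(r"^(.*,)\s*\d{1,3}(?:\.\d{1,3}){3}\s*$")

def mask_trailing_ip(row: str) -> str:
    """Replace an IPv4 address in the last column with XXXXXXX."""
    return TRAILING_IP.sub(r"\1XXXXXXX", row)

# Hypothetical CSV row with commas inside a quoted address.
csv_row = 'David,"#221, Baker Street, London, England",192.168.0.1'

# The transform's regex finds nothing to rewrite in a plain CSV row.
matches_csv = ANSWER_REGEX.search(csv_row) is not None

# It does match key=value style events (hypothetical sample).
kv_row = "name=David ClientIP=192.168.0.1 city=London"
kv_masked = ANSWER_REGEX.sub(r"\1ClientIP=########\2", kv_row)

# The end-anchored pattern is unaffected by commas in earlier columns.
csv_masked = mask_trailing_ip(csv_row)
```

If the last-column assumption holds for your data, a similar end-anchored expression could be tried in a SEDCMD or a `DEST_KEY = _raw` transform, but that is untested here and should be verified on a sample index first.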
