Splunk Search

SEDCMD regular expression question

adamfrisbee
Explorer

Okay you regexperts, I need some help. I have a .csv file in which I need to mask the credit card numbers. Here is what it looks like (all data, including the cc number, is fake):

user,first_name,last_name,email,cc_type,cc_no
bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423

I've been trying to build my own regex expression, but with no luck. I would just like to replace the credit card number with xxxx. Any help would be greatly appreciated!

0 Karma
1 Solution

woodcock
Esteemed Legend

Like this:

[csv]
SEDCMD-YourSourcetypeHere_obscure_CCs = s/\d+$/xxxx/g
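
To see what an end-anchored substitution like this does to the sample row, here is a minimal sketch. It uses Python's re module as a stand-in for Splunk's sed-style engine, which is an assumption: the two may differ in details. It also shows that the replacement side of a substitution is plain text, so a quantifier written there is copied into the output rather than expanded.

```python
import re

# Sample row and header from the question (fake data).
row = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"
header = "user,first_name,last_name,email,cc_type,cc_no"

# Same pattern as the SEDCMD: one or more digits anchored to the end of the event.
pattern = re.compile(r"\d+$")

# Mask spelled out as literal text.
masked = pattern.sub("xxxx", row)

# A quantifier in the replacement is not expanded; it is copied verbatim.
literal = pattern.sub("x{4}", row)

# The header has no trailing digits, so it is left alone.
untouched = pattern.sub("xxxx", header)

print(masked)
print(literal)
print(untouched)
```

Only the trailing run of digits is replaced; the digits inside the user name and email are not, because they are not followed by the end of the line.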

0 Karma

adamfrisbee
Explorer

Neither of these mask the data, though. I must be doing something wrong. This is my props.conf

[csv]
SEDCMD-mask = s/\d+$/x{4}/g

0 Karma

woodcock
Esteemed Legend

If you are sure that your settings are correct, it must be something else. If you are doing a sourcetype override/overwrite, you must use the ORIGINAL value, NOT the new value. You must deploy your settings to the first full instance(s) of Splunk that handle the events (usually either the HF tier if you use one, or else your Indexer tier) UNLESS you are using HEC's JSON endpoint (it gets pre-cooked) or INDEXED_EXTRACTIONS (configs go on the UF in that case), then restart all Splunk instances there. When (re)evaluating, you must send in new events (old events will stay broken), then test using _index_earliest=-5m to be absolutely certain that you are only examining the newly indexed events.

0 Karma

adamfrisbee
Explorer

It seems to be masking it when I look at the raw data, but I can still, for example, do | table cc_no and display all the CC numbers.

0 Karma

adamfrisbee
Explorer

Hi @woodcock,

I have verified that the incoming data hits an HF first and is then forwarded to a search head. When the data gets to the search head, I can see that the cc number is replaced in the raw event (when I "show source" it does not show the cc number). However, cc_no still shows up as a field with populated values. In the images below, I've replaced the cc number with the string "secret" using your recommended sed. The first image is the raw data.

alt text

alt text

0 Karma

adamfrisbee
Explorer

Okay, I didn't have the inputs.conf stanza configured correctly. Thanks for your help.

0 Karma

outis
New Member

If your credit card numbers are not always 16 digits long, you can try replacing:
SEDCMD-cc_replacement = s/\,(\d{16})/xxxx/g
with:
SEDCMD-cc_replacement = s/\,(\d+)/\,xxxx/g

This follows what oscar84x said.
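
As a rough illustration of why the digit count matters, the sketch below compares a fixed 16-digit pattern with an any-length one. Python's re module stands in for Splunk's sed engine (an assumption), the second row and its 15-digit number are invented for this example, and both substitutions keep the leading comma so that only the length handling differs.

```python
import re

# First row is the sample from the question; the second row and its
# 15-digit number are hypothetical, made up for illustration only.
rows = [
    "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423",
    "jdoe1,Jane,Doe,jdoe1@example.com,amex,123456789012345",
]

# Fixed length: matches only a comma followed by 16 digits.
fixed_len = [re.sub(r",(\d{16})", ",xxxx", r) for r in rows]

# Any length: matches a comma followed by one or more digits.
any_len = [re.sub(r",(\d+)", ",xxxx", r) for r in rows]

for before, fixed, flexible in zip(rows, fixed_len, any_len):
    print(before)
    print("  fixed:   ", fixed)
    print("  flexible:", flexible)
```

One caveat with the any-length pattern: it masks every field that starts with a digit right after a comma, so it is only safe when no other column holds numeric values.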

0 Karma

oscar84x
Contributor

Try this SEDCMD in props.conf under your sourcetype; you could also specify it by host or source. It takes the 16-digit number and replaces it with xxx.

SEDCMD-cc_replacement = s/\,(\d{16})/xxx/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata

0 Karma

adamfrisbee
Explorer

Thank you! That seemed to partially work. It's masking it in some places.

[image not preserved]

0 Karma

woodcock
Esteemed Legend

Yours drops the last comma.
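
A quick way to see the dropped comma is to run the same substitution outside Splunk. This is only a sketch: Python's re module is assumed to behave like Splunk's sed engine for this simple pattern, and the variable names are illustrative.

```python
import re

# Sample row from the question (fake data).
row = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"

# The comma is part of the match, so a replacement without it removes the delimiter.
dropped = re.sub(r",(\d{16})", "xxx", row)

# Putting the comma back into the replacement keeps the CSV shape intact.
kept = re.sub(r",(\d{16})", ",xxx", row)

print(dropped, len(dropped.split(",")))
print(kept, len(kept.split(",")))
```

With the comma dropped, the row collapses from six fields to five, and the mask is glued onto the cc_type value.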

0 Karma