Splunk Search

SEDCMD regular expression question

adamfrisbee
Explorer

Okay you regexperts, I need some help. I have a .csv file in which I need to mask the credit card numbers. Here is what it looks like (all of the data, including the cc number, is fake):

user,first_name,last_name,email,cc_type,cc_no
bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423

I've been trying to build my own regex, but with no luck. I would just like to replace the credit card number with xxxx. Any help would be greatly appreciated!

1 Solution

woodcock
Esteemed Legend

Like this:

[csv]
SEDCMD-YourSourcetypeHere_obscure_CCs = s/\d+$/xxxx/g
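
As a rough check of what an end-anchored substitution like this does to the sample row, here is a minimal sketch in Python. Python's re module is only standing in for Splunk's regex engine, which is an assumption, and the helper name mask_trailing_digits is made up for illustration. It also shows that the replacement side of a sed-style expression is literal text, so a quantifier such as {4} is not expanded there.

```python
import re

# Header and data row copied from the question (fake data).
HEADER = "user,first_name,last_name,email,cc_type,cc_no"
SAMPLE = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"


def mask_trailing_digits(event: str, replacement: str = "xxxx") -> str:
    """Replace the run of digits at the very end of the event (pattern \\d+$)."""
    return re.sub(r"\d+$", replacement, event)


# Only the trailing number is touched; the "0" in "bfiltness0" is not at the end.
print(mask_trailing_digits(SAMPLE))

# The replacement is taken literally, so this ends in "x{4}", not "xxxx".
print(mask_trailing_digits(SAMPLE, "x{4}"))

# The header row has no trailing digits and passes through unchanged.
print(mask_trailing_digits(HEADER))
```

Because the pattern is anchored with $, it only works when the card number is the last field of the event.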


adamfrisbee
Explorer

Neither of these mask the data, though. I must be doing something wrong. This is my props.conf

[csv]
SEDCMD-mask = s/\d+$/x{4}/g

woodcock
Esteemed Legend

If you are sure that your settings are correct, it must be something else. If you are doing a sourcetype override/overwrite, you must use the ORIGINAL value, NOT the new value. You must deploy your settings to the first full instance(s) of Splunk that handle the events (usually either the HF tier if you use one, or else your Indexer tier) UNLESS you are using HEC's JSON endpoint (it gets pre-cooked) or INDEXED_EXTRACTIONS (configs go on the UF in that case), then restart all Splunk instances there. When (re)evaluating, you must send in new events (old events will stay broken), then test using _index_earliest=-5m to be absolutely certain that you are only examining the newly indexed events.


adamfrisbee
Explorer

It seems to be masking it when I look at the raw data, but I can still, for example, do | table cc_no and display all the CC numbers.


adamfrisbee
Explorer

Hi @woodcock,

I have verified that the data coming in hits a HF first, which then forwards it to a search head. When the data gets to the search head, I can see that the cc number is replaced in the raw event (when I "show source", it does not show the cc number). However, cc_no still shows up as a field with populated values. In the images below, I've replaced the cc number with the string "secret" using your recommended sed. The first image is the raw data.

[Two screenshots were attached here; they are not preserved in this copy.]


adamfrisbee
Explorer

Okay, I didn't have the inputs.conf stanza configured correctly. Thanks for your help.


outis
New Member

If your credit card numbers are not always 16 digits long, you can try replacing:
SEDCMD-cc_replacement = s/\,(\d{16})/xxxx/g
with:
SEDCMD-cc_replacement = s/\,(\d+)/\,xxxx/g

This follows what oscar84x said.
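
To illustrate the difference in digit count, here is a small sketch in Python. The re module is used as an approximation of Splunk's regex handling, which is an assumption. ROW_15 is a hypothetical row invented for this example and sed_sub is an illustrative helper, not a Splunk function. The replacement uses a plain comma because Python does not need it escaped.

```python
import re

# Row from the question (16 digits) and a hypothetical row with a 15-digit number.
ROW_16 = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"
ROW_15 = "jdoe1,Jane,Doe,jdoe1@example.com,americanexpress,378282246310005"


def sed_sub(pattern: str, replacement: str, event: str) -> str:
    """Apply a global substitution, like s/pattern/replacement/g."""
    return re.sub(pattern, replacement, event)


# The fixed-length pattern needs exactly 16 digits after the comma,
# so the 15-digit row is left unmasked.
print(sed_sub(r",(\d{16})", "xxxx", ROW_15))

# The variable-length pattern masks both rows and puts the comma back.
for row in (ROW_16, ROW_15):
    print(sed_sub(r",(\d+)", ",xxxx", row))
```

One caution with the \d+ version: with the g flag it will rewrite the leading digits of any field that follows a comma, so it is only safe when the card number is the only field that starts with a digit.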


oscar84x
Contributor

Try this SEDCMD in your props.conf under your sourcetype; you could also specify it by host or source. It takes the 16-digit number and replaces it with xxx.

SEDCMD-cc_replacement = s/\,(\d{16})/xxx/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata


adamfrisbee
Explorer

Thank you! That seemed to partially work. It's masking it in some places.


woodcock
Esteemed Legend

Yours drops the last comma.
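
The comma disappears because it is part of the match but not part of the replacement. A minimal Python sketch of the effect, and of one possible way to keep the delimiter by capturing it and writing it back with \1, is below. Python's re module is assumed to be close enough to Splunk's behavior here, and both function names are made up for illustration.

```python
import re

SAMPLE = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"


def mask_dropping_comma(event: str) -> str:
    """Like s/\\,(\\d{16})/xxx/g: the comma is matched and then discarded."""
    return re.sub(r",(\d{16})", "xxx", event)


def mask_keeping_comma(event: str) -> str:
    """Like s/(,)\\d{16}/\\1xxx/g: the captured comma is written back."""
    return re.sub(r"(,)\d{16}", r"\1xxx", event)


dropped = mask_dropping_comma(SAMPLE)
kept = mask_keeping_comma(SAMPLE)

print(dropped, len(dropped.split(",")))  # cc_type and the mask are fused into one field
print(kept, len(kept.split(",")))        # all six fields are still present
```

Losing the delimiter matters for CSV data because the masked value merges into the cc_type column and the row ends up one field short.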
