Okay you regexperts, I need some help. I have a .csv file for which I need to mask the credit card numbers. Here is what it looks like (with all fake data and cc number)
user,first_name,last_name,email,cc_type,cc_no
bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423
I've been trying to build my own regex expression, but with no luck. I would just like to replace the credit card number with xxxx. Any help would be greatly appreciated!
Like this:
[csv]
SEDCMD-YourSourcetypeHere_obscure_CCs = s/\d+$/xxxx/g
Neither of these mask the data, though. I must be doing something wrong. This is my props.conf
[csv]
SEDCMD-mask = s/\d+$/x{4}/g
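One thing worth checking in a stanza like this: in a sed-style substitution the replacement side is normally literal text, so `x{4}` would be written out as-is rather than expanded to `xxxx`. Below is a minimal sketch using Python's `re` module as a stand-in. This is an assumption for illustration only: Splunk's SEDCMD uses its own regex engine, so the sketch shows the general behaviour on the sample row from the question, not Splunk itself.

```python
import re

# Sample row from the question (all fake data).
row = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"

# A quantifier such as {4} only has meaning in the pattern, not in the replacement,
# so "x{4}" is inserted literally.
literal = re.sub(r"\d+$", "x{4}", row)

# Spelling the mask out gives the intended result.
masked = re.sub(r"\d+$", "xxxx", row)

print(literal)  # ends with ",jcb,x{4}"
print(masked)   # ends with ",jcb,xxxx"
```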
If you are sure that your settings are correct, it must be something else. If you are doing a sourcetype override/overwrite, you must use the ORIGINAL value, NOT the new value. You must deploy your settings to the first full instance(s) of Splunk that handle the events (usually the HF tier if you use one, otherwise your Indexer tier) UNLESS you are using HEC's JSON endpoint (it gets pre-cooked) or INDEXED_EXTRACTIONS (configs go on the UF in that case), and then restart all Splunk instances there. When (re)evaluating, you must send in new events (old events will stay broken), then test using _index_earliest=-5m to be absolutely certain that you are only examining the newly indexed events.
It seems to be masking it when I look at the raw data, but I can still, for example, do | table cc_no and display all the CC numbers.
Hi @woodcock,
I have verified that the data coming in hits a HF first, then forwards to a search head. When the data gets to the search head, I can see that it's replacing the cc number in the raw event (when I "show source" it does not show the cc number). However, cc_no still shows up as a field with populated values. In the images below, I've replaced the cc number with the string "secret" using your recommended sed. The first image is the raw data.
Okay, I didn't have the inputs.conf stanza configured correctly. Thanks for your help.
If your credit card numbers are not always 16 digits, you can try replacing:
SEDCMD-cc_replacement = s/\,(\d{16})/xxxx/g
to
SEDCMD-cc_replacement = s/\,(\d+)/\,xxxx/g
following what oscar84x said.
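To see how the two variants differ on the sample row from the question, here is a rough sketch in Python. The helper name `apply_sed` is hypothetical, and Python's `re` is only a stand-in for Splunk's sed-style substitution (an assumption; the engines are not identical, though these simple patterns should behave the same way).

```python
import re

# Sample row from the question (all fake data).
row = "bfiltness0,Bria,Filtness,bfiltness0@sayntec.com,jcb,3543149367325423"


def apply_sed(pattern: str, replacement: str, line: str) -> str:
    """Hypothetical helper: mimic s/pattern/replacement/g on one line."""
    return re.sub(pattern, replacement, line)


# The pattern consumes the leading comma, and the replacement does not put it back.
without_comma = apply_sed(r",(\d{16})", "xxxx", row)

# Same idea with \d+ for any length, and the comma restored in the replacement.
with_comma = apply_sed(r",(\d+)", ",xxxx", row)

print(without_comma)  # ...,jcbxxxx
print(with_comma)     # ...,jcb,xxxx
```

Because the comma is part of the match, it has to be written back in the replacement, otherwise the last two CSV columns run together.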
Try this SEDCMD in your props.conf under your sourcetype, or you could also specify it by host or source. This will take the 16-digit number and replace it with xxx.
SEDCMD-cc_replacement = s/\,(\d{16})/xxx/g
https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata
Thank you! That seemed to partially work. It's masking it in some places.
Yours drops the last comma.