Getting Data In

How to mask emails and credit card numbers in logs?

ryangpeng
Explorer

According to the link below, it looks possible to mask data in splunk.
https://docs.splunk.com/Documentation/Splunk/6.4.3/Data/Anonymizedata

I want to mask the email and credit card number for the logs.
Here are the example for each:

test@test.com
3234-1234-5678-5678

As I need to configure props.conf and transforms.conf under $SPLUNK_HOME/etc/system/local/
Specifically, in props.conf, it will be something like this:

[<sourcetype>]
TRANSFORMS-anonymize = emailaddr-anonymizer, creditcard-anonymizer

in transforms.conf, it will be:

[emailaddr-anonymizer]
REGEX = <regex>
FORMAT = ********@*********
DEST_KEY = _raw

[creditcard-anonymizer]
REGEX = <regex>
FORMAT = ****-****-****-****
DEST_KEY = _raw

As I am not good at REGEX in Splunk, can any body tell me what exact regular expression I have to write in the REGEX field for email and credit card?
(Only need to match '@'and '.' in email field)

0 Karma
1 Solution

somesoni2
Revered Legend

Try this

in transforms.conf,

 [emailaddr-anonymizer]
 REGEX = ([A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,63})
 FORMAT = ********@*********
 DEST_KEY = _raw

 [creditcard-anonymizer]
 REGEX = ((\d{4}[-|\s]*){3}\d{4})
 FORMAT = ****-****-****-****
 DEST_KEY = _raw

View solution in original post

0 Karma

somesoni2
Revered Legend

Try this

in transforms.conf,

 [emailaddr-anonymizer]
 REGEX = ([A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,63})
 FORMAT = ********@*********
 DEST_KEY = _raw

 [creditcard-anonymizer]
 REGEX = ((\d{4}[-|\s]*){3}\d{4})
 FORMAT = ****-****-****-****
 DEST_KEY = _raw
0 Karma

ryangpeng
Explorer

Hi somesoni2, I have an additional question regarding this question.

I found that the whole line of the log are masked. Below is the example.

# (Input text)
Card Number #1 : 1234-5678-9012-3456

# (Actual result)
****-****-****-****

(Expected)
Card Number #1 : ****-****-****-****

I also tried to replace the format with below but get no luck.

FORMAT = $1****-****-****-****$2

What should I do if I want to get the expected result above?

0 Karma

ryangpeng
Explorer

Just to answer my question, it seems that the following can do the trick.

REGEX = (.*)([A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,63})(.*)
FORMAT = $1********@*********$3

REGEX = (.*)((\d{4}[-|\s]*){3}\d{4})(.*)
FORMAT = $1****-****-****-****$3
0 Karma

ryangpeng
Explorer

Hi somesoni2, the regex worked! Thanks a lot!

0 Karma

aadedhela
Engager

Somesoni2, I am trying to test this search inline. Can you help with the direct search not in .conf propl
I would want to generate alert based on appearance of the cc numbers in logs. was trying this:
index="myIndex"
| rex "(?((\d{4}[-|\s]){3}\d{4}))"
| search possible_cc_number=

| table _time possible_cc_number _raw
so the events are showing numbers, how to use regex and formating in the same inline searches?
Thanks.

0 Karma
Get Updates on the Splunk Community!

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud  In today’s fast-paced digital ...

Observability protocols to know about

Observability protocols define the specifications or formats for collecting, encoding, transporting, and ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...