Splunk IT Service Intelligence

Splunk ITSI: Why is transforms not masking the data correctly?

iamlearner123
Explorer

Hello,

I am new to Splunk and still learning it. I recently wrote a transform to mask the mail ID, but when I tested it, the mail ID was not masked.

Transform:

[mail_id_mask]
REGEX = ([A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,63})
FORMAT = ********@*********
DEST_KEY = _raw

Sample logs:

(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIFY12_GRP_INTERNAL: abcdef.sdfrwe56@xyz.com
(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIF123Y_GRP_EXTERNAL: corP-apachesci.com

Any help would be appreciated.

0 Karma
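
One detail worth checking in the transform above: the character class `[A-z]` is not the same as `[A-Za-z]`. In ASCII, the range from `A` to `z` also covers the six punctuation characters that sit between `Z` and `a`. A quick check in Python (used here only as a stand-in for Splunk's PCRE engine; ASCII ranges behave the same way in both):

```python
import re

# The six ASCII characters between 'Z' and 'a': [ \ ] ^ _ `
between = "[\\]^_`"

wide = re.compile(r"[A-z]")       # range used in the posted transform
strict = re.compile(r"[A-Za-z]")  # letters only

wide_hits = [c for c in between if wide.fullmatch(c)]
strict_hits = [c for c in between if strict.fullmatch(c)]

print("matched by [A-z]:   ", wide_hits)
print("matched by [A-Za-z]:", strict_hits)
```

This does not explain the masking problem by itself, but `[A-Za-z]` is the safer choice when only letters are intended.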

harsmarvania57
SplunkTrust

Hi,

You can use SEDCMD in props.conf to achieve this easily.

Please try the config below in props.conf on the Heavy Forwarder or Indexer, whichever receives the data first from the Universal Forwarder.

[yoursourcetype]
SEDCMD-mailidmask = s/^(\N+[\:]\s)[^\@]+\@[^\n]+/\1XXXX/
0 Karma
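
To sanity-check the SEDCMD expression outside Splunk, the sketch below applies the same substitution to each sample event separately, on the assumption that Splunk runs SEDCMD against one event's `_raw` at a time. It is only an approximation: Python's `re` has no `\N`, so `[^\n]` is used in its place, and `mask_event` is a made-up helper name.

```python
import re

# Python translation of: s/^(\N+[\:]\s)[^\@]+\@[^\n]+/\1XXXX/
# \N (any non-newline character in PCRE) is written as [^\n] here.
SED_PATTERN = re.compile(r"^([^\n]+:\s)[^@]+@[^\n]+")

def mask_event(raw: str) -> str:
    """Apply the substitution once to a single event, like sed without the g flag."""
    return SED_PATTERN.sub(r"\1XXXX", raw, count=1)

events = [
    "(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIFY12_GRP_INTERNAL: abcdef.sdfrwe56@xyz.com",
    "(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIF123Y_GRP_EXTERNAL: corP-apachesci.com",
]

for event in events:
    print(mask_event(event))
```

The second sample event contains no `@`, so the expression leaves it unchanged.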

iamlearner123
Explorer

Thank you for the reply. However, it is not masking all the email IDs. It masked only the first email ID (abcdef.sdfrwe56@xyz.com) when I tested it in regex101.

0 Karma

harsmarvania57
SplunkTrust

I can't see any other email ID in the raw data you have provided.

0 Karma

iamlearner123
Explorer

I was testing the regex against a couple of other events; please find them below. When I put these events into regex101, the masking does not work. I am trying to come up with a general regex that will work for any email ID.

(14.2) 01-27-19 02:53:28 (8544:8500) PRINTFN: $G_NOTIFY_GRP_INTERNAL: harry.peter07@abc.com
(14.2) 01-27-18 02:53:27 (8544:8500) PRINTFN: $G_NOTIFY_GRP_EXTERNAL: SAAS-Learning_MDA@trsq.com

Thanks

0 Karma

harsmarvania57
SplunkTrust

Here you go: https://regex101.com/r/yHxyYg/1. It is working fine.

0 Karma
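
One plausible reason for the earlier mismatch, assuming all events were pasted into regex101 as a single block of text with no flags set: `^` then anchors only at the very start of the text, so only the first line can match. If Splunk applies SEDCMD to each event's `_raw` individually, that limitation would not show up there. A small Python illustration, with `[^\n]` standing in for `\N`:

```python
import re

pattern = r"^([^\n]+:\s)[^@]+@[^\n]+"

# Both events pasted as a single block of text.
block = (
    "(14.2) 01-27-19 02:53:28 (8544:8500) PRINTFN: $G_NOTIFY_GRP_INTERNAL: harry.peter07@abc.com\n"
    "(14.2) 01-27-18 02:53:27 (8544:8500) PRINTFN: $G_NOTIFY_GRP_EXTERNAL: SAAS-Learning_MDA@trsq.com"
)

# Without the multiline flag, ^ matches only at the start of the whole text.
no_flags = re.sub(pattern, r"\1XXXX", block)

# With re.MULTILINE, ^ matches at the start of every line.
multiline = re.sub(pattern, r"\1XXXX", block, flags=re.MULTILINE)

print(no_flags)
print("---")
print(multiline)
```

In the first result only the first line is masked; in the second, both lines are.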

ddrillic
Ultra Champion

If we look at Anonymizing Data in Splunk, we see the following:

In this approach, a TRANSFORMS statement is called in the props.conf file and is applied to the data in the queues before being indexed. In the example, the goal is to mask the "sensitive number" except for the last 4 digits.

---props.conf---
[hr_app]
TRANSFORMS-hr_app_logs_mask_data = mask_sn

---transforms.conf---
[mask_sn]
REGEX = (?m)^(.*)SN=\d{3}-\d{2}-(\d{4}.*)
DEST_KEY = _raw
FORMAT = $1SN=###-##-$2

This is the result of the sample event going through the transformation
“This is an event with a sensitive number in it. SN=###-##-1111. This should be masked”

The approach here is to match the first part of the event (.*), then the part to be masked (SN=…), then the last 4 digits and the rest of the event. These last two parts are to be retained when the event data is written back out to the "_raw" field specified by the "DEST_KEY." Note that the “FORMAT” setting specifies how the event will be re-written. The "$1" and "$2" refer to the two capturing groups in the "REGEX" field.

Meaning, the REGEX captures the entire event, breaks it up into multiple capturing groups, and then reconstructs the event.

0 Karma
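
The capture-and-rebuild idea described above can be modeled roughly in Python. This is a simplified stand-in for an index-time transform with DEST_KEY = _raw, not Splunk's actual implementation: `apply_raw_transform` is a hypothetical helper, it assumes that a matching REGEX causes `_raw` to be replaced by the expanded FORMAT string, the regex is written with the `(.*)` groups that the explanation refers to, and the unmasked input value `SN=111-11-1111` is made up for illustration.

```python
import re

def apply_raw_transform(raw: str, regex: str, fmt: str) -> str:
    """Rough model of REGEX + FORMAT with DEST_KEY = _raw (an assumption, not Splunk code)."""
    match = re.search(regex, raw)
    if match is None:
        return raw  # no match: the event is left untouched
    # Replace Splunk-style $1, $2, ... with the corresponding capture groups.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), fmt)

REGEX = r"(?m)^(.*)SN=\d{3}-\d{2}-(\d{4}.*)"
FORMAT = "$1SN=###-##-$2"

# Hypothetical unmasked event; only the masked result appears in the quoted text.
raw = "This is an event with a sensitive number in it. SN=111-11-1111. This should be masked"
print(apply_raw_transform(raw, REGEX, FORMAT))
```

Group 1 keeps everything before `SN=` and group 2 keeps the last four digits plus the rest of the event, so only the first five digits are lost in the rewrite.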

pruthvikrishnap
Contributor

Try rechecking the regex, and also try removing the FORMAT setting to see if that works.

0 Karma

iamlearner123
Explorer

The regex is working fine, but Splunk is replacing the entire event with **********************

Thanks

0 Karma
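
The symptom above is consistent with how FORMAT behaves when DEST_KEY = _raw: if the whole of `_raw` is replaced by the FORMAT string, a FORMAT with no `$1` or `$2` references leaves only the mask. A possible fix, following the capture-and-rebuild approach quoted earlier in the thread, is to capture the text before and after the address and put both back. The Python sketch below models this under that assumption; `rebuild_raw` is a hypothetical helper, and the REGEX and FORMAT strings are suggestions to test rather than a verified Splunk configuration. The email pattern uses `[A-Za-z]` instead of `[A-z]` so that only letters match.

```python
import re

EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}"

# Candidate values for transforms.conf (not verified in Splunk itself):
REGEX = r"(?m)^(.*?)" + EMAIL + r"(.*)$"
FORMAT = "$1********@*********$2"

def rebuild_raw(raw: str, regex: str, fmt: str) -> str:
    """Model of a _raw rewrite: on a match, the event becomes the expanded FORMAT."""
    match = re.search(regex, raw)
    if match is None:
        return raw
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), fmt)

event = "(14.2) 01-27-19 02:53:28 (8544:8500) PRINTFN: $G_NOTIFY_GRP_INTERNAL: harry.peter07@abc.com"

# FORMAT without group references: everything except the mask is lost.
lost = rebuild_raw(event, "(" + EMAIL + ")", "********@*********")
print(lost)

# FORMAT that restores the surrounding text.
kept = rebuild_raw(event, REGEX, FORMAT)
print(kept)
```

This pattern masks only the first address in an event. If an event can contain several addresses, a SEDCMD with the `g` flag, for example `s/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}/********@*********/g`, may be the simpler route, since it replaces each match in place without rebuilding the event.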