Splunk ITSI: Why is my transform not masking the data correctly?

iamlearner123
Explorer

Hello,

I am new to Splunk and still learning it. I recently wrote a transform to mask mail IDs, but when I tested it, the mail ID was not masked.

Transform:

[mail_id_mask]
REGEX = ([A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,63})
FORMAT = ********@*********
DEST_KEY = _raw

Sample logs:

(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIFY12_GRP_INTERNAL: abcdef.sdfrwe56@xyz.com
(29.2) 01-27-17 02:53:27 (9866:8500)  PRINTINGFN: $G_NOTIF123Y_GRP_EXTERNAL: corP-apachesci.com

Any help would be appreciated.

harsmarvania57
Ultra Champion

Hi,

You can use SEDCMD in props.conf to achieve this easily.

Please try the config below in props.conf on the Heavy Forwarder or the Indexer, whichever receives the data first from the Universal Forwarder.

[yoursourcetype]
SEDCMD-mailidmask = s/^(\N+[\:]\s)[^\@]+\@[^\n]+/\1XXXX/
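
The expression above is anchored to the start of the line and to the ": " separator, so it depends on this particular log layout. A less layout-dependent variant, offered only as an untested sketch, replaces just the address wherever it appears. The `g` flag applies the substitution to every match in the event. The class name `mask_email_any` and the replacement text are illustrative assumptions, and `yoursourcetype` is still a placeholder.

```
[yoursourcetype]
# Replace every email-like token in the event with a fixed mask (g = all matches)
SEDCMD-mask_email_any = s/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}/********@*********/g
```

The character classes use `A-Za-z` rather than `A-z`, because the `A-z` range also covers the punctuation characters that sit between `Z` and `a` in ASCII.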

iamlearner123
Explorer

Thank you for the reply. However, it is not masking all the email IDs. When I tested it in regex101, it masked only the first email ID (abcdef.sdfrwe56@xyz.com).

harsmarvania57
Ultra Champion

I can't see any other email ID in the raw data you have provided.

iamlearner123
Explorer

I was testing the regex against a couple of other events; please find them below. When I put these events into regex101, the masking does not work. I am trying to achieve a dynamic regex that will work for any email ID.

(14.2) 01-27-19 02:53:28 (8544:8500) PRINTFN: $G_NOTIFY_GRP_INTERNAL: harry.peter07@abc.com
(14.2) 01-27-18 02:53:27 (8544:8500) PRINTFN: $G_NOTIFY_GRP_EXTERNAL: SAAS-Learning_MDA@trsq.com

Thanks

harsmarvania57
Ultra Champion

Here you go: https://regex101.com/r/yHxyYg/1 . It is working fine.

ddrillic
Ultra Champion

If we look at Anonymizing Data in Splunk, we see the following:

In this approach, a TRANSFORMS statement is called in the props.conf file and is applied to the data in the queues before being indexed. In the example, the goal is to mask the “sensitive number” except for the last 4 digits.

---props.conf---
[hr_app]
TRANSFORMS-hr_app_logs_mask_data = mask_sn

---transforms.conf---
[mask_sn]
REGEX = (?m)^(.*)SN=\d{3}-\d{2}-(\d{4}.*)
DEST_KEY = _raw
FORMAT = $1SN=###-##-$2

This is the result of the sample event going through the transformation
“This is an event with a sensitive number in it. SN=###-##-1111. This should be masked”

The approach here is to match the first part of the event (.*), then the part to be masked (SN=…), then the last 4 digits and the rest of the event. These last two parts are to be retained when the event data is written back out to the "_raw" field specified by the "DEST_KEY." Note that the “FORMAT” setting specifies how the event will be re-written. The "$1" and "$2" refer to the two capturing groups in the "REGEX" field.

In other words, the REGEX captures the entire event, breaks it up into multiple capturing groups, and FORMAT then reconstructs the event from those groups.
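
Applied to the mail ID case in this thread, the same capture-and-reconstruct idea might look like the sketch below. It is untested and rests on a few assumptions: `yoursourcetype` is a placeholder for the real sourcetype, the `TRANSFORMS-mask_mail` class name is made up for illustration, and the character classes use `A-Za-z` instead of the original `A-z`, which also matches the punctuation between `Z` and `a` in ASCII. The first group is lazy (`.*?`) so that it stops where the address begins instead of swallowing most of the local part.

```
# props.conf (sourcetype name is a placeholder)
[yoursourcetype]
TRANSFORMS-mask_mail = mail_id_mask

# transforms.conf
[mail_id_mask]
# $1 = everything before the address, $2 = everything after it
REGEX = (?m)^(.*?)[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}(.*)$
FORMAT = $1********@*********$2
DEST_KEY = _raw
```

Because FORMAT becomes the whole of _raw, a FORMAT that contains only the mask would leave nothing but the mask in the event; writing `$1` and `$2` back is what preserves the rest of the line. As written, this sketch rewrites only the first address in an event, and events without an address are left unchanged because the REGEX does not match.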

pruthvikrishnap
Contributor

Try rechecking the regex, and also try removing the FORMAT setting to see if that works.

iamlearner123
Explorer

The regex is working fine, but Splunk is replacing the entire event with **********************

Thanks
