Getting Data In

Anonymization not working for extracted field with space in name

payl_chdhry
Explorer

We have a requirement to mask data at index time. The SEDCMD below masks the raw data, but it does not work for the extracted field "User name". The SEDCMD is on a universal forwarder (Windows) and works fine for raw data:

s/(GBW\d{8}\t)(\d{8}\s){0,1}(\w.*?)(\t)/\1\2(masked)\4/g

My props.conf:
[sourcetype]
SEDCMD-username=s/(GBW\d{8}\t)(\d{8}\s){0,1}(\w.*?)(\t)/\1\2(masked)\4/1
FIELD_DELIMITER=tab
HEADER_FIELD_DELIMITER=tab
HEADER_FIELD_LINE_NUMBER=1
MAX_TIMESTAMP_LOOKAHEAD=300
TIMESTAMP_FIELDS=Timestamp
TIME_FORMAT=%Y%m%dT%H%M%S.%3N+%z
TRANSFORMS-anonymize = username-anonymizer

However, the transform does not work. I have tried placing it on the universal forwarder as well as on an intermediate heavy forwarder. I created it based on the response in Solved: How can I anonymize fields of data that has underg... - Splunk Community

transforms.conf:

[username-anonymizer]
REGEX = (?m)^(.*User name\:\:)(\d{8}\s){0,1}(\w.*?)$
FORMAT = $1(masked)
WRITE_META = false
SOURCE_KEY = _meta
DEST_KEY = _meta
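
As a quick check of the REGEX and FORMAT pair above, here is a minimal Python sketch. It only models the regex capture groups; it does not model how Splunk reads or writes _meta. The input string "User name::12345678 John Smith" is an assumed, simplified stand-in for the real key contents, and the helper name apply_format is hypothetical.

```python
import re

# Same pattern as the REGEX setting in the transforms.conf stanza.
REGEX = re.compile(r"(?m)^(.*User name\:\:)(\d{8}\s){0,1}(\w.*?)$")


def apply_format(value: str, template: str) -> str:
    # Hypothetical helper: expand a backreference template against the
    # first match, or return the value unchanged when nothing matches.
    match = REGEX.search(value)
    if match is None:
        return value
    # An optional group that did not take part in the match expands to "".
    return match.expand(template)


sample = "User name::12345678 John Smith"  # assumed input, not real _meta
print(apply_format(sample, r"\1(masked)"))    # counterpart of $1(masked)
print(apply_format(sample, r"\1\2(masked)"))  # also keeps the optional ID
```

In this sketch the first template, the counterpart of FORMAT = $1(masked), drops the ID captured by the second group. Keeping the ID, as in the expected values, would need the second group in the output as well, for example something like $1$2(masked), which is untested here.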

 

Related info: We are expecting tab-delimited data. The field "User name" is in the middle of the event and follows the hostname, hence the GBW prefix in this example.

"User name" can be a combination of ID and name, and we only want to mask the name:

Value:
12345678 firstname lastname
12345678 firstname
firstname lastname
firstname

Expected masked value:

12345678 (masked)
12345678 (masked)
(masked)
(masked)


It could be blank as well.
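
For reference, the intended behaviour of the sed expression can be checked outside Splunk. The following is a minimal Python sketch, assuming Python's re module handles this pattern the same way as the regex engine behind SEDCMD. The function name mask_username and the shortened sample lines are illustrative only.

```python
import re

# Same pattern as the SEDCMD expression, copied as-is.
PATTERN = re.compile(r"(GBW\d{8}\t)(\d{8}\s){0,1}(\w.*?)(\t)")


def mask_username(line: str) -> str:
    # Hypothetical helper emulating the global sed replacement.
    # An unmatched optional group 2 expands to an empty string.
    return PATTERN.sub(r"\1\2(masked)\4", line)


# Shortened events: hostname, user name, then two trailing fields.
samples = [
    "GBW22223451\t12345678 John Smith\t0\t0",
    "GBW22223451\tJohn Smith\t0\t0",
    "GBW22223451\t\t0\t0",          # blank user name
    "GBW22223451\t12345678\t0\t0",  # ID only
]
for sample in samples:
    print(repr(mask_username(sample)))
```

In this sketch the blank value is left untouched. The ID-only value behaves differently from the others: \s also matches a tab, so the optional group consumes the ID together with the delimiter and the following field is masked instead. Using a literal space in place of \s may be worth testing if ID-only values occur.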

 


PaulPanther
Builder

Have you already tried the setup below and checked whether your regex is correct?

[username-anonymizer]
REGEX = (?m)^(.*User name\:\:)(\d{8}\s){0,1}(\w.*?)$
FORMAT = $1(masked)
WRITE_META = false
SOURCE_KEY = "field:User name"
DEST_KEY = "field:User name"

 

Feel free to provide some sample data. Thank you!


payl_chdhry
Explorer

Thank you for your response @PaulPanther 

I tried what you suggested, but it did not work. The extracted field still has unmasked data.

Below is the type of event we expect:

Product Assembly Name	Product Version	Class Name	Timestamp	Severity	Hostname	User name	User ID	WebEngine Request ID	Connection ID	Task ID	Execution ID	Report ID	Request ID	Transformation ID	Message	Exception	Stacktrace

Qlik.NPrinting.Repo	00.00.0.0	Qlik.NPrinting.Repo.Service.AuthenticationService	20221118T163152.532+00:00	INFO	GBW22223451	John Smith	0	0	0	0	0	0	0	0	Windows login successful. The user with id n45675643h456556l5c7bu5jw5esd4 has been correctly identified as a Windows domain user with sid S-1-4-10-123457890-1234543-13243554-344545

Qlik.NPrinting.Repo	21.14.5.0	Qlik.NPrinting.Repo.Service.AuthenticationService	20221118T163152.532+00:00	INFO	GBW22223451	12345678	0	0	0	0	0	0	0	0	Windows login successful. The user with id n45675643h456556l5c7bu5jw5esd4 has been correctly identified as a Windows domain user with sid S-1-4-10-123457890-1234543-13243554-344545

Qlik.NPrinting.Repo	21.14.5.0	Qlik.NPrinting.Repo.Service.ImportExport.DataConnectionsMatchingService	20221118T163152.532+00:00	WARN	GBW22223451	12345678 John Smith	0	0	0	0	0	0	0	0	Trying to import connection Horizon Scanning Rapid 2 MI Connection. Data connection NPrinting  Rapid2 does not match↓Missing objects from template: O\fghjy, O\123457890-1234543-13243554-344545, O\123457890-1234543-13243554-344545, O\ttyte, O\fgfggf, O\erewff, O\sdfdf, O\dfdgfg, O\zfgfg, O\DfAd, O\dsfdfh, O\dfdfD, O\dfdZ↓

Qlik.NPrinting.WebEngine	21.14.5.0	Qlik.NPrinting.WebEngine.WebEngineWindowsService	20221118T163152.532+00:00	INFO	GBW22223451		0	0	0	0	0	0	0	0	Windows authentication server listening on http://localhost:port/	

 


PaulPanther
Builder

Have you already tried the other suggested solution from @jeffland ?

[username-anonymizer]
REGEX = .+
FORMAT = "User name::masked"
WRITE_META = true
SOURCE_KEY = "field:User name"
DEST_KEY = "field:User name"
[accepted_keys]
is_valid="field:User name"

 

If not, please try to get rid of the spaces first with:

props.conf
[sourcetype]
SEDCMD-replacespace = s/ /_/g
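
Note that a global s/ /_/g rewrites every space in the raw text, not only the one in the header name. A minimal Python sketch of that effect, using shortened, illustrative header and event lines (the helper name replace_spaces is hypothetical):

```python
import re


def replace_spaces(raw: str) -> str:
    # Emulates s/ /_/g: every literal space becomes an underscore.
    return re.sub(" ", "_", raw)


# Shortened, illustrative lines; tabs separate the fields.
header = "Hostname\tUser name\tUser ID"
event = "GBW22223451\t12345678 John Smith\t0"

print(repr(replace_spaces(header)))
print(repr(replace_spaces(event)))
```

In this sketch the field values change as well as the header, so any pattern that expects a space after the ID would need adjusting.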