Hi Splunkers,
I am looking for some help in creation of regular expression to Anonymize data with a regular expression in a transforms.
Link: https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Data/Anonymizedata
Current Log format: Timestamp | Category | Machine | ApplicationDomain | ProcessId | ProcessName | ThreadId | LogID | UserName | ActionName | Module | AuthorizationStatus | RequestedBy | RequestingURL | QueryString | HTTPVerb | ClientIP| LogEvent="Response",MethodName="get.complete",ActionResult="Success",ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="Shrelock",LastName="Holmes",Gender="M",DateOfBirth="7/19/1976",SocialSecurityNumber="123456789",MaritalStatus="0",RaceInformation="Item8",CitizenshipCode="1",County="20",AddressLine1="221 Baker Street",City="Marylebone",State="London"
I want to write a regular expression to mask all key value pairs which start after ‘,MethodName="get.complete",’ (i.e ApplicationNumber, FirstName, DateOfBirth, SocialSecurityNumber, MaritalStatus ,etc)
Order of the field till Method name is constant and is never changing. Every event would have exact order till “MethodName” and additional PII elements added after the “MethodName”.
Unchanged order of the field Example:
Timestamp | Category | Machine | ApplicationDomain | ProcessId | ProcessName | ThreadId | LogID | UserName | ActionName | Module | AuthorizationStatus | RequestedBy | RequestingURL | QueryString | HTTPVerb | ClientIP| LogEvent="Response",MethodName="get.complete",
Note: The location of the fields to masked may change at time but it will always be in a key value pair format. (i.e ,ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="Sherlock",LastName="Holmes",Gender="M",DateOfBirth="7/19/1976")
Following are the solution I was planning to use to mask data at index time.
PROPS Example Using SEDCMD Regex:
[sourcetype]
**SEDCMD-mask = regex to skip first three key-value pair and mask rest
OR**
Transforms Example Using regex:
[ssn-anonymizer]
REGEX = regex to capture ssn
FORMAT = format to mask entire data
DEST_KEY = _raw
Thank you for all of your help and advice.
... View more