How to mask a field value in raw events when it appears in multiple patterns

johnward4
Communicator

I'm trying to mask a policy number that appears in my raw logs in several different patterns. To explain, I'm using this field extraction:

EXTRACT-policyNumber = policy.*(-|=)\s(?P<policyNumber>\w+)

This extracts policyNumber as the word that follows any string in my logs containing the word policy, then any characters, then either an = sign or a - sign, then a single whitespace character.
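
To see what a pattern like this captures, it can be replayed outside Splunk. The following is a minimal Python sketch, assuming the named group is called policyNumber and using made-up sample strings. Python's re module is close to, but not identical with, the PCRE engine Splunk uses, so treat the result as a sanity check only.

```python
import re

# Assumed form of the extraction; the group name policyNumber is an assumption.
EXTRACT = re.compile(r'policy.*(-|=)\s(?P<policyNumber>\w+)')

def policy_number(event):
    """Return the captured policyNumber, or None when the pattern does not match."""
    match = EXTRACT.search(event)
    return match.group('policyNumber') if match else None

# Hypothetical sample events, not taken from real logs.
print(policy_number('policy number = ABC123'))
print(policy_number('policyNumbers for the user are ------ 3T00005555'))
# The greedy .* runs to the last "- " or "= " in the event, so a later
# separator wins over an earlier one.
print(policy_number('policy = A1 other - B2'))
```

Because `.*` is greedy, an event with more than one separator yields the word after the last one, which may or may not be the policy number.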

I'm trying to add a line to my props.conf that masks any of these values with X. Help appreciated. Here's what I've tried so far:

SEDCMD-policyNumber_mask = s/policy.*(-|=)\s(\w+)/policy.*(-|=)\s\"XXXXXXX/g

woodcock
Esteemed Legend

Keep in mind that this overwrites the raw data, so you permanently lose the policy number, and your EXTRACT will return XXXXXXX:

SEDCMD-policyNumber_mask = s/((?:policy|bankAccountNumber)"?[^-=:]+[-=:]\s*"?)\w+/\1XXXXXXX/g
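
One detail worth checking in an expression that covers several field names is how far the `|` reaches. In a regex, alternation splits everything inside the enclosing group, so wrapping the names in a non-capturing group `(?:...)` keeps the shared separator part attached to both. A minimal Python sketch of the difference, with `re.sub` standing in for SEDCMD and a shortened sample event (the pattern and variable names are illustrative):

```python
import re

# Shortened sample event; re.sub stands in for Splunk's SEDCMD here.
event = '{ "policyNumber":"2L77755540","bankAccountNumber":"22222444888"}'

# Without an inner group, "|" splits the whole capture: either the bare word
# "policy", or "bankAccountNumber" plus the separator part.
ungrouped = r'(policy|bankAccountNumber"?[^-=:]+[-=:]\s*"?)\w+'
# With (?:...), both names share the separator part.
grouped = r'((?:policy|bankAccountNumber)"?[^-=:]+[-=:]\s*"?)\w+'

ungrouped_result = re.sub(ungrouped, r'\1XXXXXXX', event)
grouped_result = re.sub(grouped, r'\1XXXXXXX', event)
print(ungrouped_result)
print(grouped_result)
```

In the ungrouped form, the rest of the field name is replaced instead of the value, so the policy number stays readable. The grouped form masks both values.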

johnward4
Communicator

@woodcock

I have another format in my log that I'm trying to mask. I've tried several combinations without any luck:

"bankAcctType":"Saving","bankRoutingNumber":"55522244","bankAccountNumber":"11133344444","accountHolderName":"John","AccountLastName"Doe"","signature":null,"additionalComments":""}}

I've tried applying this props.conf stanza to mask the data. Help appreciated:

[test:log]
CHARSET = UTF-8
LINE_BREAKER = ([\r\n]+)\d{2,4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{1,3}\s\w+\s+\[
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX = ^
category = Splunk App Add-on Builder
pulldown_type = 1
KV_MODE = none
NO_BINARY_CHECK = true
disabled = false
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
SHOULD_LINEMERGE = false
SEDCMD-anon = s/(bankAcctType\":\")(\w+/)/XXXXXXXXX\2/g s/(bankRoutingNumber\":\")(\d+)/XXXXXXXXX\2/g s/(bankAccountNumber\":\")(\d+)/XXXXXXXXXXX\2/g
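
The effect of the backreference on the replacement side can be checked outside Splunk. In the minimal Python sketch below, `re.sub` stands in for a single SEDCMD expression and the event is a shortened copy of the sample above. Group 1 is the field-name prefix and group 2 is the value, so a replacement that ends in `\2` writes the value back, while `\1` followed by the mask keeps the prefix and drops the value. The variable names are illustrative.

```python
import re

event = '"bankAcctType":"Saving","bankRoutingNumber":"55522244","bankAccountNumber":"11133344444"'

# Group 1 = field name and opening quote, group 2 = the digits to hide.
pattern = r'(bankRoutingNumber":")(\d+)'

# Mask followed by \2: the prefix is lost and the digits are written back.
value_kept = re.sub(pattern, r'XXXXXXXXX\2', event)
# \1 followed by the mask: the prefix survives and the digits are gone.
value_masked = re.sub(pattern, r'\1XXXXXXXXX', event)

print(value_kept)
print(value_masked)
```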

woodcock
Esteemed Legend

Of course it doesn't work; there is no policy string! Try my updated answer.

johnward4
Communicator

The policy number appears in several different formats throughout this log. I have a transforms.conf stanza that matches this particular format and routes it to a different sourcetype:

transforms.conf

[test_sourcetype_CSA]
REGEX = \sservice\.CSAServiceImpl\s\(CSAServiceImpl\.
FORMAT = sourcetype::test:CSA
DEST_KEY = MetaData:Sourcetype

Sample log:
2019-12-03 15:17:32,57 DEBUG [ajp-/0.0.0.0:8209-16] service.CSAServiceImpl (CSAServiceImpl.java:89) []
- CSA Request object in debug: { "policyNumber":"2L77755540","bankAcctType":"Checking","bankRoutingNumber":"222111333","bankAccountNumber":"22222444888"}}
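
Whether the routing REGEX matches a given event can be checked before the transform is deployed. A minimal Python sketch, assuming Python's re module behaves like Splunk's PCRE for this simple pattern; the helper name routes_to_csa is hypothetical:

```python
import re

# Same pattern as the REGEX line in the transforms.conf stanza.
ROUTE = re.compile(r'\sservice\.CSAServiceImpl\s\(CSAServiceImpl\.')

def routes_to_csa(event):
    """True when the event matches the transform's REGEX."""
    return ROUTE.search(event) is not None

csa_event = ('2019-12-03 15:17:32,57 DEBUG [ajp-/0.0.0.0:8209-16] '
             'service.CSAServiceImpl (CSAServiceImpl.java:89) []')
saml_event = ('2019-11-25 07:51:39,659 INFO  [ajp-/0.0.0.0:8209-17] '
              'security.SAMLAuthSuccessHandler (SAMLAuthSuccessHandler.java:104)  []')

print(routes_to_csa(csa_event))
print(routes_to_csa(saml_event))
```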

Then I tried applying this under the new sourcetype:

[test:CSA]
SEDCMD-anon = s/\"policyNumber\":\"(\w+)/policyNumber"XXXXXXXXXXXXXXX/g s/bankRoutingNumber\":\"(\d+)/bankRoutingNumber\":\"XXXXXXXXX/g s/bankAccountNumber\":\"(\d+)/bankAccountNumber\":\"XXXXXXXXXXX/g

woodcock
Esteemed Legend

Again, try my updated answer; it should accommodate all variations.

johnward4
Communicator

Thank you. That example didn't seem to work for me, but this did:

SEDCMD-anon = s/(bankRoutingNumber|bankAccountNumber)\"(:)+\"(\w+)/\1":"XXXXXXXXXXX/g
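
For anyone who wants to replay this expression against the JSON-style sample from earlier in the thread, here is a minimal Python sketch. `re.sub` stands in for SEDCMD, and the event is a shortened copy of that sample.

```python
import re

event = ('"bankAcctType":"Saving","bankRoutingNumber":"55522244",'
         '"bankAccountNumber":"11133344444","accountHolderName":"John"')

# Group 1 keeps the field name; the quotes and colon are written back literally.
masked = re.sub(r'(bankRoutingNumber|bankAccountNumber)"(:)+"(\w+)',
                r'\1":"XXXXXXXXXXX', event)
print(masked)
```

Both values receive the same 11-character mask, so the original length of each number is not preserved.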

johnward4
Communicator

Thanks @woodcock! It's masking every policyNumber except in logs with this format:

2019-11-25 07:51:39,659 INFO  [ajp-/0.0.0.0:8209-17] security.SAMLAuthSuccessHandler (SAMLAuthSuccessHandler.java:104)  []
                                        - policyNumbers (filtered) for the user are ----------------------- 3T00005555

woodcock
Esteemed Legend

So it works, right?

johnward4
Communicator

It does for most events, except the log from my last update. Any thoughts on why it didn't apply to that one?

johnward4
Communicator

@woodcock I added a 2nd capture group and now it's masking all my policyNumbers. Thank you!

Here's what I used, in case this helps someone else:

SEDCMD-policyNumber_mask = s/(policy[^-=]+[-=]\s+)\w+/\1XXXXXXXXXXXXXXX/g s/(policyNumbers[^-]+)(-+\s+)\w+/\1\2XXXXXXXXXXXXXXX/g
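
To replay the idea on the dashed log line from earlier in the thread, here is a minimal Python sketch. It is one possible variant, not necessarily the exact expressions that were deployed: the second pattern uses two capture groups (the text before the dashes, then the dashes and trailing whitespace) and puts both back ahead of the mask. The first sample event is hypothetical, and the names RULES and mask are illustrative.

```python
import re

# (pattern, replacement) pairs applied in order, like chained s/// expressions.
RULES = [
    (r'(policy[^-=]+[-=]\s+)\w+', r'\1XXXXXXXXXXXXXXX'),
    (r'(policyNumbers[^-]+)(-+\s+)\w+', r'\1\2XXXXXXXXXXXXXXX'),
]

def mask(event):
    """Apply every rule in sequence and return the masked event."""
    for pattern, replacement in RULES:
        event = re.sub(pattern, replacement, event)
    return event

key_value_event = 'policyNumber = 2L77755540'  # hypothetical sample
dashed_event = ('- policyNumbers (filtered) for the user are '
                '----------------------- 3T00005555')

print(mask(key_value_event))
print(mask(dashed_event))
```

The first rule needs whitespace right after a single - or = sign, so it leaves the dashed line alone. The second rule consumes the whole run of dashes before masking the word that follows.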
