I'm trying to mask a field value for a policy number that is present in my raw logs under different patterns. To explain, I'm using a field extraction:
EXTRACT-policyNumber = policy.*(-|=)\s(?P<policyNumber>\w+)
This extracts the policyNumber value: any word that follows a string in my logs containing the word policy, then any characters, then either an = sign or a - sign followed by a space before the policyNumber value.
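For anyone who wants to check what that extraction captures outside Splunk, here is a minimal Python sketch. It assumes the named group is called policyNumber (to match the EXTRACT name) and that Python's re module behaves like Splunk's PCRE for a pattern this simple. The first sample line follows a log format quoted later in this thread; the second is hypothetical.

```python
import re

# Same pattern as the EXTRACT setting; the group name "policyNumber" is an assumption.
EXTRACT = re.compile(r"policy.*(-|=)\s(?P<policyNumber>\w+)")

samples = [
    # Format quoted later in this thread
    "- policyNumbers (filtered) for the user are ----------------------- 3T00005555",
    # Hypothetical "key = value" format
    "policy lookup = 2L77755540",
]

def extract_policy_number(line):
    """Return the captured policy number, or None if the line does not match."""
    match = EXTRACT.search(line)
    return match.group("policyNumber") if match else None

for line in samples:
    print(extract_policy_number(line))
```

Because `.*` is greedy, the capture is the word after the last "- " or "= " on the line, which is worth keeping in mind if a line can contain more than one separator.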
I'm trying to add a line in my props.conf to mask any of these values with X; help appreciated. Here's what I've tried so far:
SEDCMD-policyNumber_mask = s/policy.*(-|=)\s(\w+)/policy.*(-|=)\s\"XXXXXXX/g
Keep in mind that this overwrites the raw data, so you forever lose the policy number and your EXTRACT will return XXXXXXX:
SEDCMD-policyNumber_mask = s/(policy|bankAccountNumber"?[^-=:]+[-=:]\s*"?)\w+/\1XXXXXXX/g
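One thing to watch in an expression like this is how far the `|` reaches. The Python sketch below replays the posted pattern next to a hypothetical variant in which both key names share the separator part. It assumes Python's re matches Splunk's sed-style replacement for these patterns; the GROUPED pattern and the sample event are illustrations, not something taken from a working config.

```python
import re

# Pattern as posted: "|" splits the whole group, so the first branch is just "policy".
POSTED = re.compile(r'(policy|bankAccountNumber"?[^-=:]+[-=:]\s*"?)\w+')

# Hypothetical variant: the key names share the separator part, and a run of
# separators (for example "-----") is allowed.
GROUPED = re.compile(r'((?:policy|bankAccountNumber)"?[^-=:]*[-=:]+\s*"?)\w+')

def mask(pattern, raw):
    # Keep group 1 (the prefix) and overwrite the value, like s/.../\1XXXXXXX/g
    return pattern.sub(r"\1XXXXXXX", raw)

event = '"policyNumber":"2L77755540","bankAccountNumber":"22222444888"'
print(mask(POSTED, event))   # masks "Number" in the key, not the policy value
print(mask(GROUPED, event))  # masks both values
```

With the posted pattern the first branch matches the bare word policy and `\w+` then consumes the rest of the key name, so the policy value itself is left in the clear. Wrapping the alternatives in `(?:...)` is one way to avoid that.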
I have another format in my log that I'm trying to mask, but I've tried several combinations without any luck.
"bankAcctType":"Saving","bankRoutingNumber":"55522244","bankAccountNumber":"11133344444","accountHolderName":"John","AccountLastName"Doe"","signature":null,"additionalComments":""}}
I've tried applying this props.conf to the data to mask it; help appreciated:
[test:log]
CHARSET = UTF-8
LINE_BREAKER = ([\r\n]+)\d{2,4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{1,3}\s\w+\s+\[
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX = ^
category = Splunk App Add-on Builder
pulldown_type = 1
KV_MODE = none
NO_BINARY_CHECK = true
disabled = false
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
SHOULD_LINEMERGE = false
SEDCMD-anon = s/(bankAcctType\":\")(\w+/)/XXXXXXXXX\2/g s/(bankRoutingNumber\":\")(\d+)/XXXXXXXXX\2/g s/(bankAccountNumber\":\")(\d+)/XXXXXXXXXXX\2/g
Of course it doesn't work; there is no policy string! Try my updated answer.
So there are several different formats where the policy number shows throughout this log. I have a transforms.conf to filter for this particular format and send it to a different sourcetype :
transforms.conf
[test_sourcetype_CSA]
REGEX = \sservice\.CSAServiceImpl\s\(CSAServiceImpl\.
FORMAT = sourcetype::test:CSA
DEST_KEY = MetaData:Sourcetype
Sample log :
2019-12-03 15:17:32,57 DEBUG [ajp-/0.0.0.0:8209-16] service.CSAServiceImpl (CSAServiceImpl.java:89) []
- CSA Request object in debug: { "policyNumber":"2L77755540","bankAcctType":"Checking","bankRoutingNumber":"222111333","bankAccountNumber":"22222444888"}}
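As a quick sanity check of the routing REGEX, the Python sketch below tests it against the first line of the two event formats quoted in this thread. The function name and the default sourcetype (test:log, taken from the props.conf stanza name) are assumptions for illustration; this only mimics the match, not Splunk's actual index-time pipeline.

```python
import re

# REGEX from the [test_sourcetype_CSA] stanza
ROUTE = re.compile(r"\sservice\.CSAServiceImpl\s\(CSAServiceImpl\.")

def routed_sourcetype(raw, default="test:log"):
    # Mimics FORMAT = sourcetype::test:CSA being applied when REGEX matches;
    # the default sourcetype name is an assumption.
    return "test:CSA" if ROUTE.search(raw) else default

csa_event = "2019-12-03 15:17:32,57 DEBUG [ajp-/0.0.0.0:8209-16] service.CSAServiceImpl (CSAServiceImpl.java:89) []"
saml_event = "2019-11-25 07:51:39,659 INFO [ajp-/0.0.0.0:8209-17] security.SAMLAuthSuccessHandler (SAMLAuthSuccessHandler.java:104) []"

print(routed_sourcetype(csa_event))
print(routed_sourcetype(saml_event))
```

Only the CSAServiceImpl event matches, so the SAML events stay under the original sourcetype.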
Then I tried to apply this in props.conf under that sourcetype's stanza:
[test:CSA]
SEDCMD-anon = s/\"policyNumber\":\"(\w+)/policyNumber"XXXXXXXXXXXXXXX/g s/bankRoutingNumber\":\"(\d+)/bankRoutingNumber\":\"XXXXXXXXX/g s/bankAccountNumber\":\"(\d+)/bankAccountNumber\":\"XXXXXXXXXXX/g
Again, try my updated answer; it should accommodate all variations.
Thank you, that example didn't seem to work for me but this did :
SEDCMD-anon = s/(bankRoutingNumber|bankAccountNumber)\"(:)+\"(\w+)/\1":"XXXXXXXXXXX/g
Thanks @woodcock! It's masking every policyNumber except in logs with this format:
2019-11-25 07:51:39,659 INFO [ajp-/0.0.0.0:8209-17] security.SAMLAuthSuccessHandler (SAMLAuthSuccessHandler.java:104) []
- policyNumbers (filtered) for the user are ----------------------- 3T00005555
So it works, right?
It is for most events, except the log from my last update. Any thoughts on why it didn't apply to this one?
@woodcock I added a 2nd capture group and now it's masking all my policyNumbers. Thank you!
Here's what I used in case this helps someone else :
SEDCMD-policyNumber_mask = s/(policy[^-=]+[-=]\s+)\w+/\1XXXXXXXXXXXXXXX/g s/(policyNumbers[^-]+\s+)\w+/\1XXXXXXXXXXXXXXX/g
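If you want to test a multi-expression SEDCMD before it touches real data, one option is to replay the substitutions in Python against sample events. The sketch below is only an approximation: the two patterns are illustrative assumptions, not the exact expressions posted above, and Python's re is assumed to behave like Splunk's sed-style replacement for them. The first pattern allows a run of dashes before the value, as in the SAML log line. The second covers the JSON keys.

```python
import re

# Each pair is (pattern, replacement), applied in order like the
# space-separated s/// expressions of a single SEDCMD setting.
# These patterns are illustrative assumptions.
MASKS = [
    # "policy... - VALUE" or "policy... = VALUE", allowing a run of separators
    (re.compile(r"(policy[^-=\n]*[-=]+\s+)\w+"), r"\1XXXXXXXXXXXXXXX"),
    # JSON style: "policyNumber":"VALUE", "bankRoutingNumber":"VALUE", ...
    (re.compile(r'("(?:policyNumber|bankRoutingNumber|bankAccountNumber)":")\w+'),
     r"\1XXXXXXXXXXX"),
]

def apply_masks(raw):
    """Run every substitution over the raw event text, in order."""
    for pattern, replacement in MASKS:
        raw = pattern.sub(replacement, raw)
    return raw

saml = "- policyNumbers (filtered) for the user are ----------------------- 3T00005555"
csa = '{ "policyNumber":"2L77755540","bankAcctType":"Checking","bankRoutingNumber":"222111333","bankAccountNumber":"22222444888"}}'

print(apply_masks(saml))
print(apply_masks(csa))
```

Keeping the prefix in capture group 1 and writing it back with `\1` leaves the key names and separators intact, so only the sensitive value is overwritten.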