I'm using this regex in props.conf on a Heavy Forwarder to mask cc data and need help validating it.
Log format:
All numbers start with account=" and some of them end with " while others do not.
xxxxxxxxxxxxxxxxxxx/xxxxx
xxxxxxxxxxxxxxxx xx/xx xxx
xxxxxxxxxxxxxxxx: xx xxxxx
"xxxxxxxxxxxxxxxx "
xxxxxxxxxxxxxxxxx
"xxxx-xxxx-xxxx-xxxx"
"0-xxxxxxxxxxxxxxx"
It looks like the entire event has been dropped; I don't see any event with account=xxxx-xxxx-xxxx-xxxx.
[my_sourcetype]
SEDCMD-accmasking= s/account=\"?[\w\d\-\s\/\:\S]+\"?/xxxx-xxxx-xxxx-xxxx/g
I also tried using capturing groups (), but it still didn't work:
[my_sourcetype]
SEDCMD-accmasking= s/account=(\"?[\w\d\-\s\/\:\S]+\"?)/xxxx-xxxx-xxxx-xxxx/g
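A possible explanation for the missing text, offered as a sketch rather than a confirmed diagnosis: the character class contains both \s and \S, which together match any character, so the greedy + can run from the account value to the end of the event, and the literal account= is replaced along with it. The Python snippet below assumes Python's re module treats these constructs the same way as the regex engine behind SEDCMD; the sample event is made up for illustration.

```python
import re

# Pattern copied from the SEDCMD above. The backslashes before " / : are
# redundant in Python but harmless.
ORIGINAL = r'account=\"?[\w\d\-\s\/\:\S]+\"?'

# Hypothetical event; only the account="..." shape comes from the question.
event = 'ts=2014-01-01 account="1111-1111-1111-1111" user=bob action=login'

# \s and \S in one class match every character, so the match extends
# from account= to the end of the event.
masked = re.sub(ORIGINAL, 'xxxx-xxxx-xxxx-xxxx', event)
print(masked)
```

If the forwarder behaves the same way, the event is not dropped. Everything from account= onward is rewritten to the mask, so a search for account=xxxx-xxxx-xxxx-xxxx finds nothing.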
Can you try the following?
[my_sourcetype]
SEDCMD-accmasking = s/account=(\d{4}-){3}(\d{4})/cc=xxxx-xxxx-xxxx-\2/g
If that doesn't work are you able to submit a scrubbed sample event?
Also for future use: http://docs.splunk.com/Documentation/Splunk/6.0/Data/Anonymizedatausingconfigurationfiles
Give this a try
[my_sourcetype]
SEDCMD-accmasking= s/(account=\")([^\"]+)(\")/\1xxxx-xxxx-xxxx-xxxx\3/g
Update:
Try this (run-anywhere sample; everything except the last rex line only generates dummy data, so replace that part with your search)
| gentimes start=-1 | eval t="account=\"1111111111111111111/11111 some other text#account=\"1111111111111111 11/11 111 some other text#account=\"1111111111111111: 11 11111#account=\"1111111111111111 \" some other text#account=\"11111111111111111 some other text#account=\"1111-1111-1111-1111\" some other text#account=\"0-111111111111111\" some other text" | table t | makemv t delim="#" | mvexpand t | rename t as _raw | eval orig=_raw
| rex mode=sed "s/(account=\")([\d-:\s\/]+)(\"*)(\s*\w)/\1xxxx-xxxx-xxxx-xxxx\" \4/g"
That minus/hyphen character may need escaping in someoni2's rex:
| gentimes start=-1
| eval t="account=\"1111111111111111111/11111 some other text#account=\"1111111111111111 11/11 111 some other text#account=\"1111111111111111: 11 11111#account=\"1111111111111111 \" some other text#account=\"11111111111111111 some other text#account=\"1111-1111-1111-1111\" some other text#account=\"0-111111111111111\" some other text"
| table t
| makemv t delim="#"
| mvexpand t
| rename t as _raw
| eval orig=_raw
| rex mode=sed "s/(account=\")([\d\-:\s\/]+)(\"*)(\s*\w)/\1xxxx-xxxx-xxxx-xxxx\" \4/g"
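For checking a candidate expression outside Splunk, here is a minimal Python sketch that runs the seven dummy values through the escaped-hyphen pattern, with a replacement that puts the fourth group back. It assumes Python's re module behaves like Splunk's engine for these constructs. The helper names (mask, leaks_digits) are made up for this example.

```python
import re

# Character class with the hyphen escaped; the replacement keeps group 4
# so the first character after the account value is not lost.
PATTERN = re.compile(r'(account=")([\d\-:\s/]+)("*)(\s*\w)')
REPLACEMENT = r'\1xxxx-xxxx-xxxx-xxxx" \4'

# The seven dummy values from the run-anywhere search.
SAMPLES = [
    'account="1111111111111111111/11111 some other text',
    'account="1111111111111111 11/11 111 some other text',
    'account="1111111111111111: 11 11111',
    'account="1111111111111111 " some other text',
    'account="11111111111111111 some other text',
    'account="1111-1111-1111-1111" some other text',
    'account="0-111111111111111" some other text',
]


def mask(event: str) -> str:
    """Apply the sed-style substitution to one event."""
    return PATTERN.sub(REPLACEMENT, event)


def leaks_digits(event: str) -> bool:
    """True if a run of four or more digits survives masking."""
    return re.search(r'\d{4,}', event) is not None


results = [mask(s) for s in SAMPLES]
for before, after in zip(SAMPLES, results):
    print(f'{before!r} -> {after!r} leak={leaks_digits(after)}')
```

In this sketch the third value, which ends right after the number, keeps a single trailing digit. The final \w has to match something, so the match gives one digit back. If account values can sit at the very end of an event, that case is worth testing on real data.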
I have added the log format to my question. Could you give me one regex based on that? Thanks!
@mcnamara I just tested with all of your formats and I think the following should work:
[my_sourcetype]
SEDCMD-accmasking = s/account=(\d{4}-){3}(\d{4})/cc=xxxx-xxxx-xxxx-\2/g
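Before deploying, it may be worth checking this expression against the quoted samples, since it expects digits immediately after account= and exactly four dash-separated groups of four. Below is a small Python sketch of that check, assuming Python's re module matches these constructs the way the SEDCMD engine does. The two sample strings are illustrative.

```python
import re

# Same expression and replacement as the SEDCMD above.
PATTERN = re.compile(r'account=(\d{4}-){3}(\d{4})')
REPLACEMENT = r'cc=xxxx-xxxx-xxxx-\2'

# Illustrative values: one without and one with the opening quote.
unquoted = 'account=1111-2222-3333-4444 some other text'
quoted = 'account="1111-2222-3333-4444" some other text'

masked_unquoted = PATTERN.sub(REPLACEMENT, unquoted)
masked_quoted = PATTERN.sub(REPLACEMENT, quoted)
print(masked_unquoted)
print(masked_quoted)
```

In this sketch the unquoted value is masked with its last four digits kept, while the quoted value is left untouched. If the real events carry a quote after account=, the expression may need to allow for it, for example with an optional \"? after the equals sign.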