Splunk Enterprise

How to anonymize specific variable values in a JSON file

mah
Builder

Hi,

I want to mask only specific values.

This is an example of a JSON event as returned in Splunk:

{"MemorySize": 256, "region": "ca-central-1", "TracingConfig": \{"Mode": "PassThrough"\}, "RevisionId": "777", "Handler": "handleRequest", "Timeout": 600, "LastModified": "2020-05-27T14:05:43.839+0000", "Environment": \{"Variables": \{"ENVIRONMENT": "dev", "USER": "username",  "USERPASSD": "password", \}\}, "Role": "arn:aws:iam::666:role/X", "VpcConfig": \{"SubnetIds": ["subnet-000", "subnet-111"], "VpcId": "vpc-333", "SecurityGroupIds": ["sg-444"]\}, "CodeSize": 5555, "Description": "Lambda", "Runtime": "java11", "Version": "$LATEST"\}}

 

The problem is that sensitive data appears in clear text, specifically under Environment > Variables.

The variables in this section differ from event to event, so we cannot write a regex around specific key names because they change all the time.

How can I mask all values under Environment > Variables WITHOUT masking the keys?

Example of the result I want:

{"MemorySize": 256, "region": "ca-central-1", "TracingConfig": \{"Mode": "PassThrough"\}, "RevisionId": "777", "Handler": "handleRequest", "Timeout": 600, "LastModified": "2020-05-27T14:05:43.839+0000", "Environment": \{"Variables": \{"ENVIRONMENT": XXXXXX, "USER": XXXXXX,  "USERPASSD": XXXXXX, \}\}, "Role": "arn:aws:iam::666:role/X", "VpcConfig": \{"SubnetIds": ["subnet-000", "subnet-111"], "VpcId": "vpc-333", "SecurityGroupIds": ["sg-444"]\}, "CodeSize": 5555, "Description": "Lambda", "Runtime": "java11", "Version": "$LATEST"\}}

 

I tried a props.conf like this:

[sourcetype]
INDEXED_EXTRACTION = json
KV_MODE = none
EXTRACT-var = \{\"Variables\"\:\s*\\\{(?<Variables>[^\}]+)\\
TRANSFORMS-anony = anony_raw

and a transforms.conf:

[anony_raw]
REGEX = s/(\s*\"\s*[^\"]*\"[^\"]*\"([^\"]*)\s*\"\s*\,*)+
FORMAT = $1XXXXXX
DEST_KEY = _meta
SOURCE_KEY = _meta

But it doesn't work at all.

Can you help me?

to4kawa
Ultra Champion

mah
Builder

Yes, I've seen that answer and tried the solution, but it doesn't work at all.

props.conf:

[description]
INDEXED_EXTRACTION = json
KV_MODE = none
TRANSFORMS-anony = anony, anony_raw
TRUNCATE = 0
SHOULD_LINEMERGE = false

transforms.conf:

[anony]
INGEST_EVAL = Variables=md5(Variables)
WRITE_META = true

[anony_raw]
REGEX = (?m)(\s*\"\s*[^\"]*\"[^\"]*\"([^\"]*)\s*\"\s*\,*)+
FORMAT = $1XXXXXXX"
DEST_KEY = _raw

But that's still not working.

Did I make a mistake somewhere?


to4kawa
Ultra Champion

 

| makeresults
| eval _raw="{\"MemorySize\":256,\"region\":\"ca-central-1\",\"TracingConfig\":{\"Mode\":\"PassThrough\"},\"RevisionId\":\"777\",\"Handler\":\"handleRequest\",\"Timeout\":600,\"LastModified\":\"2020-05-27T14:05:43.839+0000\",\"Environment\":{\"Variables\":{\"ENVIRONMENT\":\"XXXXXX\",\"USER\":\"XXXXXX\",\"USERPASSD\":\"XXXXXX\"}},\"Role\":\"arn:aws:iam::666:role/X\",\"VpcConfig\":{\"SubnetIds\":[\"subnet-000\",\"subnet-111\"],\"VpcId\":\"vpc-333\",\"SecurityGroupIds\":[\"sg-444\"]},\"CodeSize\":5555,\"Description\":\"Lambda\",\"Runtime\":\"java11\",\"Version\":\"$LATEST\"}"
| spath

 

 

 

[anony]

INGEST_EVAL = Environment.Variables.ENVIRONMENT:=md5(Environment.Variables.ENVIRONMENT),Environment.Variables.USER:=md5(Environment.Variables.USER),Environment.Variables.USERPASSD:=md5(Environment.Variables.USERPASSD)
WRITE_META = true

[anony_raw]
REGEX = (?m)(.*Environment\":{)(.*?})(.*)
FORMAT = $1$3
DEST_KEY = _raw

 


mah
Builder

OK, thanks, but I still have the same problem I started with: the key/value pairs inside Variables {} are not the same in each event. My original question was: how can I apply your solution when the keys change all the time and I cannot know them in advance?

I cannot use your solution as written because it relies on specific key names.

Example of a new event:

{"Description":  None, "LastModified": "2019-12-05T10:58:05.308+0000", "TracingConfig": {"Mode": "PassThrough"}, "Version": "$LATEST", "CodeSize": 1909, "Handler": "handler", "RevisionId": "111", "MemorySize": 128, "Timeout": 180, "Environment": {"Variables": {"MAILING_LIST": "xxx@xxx.com", "PARAM_NAME": "toto", "NAME": "titi", "FLAG_NAME": "OK_flag", "ENVIRONMENT": "test", "SECRET_NAME": "123_cred", "REGION": "eu-east-1", "MAILING_LIST_PARAM_NAME": "/walnut/mailing_list"}}, "region": "eu-east-1", "Runtime": "python3.6"}


rnowitzki
Builder

Hi @mah,

I created a regex that might work. I could not find a way to make it fully dynamic, but if you know the maximum number of key/value pairs in the data, it should work:

 

(?<="Variables"\:\s\\\{)(?>\"\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+\"\:\s\"(\S+)\"\,\s)?

 

This example works if you have 4 or fewer key/value pairs. Each value is captured in its own group (group 1, group 2, and so on).

If you expect more than 4, you have to append more of these:

(?>\"\w+\"\:\s\"(\S+)\"\,\s)?


Note: This works with the format where the values are in quotes. In your initial post, the example of the desired result had values that were not in quotes.


Reg101 link
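
For anyone who wants to try the fixed-count idea outside Splunk, here is a minimal Python sketch of the same pattern. It is an illustration, not the configuration used in this thread: it assumes Python's `re` module instead of Splunk's regex engine, swaps the atomic groups `(?>...)` for plain non-capturing groups `(?:...)` so it also runs on Python versions before 3.11, and the names `PAIR`, `PATTERN`, and `captured_values` as well as the sample event are made up for the example.

```python
import re

# One optional '"KEY": "value", ' pair; the value lands in its own group.
# Assumption: (?:...) stands in for the atomic group (?>...) of the post.
PAIR = r'(?:"\w+":\s"(\S+)",\s)?'

# The lookbehind pins the match right after '"Variables": \{'
# (the sample event in the question has a backslash before each brace).
PATTERN = re.compile(r'(?<="Variables":\s\\\{)' + PAIR * 4)


def captured_values(event):
    """Return the values captured by the four optional pair groups."""
    match = PATTERN.search(event)
    if match is None:
        return []
    return [value for value in match.groups() if value is not None]


# Hypothetical event with five pairs, one more than the pattern covers.
event = (r'{"Timeout": 600, "Environment": \{"Variables": \{'
         r'"A1": "v1", "B2": "v2", "C3": "v3", "D4": "v4", "E5": "v5", \}\}}')

print(captured_values(event))
```

The fifth value is not captured here, which is why more copies of the pair pattern have to be appended when more pairs are expected.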

 

--
Karma and/or Solution tagging appreciated.

mah
Builder

Hi @rnowitzki , thank you for your reply. 

I tried your regex, but the problem is that some keys consist of several words, for example:

"USER PASSWORD": "12345abcd"

With keys like that, your regex no longer matches.


rnowitzki
Builder

Hi @mah,

OK, then we have to add an optional space and an optional second word to the regex.

(?<="Variables"\:\s\\\{)(?>\"\w+?\s?\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+?\s?\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+?\s?\w+\"\:\s\"(\S+)\"\,\s)?(?>\"\w+?\s?\w+\"\:\s\"(\S+)\"\,\s)?


In each key pattern, \s? means there may or may not be a space. The \w+? before it is a lazy match for the first word, and the \w+ after it matches the second word (or the rest of a one-word key).
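
A quick way to see what the key part of this pattern accepts is to test it on a few keys. The following Python sketch is only an illustration under assumptions: it uses Python's `re` module rather than Splunk's regex engine, and `KEY` and `key_matches` are names invented here.

```python
import re

# Key part of the pair pattern: lazy first word, optional single
# whitespace character, then at least one more word character.
KEY = re.compile(r'"\w+?\s?\w+"')


def key_matches(quoted_key):
    """True if the whole quoted key fits the key pattern."""
    return KEY.fullmatch(quoted_key) is not None


for key in ['"USER"', '"USER PASSWORD"', '"MAILING LIST PARAM"', '"DB-HOST"']:
    print(key, key_matches(key))
```

One-word and two-word keys pass, while a three-word key or a key containing a character outside `\w` (such as `-`) does not, so a broader class such as `\S` may be needed for data like that.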


The example above works with at most 4 key/value pairs. If you expect more, append more of these:

(?>\"\w+?\s?\w+\"\:\s\"(\S+)\"\,\s)?

 
Hope this works better.

--
Karma and/or Solution tagging appreciated.

mah
Builder

Hi, @rnowitzki 

It works, but only with this regex for each pair:

(?>\"\S+?\s?\S+\"\:\s\"(\S+)\"\,\s)?

My second point concerns the first part of the regex:

(?<="Variables"\:\s\\\{)

It does not match the closing brace, and I want the masking to apply ONLY to the key/value pairs inside the Variables section.

What I mean is: if I repeat the key/value pattern 10 times while the Variables section contains only 2 key/value pairs, it will mask data OUTSIDE the Variables section.

I found a regex that extracts this section, but I don't know how to combine it with your solution above:

\{\"Variables\"\:\s*\\\{(?<Variables>[^\}]+)\\

Do you have an idea?
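
One way to combine the two ideas, sketched here in Python rather than in Splunk configuration, is to do the substitution in two stages: first isolate the Variables block, then mask every quoted value inside it. This is a sketch under assumptions, not a tested Splunk setup: the names `VARIABLES_BLOCK`, `QUOTED_VALUE`, and `mask_variables` are made up for the example, values are assumed to be quoted strings without embedded `"` or `}` characters, and the mask is written in quotes to keep the JSON valid.

```python
import re

# Stage 1: the Variables object, from '"Variables": {' up to the first
# closing brace. The optional backslash covers events where the braces
# appear escaped as '\{'.
VARIABLES_BLOCK = re.compile(r'("Variables":\s*\\?\{)([^}]*)(\})')

# Stage 2: one '"key": "value"' pair; group 1 keeps the key and colon.
QUOTED_VALUE = re.compile(r'("[^"]+":\s*)"[^"]*"')


def mask_variables(event, mask="XXXXXX"):
    """Mask every value under Variables; keys and other fields stay."""
    def mask_block(match):
        head, body, tail = match.groups()
        masked = QUOTED_VALUE.sub(lambda m: m.group(1) + '"' + mask + '"', body)
        return head + masked + tail

    return VARIABLES_BLOCK.sub(mask_block, event)


# Hypothetical event with multi-word keys and an unknown number of pairs.
event = ('{"Timeout": 180, "Environment": {"Variables": '
         '{"USER PASSWORD": "12345abcd", "NAME": "titi"}}, '
         '"region": "eu-east-1"}')

print(mask_variables(event))
```

Because the second stage only ever sees the text captured between the braces, it does not depend on the number of pairs or on the key names, and fields outside Variables are never touched.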


rnowitzki
Builder

I can't reproduce the issue on regex101 (it does not select anything outside the "Variables" section).

But I updated it by appending the two closing braces at the end.

https://regex101.com/r/eXmyO3/6

 

(?<="Variables"\:\s\\\{)(?>\"\S+?\s?\S+\"\:\s\"(\S+)\"\,\s)?(?>\"\S+?\s?\S+\"\:\s\"(\S+)\"\,\s)?(?>\"\S+?\s?\S+\"\:\s\"(\S+)\"\,\s)?(?>\"\S+?\s?\S+\"\:\s\"(\S+)\"\,\s)\\\}\\\}

 

Getting closer, I hope 🙂

--
Karma and/or Solution tagging appreciated.

mah
Builder

That's great! My case is solved!

With the last regex, I tried adding more key/value patterns than the Variables section contains, and the match does not go outside the section.

https://regex101.com/r/eXmyO3/7

I think I can put this in a SEDCMD setting in props.conf and skip transforms.conf altogether.

Nice job, thanks a lot!


rnowitzki
Builder

@mah you're welcome. I learned something myself 🙂

You might want to set my last reply as the solution, for future Splunkers having similar issues.

Cheers
Ralph

--
Karma and/or Solution tagging appreciated.

to4kawa
Ultra Champion

 

[anony_raw1]
REGEX = (?m)(.*ENVIRONMENT\":\")([^\"]+)(\".*)
FORMAT = $1$3
DEST_KEY = _raw

[anony_raw2]
REGEX = (?m)(.*USER\":\")([^\"]+)(\".*)
FORMAT = $1$3
DEST_KEY = _raw

[anony_raw3]
REGEX = (?m)(.*USERPASSD\":\")([^\"]+)(\".*)
FORMAT = $1$3
DEST_KEY = _raw

 

You would need three anonymization stanzas, I guess.
