
After deploying an app with a sedcmd stanza in props.conf, why is my data not being anonymized?

Path Finder

Hi,

I want to anonymize session ID information in our web logs.

I use a deployment server to push out an app with the log files we are tailing.

In that app, I have a props.conf with the following stanza:

[web_access]
SEDCMD-access = s/(?:\s\d+\s)(\w{32})/ XXXXXXXXXX-sessionid-XXXXXXXXXXX /g

web_access is the sourcetype of the tailed log that contains the session ID.

The session ID (32 characters long) is always preceded by an integer surrounded by whitespace.

I arrived at the regex above by tweaking the results of a search with rex mode=sed "s/(?:\s\d+\s)(\w{32})/XXXXXX-sessionid-XXXXXX/g". This consistently masks the session ID in searches on historical data.
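The expression can also be checked offline before it goes into props.conf. The following is a minimal sketch in Python, assuming Python's `re` module treats this particular pattern the same way Splunk's regex engine does. The sample log line and the names `SAMPLE` and `mask` are hypothetical, made up for illustration.

```python
import re

# Hypothetical access-log line: status code, response size, then a
# 32-character session id. The real log format may differ.
SAMPLE = (
    '10.0.0.1 "GET /index.html HTTP/1.1" 200 5120 '
    '0123456789abcdef0123456789abcdef "Mozilla/5.0"'
)

# Pattern and replacement copied from the SEDCMD in props.conf above.
PATTERN = re.compile(r"(?:\s\d+\s)(\w{32})")
REPLACEMENT = " XXXXXXXXXX-sessionid-XXXXXXXXXXX "


def mask(line: str) -> str:
    """Apply the substitution globally, like the trailing /g in sed."""
    return PATTERN.sub(REPLACEMENT, line)


masked = mask(SAMPLE)
print(masked)

# Note: a non-capturing group is still part of the overall match, so the
# integer in front of the session id is replaced along with the id itself.
```

Running a few real log lines through a check like this shows exactly which part of each event the substitution rewrites.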

I have deployed the app with its new SEDCMD setting in props.conf, but new data doesn't seem to be getting anonymized, even though I have restarted the universal forwarder on the web server (Windows, but not IIS).

Any ideas?


Re: After deploying an app with a sedcmd stanza in props.conf, why is my data not being anonymized?

Path Finder

After some tweaking, it appears that the props.conf must exist on the indexer rather than on the universal forwarder.

I had a copy of props.conf on the indexer. When I removed it, the masking broke.

Edit:

After some monitoring and further tweaking, this is the final SEDCMD we are using:

[web_access]
SEDCMD-access = s/(\s\d+\s)(\w{16})(\w{16})/ \1 XXX-sessionid-XX\3 /g

The initial SEDCMD also removed the integer surrounded by whitespace (the response size), so this version puts it back. We also now mask only half of the session ID. After consulting our devs and infosec, we determined that keeping half of the session ID still lets them troubleshoot user flows, while masking the other half means the session cannot be stolen for a replay attack. Half of the session ID should still be unique over the short intervals in which we need to troubleshoot user flows.

The replacement anonymized session ID is also 32 characters long, so it maintains the formatting of the log (if that was ever a problem anyway).
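As a rough offline check of the final expression, here is a minimal Python sketch, assuming Python's `re` module handles this pattern and the `\1` / `\3` backreferences the same way Splunk's sed-style replacement does. The sample line and the names `SAMPLE_LINE` and `mask_half` are hypothetical.

```python
import re

# Hypothetical log fragment: status, response size, 32-character session id.
SAMPLE_LINE = 'GET /index.html 200 5120 0123456789abcdefFEDCBA9876543210 "Mozilla"'

# Pattern and replacement copied from the final SEDCMD above.
FINAL_PATTERN = re.compile(r"(\s\d+\s)(\w{16})(\w{16})")
FINAL_REPLACEMENT = r" \1 XXX-sessionid-XX\3 "


def mask_half(line: str) -> str:
    """Keep the response size (\\1) and the second half of the id (\\3)."""
    return FINAL_PATTERN.sub(FINAL_REPLACEMENT, line)


masked_line = mask_half(SAMPLE_LINE)
print(masked_line)

# The 16-character placeholder plus the retained 16 characters keeps the
# field at its original width of 32 characters.
placeholder_width = len("XXX-sessionid-XX")
```

One thing this makes visible: group 1 already contains the whitespace on both sides of the integer, so the literal spaces in the replacement add extra spaces around the response size. That is harmless for whitespace-delimited parsing but worth knowing if column positions matter.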


Re: After deploying an app with a sedcmd stanza in props.conf, why is my data not being anonymized?

Influencer

Glad you figured out the answer to your own question! If possible, could you accept your own answer to mark the question as completed?


Re: After deploying an app with a sedcmd stanza in props.conf, why is my data not being anonymized?

Splunk Employee

For posterity, I will add this link, which explains which settings take effect on which components of a deployment, in the hopes it is helpful.


Re: After deploying an app with a sedcmd stanza in props.conf, why is my data not being anonymized?

Path Finder

Yes, this is very useful, thank you.

I have decided to keep the props.conf in the app, with a comment noting that the stanza needs to exist on the indexer. If the props.conf on the indexers is ever lost or overwritten, the app config will tell whoever finds it how to get masking working again.