Configure different index time transformations for different outputs | Heavy Forwarder

GaetanVP
Contributor

Hello Splunker,

I'm currently working on a new use case and need some help.

I'm working on a HF receiving Microsoft Cloud logs (with https://docs.splunk.com/Documentation/AddOns/released/MSCloudServices) and I would like to forward those logs to two different TCP outputs (Splunk indexers): one with some fields anonymized, and the other without any index-time transformation.

Here is a diagram to help you understand my problem:

GaetanVP_0-1656334535271.png


My thoughts:
I currently have an inputs.conf configured on my HF to receive the logs from MS Cloud (with sourcetype set to mscs:azure:eventhub; I think it's mandatory to keep this sourcetype).
Then I created props.conf and transforms.conf, but should I define two TRANSFORMS-<class> settings in order to have two different transforms depending on the destination?

My props.conf:
[mscs:azure:eventhub]
TRANSFORMS-anonymize = user-anonymizer

My transforms.conf:
[user-anonymizer]
REGEX = ^(.*?)"\[{\\"UserName\\":[^,]*(.*)
FORMAT = $1"###"$2
DEST_KEY = _raw


Thanks a lot,
Gaétan

PickleRick
SplunkTrust

You could use CLONE_SOURCETYPE to make a "copy" of your event. It would have to work something like this.

1. Your input provides Splunk with an event of a given sourcetype, let's say microsoft:cloud.

2. You do a CLONE_SOURCETYPE to a temporary:sourcetype.

3a. The microsoft:cloud event goes through all the normal ingest steps and you route it to output1 (or simply don't touch anything if it's your default output).

3b. The temporary:sourcetype event gets reinserted into the queue, passes through all the appropriate transforms, and at the end is routed to output2 while you rewrite its sourcetype field back to microsoft:cloud.
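
The steps above could be sketched roughly like this on the HF. Treat it as a minimal, untested sketch, not a verified configuration: the temporary sourcetype name, the transform stanza names (other than user-anonymizer, which is the stanza from the question), the output group names and the server addresses are all hypothetical placeholders.

```ini
# props.conf (on the HF)

# Original events: only clone them, apply nothing else.
[mscs:azure:eventhub]
TRANSFORMS-clone = clone-for-anon

# Cloned events: anonymize, route, then restore the sourcetype.
# "mscs:azure:eventhub:anon_tmp" is a hypothetical temporary sourcetype.
[mscs:azure:eventhub:anon_tmp]
TRANSFORMS-anon = user-anonymizer, route-to-anon, restore-sourcetype

# transforms.conf (on the HF)
# The [user-anonymizer] stanza from the question is assumed to exist unchanged.

# Clone every event into the temporary sourcetype.
[clone-for-anon]
REGEX = .
CLONE_SOURCETYPE = mscs:azure:eventhub:anon_tmp

# Send the clone to the second output group (hypothetical group name).
[route-to-anon]
REGEX = .
DEST_KEY = _TCP_ROUTING
FORMAT = anonymized_indexers

# Rewrite the clone's sourcetype back to the original value.
[restore-sourcetype]
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::mscs:azure:eventhub

# outputs.conf (on the HF); group names and servers are placeholders.
[tcpout]
defaultGroup = full_indexers

[tcpout:full_indexers]
server = idx-full.example.com:9997

[tcpout:anonymized_indexers]
server = idx-anon.example.com:9997
```

In this sketch the untouched original follows the default output group, and only the clone is anonymized and routed explicitly. One thing to watch: per the transforms.conf spec, cloned events also pick up any props.conf stanzas that match on host or source, so index-time settings defined that way would be applied to the clone a second time.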

GaetanVP
Contributor

Hello PickleRick, thanks for the answer!

I followed your instructions and it does the job! 
Thanks again

Azeemering
Builder

Why don't you just anonymize the data at index time using SEDCMD?

Anonymize data - Splunk Documentation

Create an anon app on the indexer where you want the data anonymized, and put the following in its props.conf (SEDCMD expects a sed-style s/regex/replacement/flags expression):

[mscs:azure:eventhub]
SEDCMD-user_anon = s/^(.*?)"\[{\\"UserName\\":[^,]*(.*)/\1"###"\2/

GaetanVP
Contributor

Hello Azeemering, thanks for your answer!

The thing is, I need to be sure that the events leaving the HF are already anonymized, for compliance reasons. And I don't have access to indexer pool no. 2.

As for SEDCMD versus a REGEX-based transform, the two are equivalent if I'm not mistaken.
