Using transforms to replace _raw data vs SEDCMD

Runals
Motivator

I have a group that has Windows object access auditing turned on for the wrong things, which is generating a ton of events. Instead of simply dropping those events on the floor I'd like to bring them in BUT replace basically 100% of the log with a 'placeholder' event. The idea is they (and I) can use Splunk to see when the audit config has been updated. I initially used SEDCMD in props, and while that works like a champ, the regex replacement function on my indexers is going bonkers. I'd like to transition this to an actual transforms stanza but am not having much success with something that should otherwise be pretty straightforward.

The working SEDCMD in the props is as follows

[WinEventLog:Security]
SEDCMD-4658_Exchange_store = s/(?ims)(.*EventCode=4658.*Exchange\\Bin\\store\.exe)/Placeholder event for EventCode=4658 where Process_Name=E:\\Exchange\\Bin\\store.exe/g

Here is what I've tried with the transforms route

Props

[WinEventLog:Security]
TRANSFORMS-winsec_events_replace = replace4658_exchange_store

Transforms

[replace4658_exchange_store]
REGEX = (?ms).*EventCode=4658.*Exchange\\Bin\\store\.exe
DEST_KEY = _raw
FORMAT = Placeholder

What I want is more text than 'Placeholder' but at this point I'm just trying to get something working.

1 Solution

martin_mueller
SplunkTrust

Your regexprocessor is going bonkers because your expression starts with .*. That matches everything and creates enormous backtracking effort.

You can drop those two chars from the front of your REGEX in transforms.conf for no logical difference and should see your load go right down. That won't work with SEDCMD because you're capturing those chars there.
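
One way to check the "no logical difference" claim is to run both patterns over a few events and compare the match decisions. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, and the sample events are made up for illustration, so treat it as a sanity check rather than a reproduction of indexer behavior.

```python
import re

# Patterns from the thread. Python's re stands in for Splunk's engine (assumption).
WITH_LEADING = re.compile(r"(?ms).*EventCode=4658.*Exchange\\Bin\\store\.exe")
WITHOUT_LEADING = re.compile(r"(?ms)EventCode=4658.*Exchange\\Bin\\store\.exe")

# Hypothetical sample events, not real Windows log output.
SAMPLES = [
    "LogName=Security\nEventCode=4658\nProcess Name: E:\\Exchange\\Bin\\store.exe\n",
    "LogName=Security\nEventCode=4658\nProcess Name: C:\\Windows\\System32\\svchost.exe\n",
    "LogName=Security\nEventCode=4624\nProcess Name: E:\\Exchange\\Bin\\store.exe\n",
]


def decisions(pattern):
    """Return one True/False per sample: does an unanchored search match?"""
    return [pattern.search(event) is not None for event in SAMPLES]


print(decisions(WITH_LEADING))
print(decisions(WITHOUT_LEADING))
```

Both lines should print the same list. An unanchored search already tries every start position, so the leading .* only adds work without changing which events match.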

yannK
Splunk Employee

I really think that index-time transforms cannot be reloaded, because the indexing pipeline is already running. So a restart is the way to apply new settings.

PS: search-time transforms can be reloaded.

martin_mueller
SplunkTrust

Does that reload index-time configuration as well?

You can hit http://hostname:port/debug/refresh to do a lot of reloading as well, but that does not cover index-time config such as SEDCMD or _raw rewrites through transforms.conf.

Runals
Motivator

Thanks for fixing the image, though that thing is a monster lol. For 6.x, run the following search to reload your props:

| rest /servicesNS/-/-/configs/conf-props/_reload

followed by:

| rest /servicesNS/-/-/admin/monitor/_reload

That wasn't my discovery, but it's a nifty little trick 😃

martin_mueller
SplunkTrust

Index-time configuration indeed cannot be reloaded; it requires a restart.

I've fixed your image a bit; .tiff appears to break.

Runals
Motivator

In 6.0.2 it looks like you can't use a REST command to reload your transforms. It took restarting my indexers to get the transforms replacement to work. The impact is pretty dramatic when you look at Splunk processor activity before and after the change (though I can't get the picture to display). The other piece, as triest pointed out: in the REGEX I posted I wasn't capturing anything, which appears to be a must in 6.x even if you don't end up using the capture.

The SEDCMD was running against our Windows security logs which, like many, constitute a fairly sizable chunk of volume. I ended up using the following which drops the avg byte count from 632 to 70.

Props

[WinEventLog:Security]
TRANSFORMS-winsec_events_shorten = replace4658
EXTRACT-winsec_4658_custom_fields = ^Trimmed Event EventCode=(?<EventCode>\S+) Handle_ID=(?<Handle_ID>\S+) Process_ID=(?<Process_ID>\S+)

Transforms

[replace4658]
REGEX = (?ms)EventCode=(4658).*?Handle ID:\s+(\S+).*?Process ID:\s+(\S+)
DEST_KEY = _raw
FORMAT = Trimmed Event EventCode=$1 Handle_ID=$2 Process_ID=$3
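
To see what this stanza does to an event, here is a rough emulation in Python. It assumes, as the thread describes, that a match replaces the whole of _raw with FORMAT and that $1 through $3 are filled from the capture groups. The sample event and the rewrite_raw helper are hypothetical, and Python's re is not the engine Splunk uses, so this is only a sketch for testing the pattern offline.

```python
import re

# REGEX and FORMAT copied from the [replace4658] stanza above.
REGEX = re.compile(r"(?ms)EventCode=(4658).*?Handle ID:\s+(\S+).*?Process ID:\s+(\S+)")
FORMAT = "Trimmed Event EventCode=$1 Handle_ID=$2 Process_ID=$3"


def rewrite_raw(raw):
    """Hypothetical helper: replace the event with FORMAT when REGEX matches."""
    match = REGEX.search(raw)
    if match is None:
        return raw  # non-matching events pass through unchanged
    # Substitute $1..$3 with the corresponding capture groups.
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))), FORMAT)


# Made-up, heavily shortened 4658 event for illustration only.
sample = (
    "LogName=Security\nEventCode=4658\n"
    "Message=The handle to an object was closed.\n"
    "Object:\n\tHandle ID:\t0x1a2c\n"
    "Process Information:\n\tProcess ID:\t0x9d4\n"
    "\tProcess Name:\tE:\\Exchange\\Bin\\store.exe\n"
)

print(rewrite_raw(sample))
```

With this sample the rewritten event is a single short line, which is the same effect as the byte-count drop described above.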

martin_mueller
SplunkTrust

Got a before/after comparison of regexprocessor loads in SoS's indexing performance view?

You might see tiny further gains by making the middle .* non-greedy like this:

REGEX = (?ms)EventCode=4658.*?Exchange\\Bin\\store\.exe

I don't expect anywhere near as much change, though, compared to removing the leading .*.
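
The difference between the greedy and non-greedy middle section can be seen by comparing where each match ends. This is a small sketch with Python's re and a made-up event in which the path appears twice; it shows matching behavior only and says nothing about actual indexer load.

```python
import re

GREEDY = re.compile(r"(?ms)EventCode=4658.*Exchange\\Bin\\store\.exe")
LAZY = re.compile(r"(?ms)EventCode=4658.*?Exchange\\Bin\\store\.exe")

# Hypothetical event text in which the path appears twice.
event = (
    "EventCode=4658\n"
    "Process Name: E:\\Exchange\\Bin\\store.exe\n"
    "Note: E:\\Exchange\\Bin\\store.exe\n"
)

# The greedy .* runs to the end of the event and backs up to the last
# occurrence; the lazy .*? stops at the first occurrence it reaches.
greedy_end = GREEDY.search(event).end()
lazy_end = LAZY.search(event).end()
print(lazy_end, greedy_end)
```

Either pattern matches the event, so the transform would fire in both cases; the non-greedy form simply stops sooner.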

Runals
Motivator

Good call! Thanks for that additional piece. I had copied over a section of regex where we were stripping out some of the crap Windows message text which calls for capturing the first 2/3 of the event.

triest
Communicator

I believe you tried this, so I doubt it's the only issue, but wouldn't you need to capture something in your REGEX even if you don't use it?

Runals
Motivator

Sed is disabled at this point. In the interest of full disclosure I'm trying to be sneaky and reload the props and transforms with a REST command so I don't have to restart my indexers. This works for props - haven't fully tested it for transforms.

yannK
Splunk Employee

If you have a sample of the events, it will help with testing.
At first sight, the transform looks correct. Have you restarted the indexer to apply it? Are you currently using both sed and transforms?
