Regex - match all occurrences of a character

kenchisho
Path Finder

Hi guys,

I have been trying to match multiple occurrences of a pattern and replace them using a regex in transforms.conf.

sample data
Kenan-xMuharemagic-x-xkenan@neseco.ba-x-x

I am trying to match the pattern "-x" and replace it with "_". It works perfectly for the first match, but not for the remaining matches in the event. This is not a multiline event, and the characters I wish to replace may appear multiple times, even within a single word.

Desired Result
Kenan_ Muharemagic_ _ kenan@neseco.ba_ _ (excuse the " " at _)

I was able to do this earlier at search time with the max_matches parameter. This time I am trying to replace these characters before indexing, as you would when anonymizing credit card numbers.

Is there a way to tell Splunk to replace all matches of a pattern in transforms.conf?


kenchisho
Path Finder

I've already tried that. I've been playing with this all day.

The case is that I am indexing a binary-encoded log file. Splunk indexes all ASCII characters without a problem, but a few non-ASCII characters are indexed as "\x6\xD1", which should be "Đ".

I've tried modifying the CHARSET but the only one that works is CP852, which is sadly not supported by Splunk.

As for SED, I have not been able to match the pattern with a sed regex within Splunk. However, when using standalone regex tools or sed on the OS X command line, I can match and replace the patterns without problems.

I have managed to work around this using transforms.conf with a regex and then applying that transform in props.conf multiple times (e.g. 10 times for a possible 10 repetitions of the same character in one event). This is a very ugly workaround and I will try to find another way.

Example:

raw data: \x6\xD1\x6\xD1\x6\xD1\x6\xD1\x6\xD1

transforms.conf

[bin2text]
REGEX = (?)(.*)\\x6\\xD1(.*)
FORMAT = $1Đ$2

props.conf
[sourcetype]
TRANSFORMS-test = bin2text, bin2text, bin2text, bin2text, bin2text

result data: ĐĐĐĐĐ

As I said, it is a very ugly solution, but it is the only one I got working. I'm open to suggestions if anyone has an idea.


jonuwz
Influencer

See the "Anonymizing data" documentation.

Look at the SEDCMD section and the "global" g flag.
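
As a rough sketch of what that suggestion could look like in props.conf: the stanza name your_sourcetype and the class names after "SEDCMD-" are placeholders, not names from this thread. The second rule assumes the raw event really contains the literal characters \x6\xD1 (backslashes included), which is why each backslash is escaped. If the file actually holds raw bytes, the pattern would need to differ.

```
# props.conf, applied at index time on the instance that parses the data
# "your_sourcetype" is a placeholder for the real sourcetype name.
[your_sourcetype]
# Replace every "-x" with "_". The trailing g makes sed replace all matches,
# not only the first one in the event.
SEDCMD-replace_dash_x = s/-x/_/g
# Assumption: the event text contains the literal sequence \x6\xD1,
# so each backslash is escaped in the regex.
SEDCMD-bin2text = s/\\x6\\xD1/Đ/g
```

The transforms.conf approach with (.*) on both sides rewrites only one occurrence per pass, which is why it had to be listed several times. A sed-style substitution with the g flag handles any number of repetitions in one rule.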
