Splunk Search

Regex - match all occurrences of a character

kenchisho
Path Finder

Hi guys,

I have been trying to match multiple occurrences of a pattern and replace them using a regex in transforms.conf.

sample data
Kenan-xMuharemagic-x-xkenan@neseco.ba-x-x

I am trying to match the pattern "-x" and replace it with "_". It works perfectly for the first match, but the remaining matches in the event are not replaced. This is not a multiline event, and the characters I wish to replace may appear multiple times, even in a single word.

Desired Result
Kenan_ Muharemagic_ _ kenan@neseco.ba_ _ (excuse the " " at _)

I was able to do this earlier with the max_matches parameter at search time. This time I am trying to replace these characters before indexing, as you would when anonymizing credit card numbers.

Is there a way to tell Splunk to replace all matches of a pattern in transforms.conf?


kenchisho
Path Finder

I've already tried that. I've been playing with this all day.

The situation is that I am indexing a binary-encoded log file. Splunk indexes all ASCII characters without a problem, but a few non-ASCII characters are indexed as "\x6\xD1", which should be "Đ".

I've tried modifying the CHARSET but the only one that works is CP852, which is sadly not supported by Splunk.

As for sed, I have not been able to match the pattern with a sed regex within Splunk. However, when using standalone regex tools or the OS X command-line sed, I can match and replace the patterns without problems.

I have managed to work around this using transforms.conf with a regex and then applying that transform in props.conf multiple times, since each pass replaces only one occurrence (e.g. 10 times for a possible 10 repetitions of the same character in one event). This is a very ugly workaround and I will try to find another way.

Example:

raw data: \x6\xD1\x6\xD1\x6\xD1\x6\xD1\x6\xD1

transforms.conf

[bin2text]
REGEX = (?)(.*)\x6\\xD1(.*)
FORMAT = $1Đ$2

props.conf
[sourcetype]
TRANSFORMS-test = bin2text, bin2text, bin2text, bin2text, bin2text

result data: ĐĐĐĐĐ

As I said, it is a very ugly solution, but it is the only one I got working. I'm open to suggestions if someone has an idea.


jonuwz
Influencer

See the Splunk documentation on anonymizing data.

Look at the SEDCMD section and the "global" g flag.
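
A minimal sketch of what that could look like in props.conf for the "-x" example from the question. The stanza name your_sourcetype and the class name replace_dash_x are placeholders, not names from this thread. The trailing g flag asks sed to replace every match in the event rather than only the first.

```ini
# props.conf
# [your_sourcetype] is a hypothetical stanza name; use the real sourcetype.
[your_sourcetype]
# Index-time sed-style substitution applied to the raw event text.
# s/<regex>/<replacement>/g : the g flag replaces all matches, not just the first.
SEDCMD-replace_dash_x = s/-x/_/g
```

This would have to be placed on the instance that parses the data (an indexer or heavy forwarder), and it only affects events indexed after the change. Whether the same approach can match the "\x6\xD1" sequences from the follow-up post is not confirmed here.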
