Removing all white spaces from event at Index time

Tim_1
Path Finder

Hi all,

I want to remove whitespace from the account value only, not from the whole event, at index time. Is this possible?

Given the events look like this:

{"account": "Account", "justification": "TEST 1", "value": "50"}

{"account": "dev 1", "justification": "TEST 2", "value": "50"}

{"account": "uat test acc", "justification": "TEST 3", "value": "50"}

{"account": "a .. x .. y .. z .. etc", "justification": "TEST 4", "value": "50"}

I want it to look like this:

{"account": "Account", "justification": "TEST 1", "value": "50"}

{"account": "dev1", "justification": "TEST 2", "value": "50"}

{"account": "uattestacc", "justification": "TEST 3", "value": "50"}

{"account": "axyzetc", "justification": "TEST 4", "value": "50"}
1 Solution

cpetterborg
SplunkTrust

The following assumes that your data really contains values like 1 .. n, rather than something like 1 2 3 4 5 6 7 8 9 0. If you only have the latter, a simpler regex would do, but this one should work either way.

You could probably do something like this in props.conf:

SEDCMD-pass1 = s/Account ([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?/Account\1\3\5\7\9/

The replacement uses the inner groups (\1, \3, \5, \7, \9), which hold each token without its leading space. This removes the space after Account and up to four more. If you need to remove more, add a second or third pass that matches Account with no space after it, since pass 1 has already removed that one:

SEDCMD-pass2 = s/Account([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?/Account\1\3\5\7\9/

I haven't completely tested this, but I believe it to be fairly correct. If your event data differs much from this example, it could make things more difficult.
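To check which capture groups in a pattern like the pass1 regex above carry whitespace, a small Python sketch can help. It is only an offline check of the regex, not Splunk configuration: Python's re module stands in for Splunk's sed-style substitution, and the helper names and sample value are hypothetical.

```python
import re

# The pass1 pattern from above, written as a Python raw string.
PATTERN = re.compile(
    r'Account ([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?'
)

def show_groups(value):
    """Return the nine capture groups; unmatched optional groups become ''."""
    match = PATTERN.search(value)
    if match is None:
        return []
    return [group or "" for group in match.groups()]

def join_inner_groups(value):
    """Rebuild each match from 'Account' plus groups 1, 3, 5, 7 and 9."""
    return PATTERN.sub(
        lambda m: "Account" + "".join(m.group(i) or "" for i in (1, 3, 5, 7, 9)),
        value,
    )

if __name__ == "__main__":
    sample = '"account": "Account 1 2 3"'  # hypothetical sample value
    print(show_groups(sample))
    print(join_inner_groups(sample))
```

The even-numbered groups (2, 4, 6, 8) each include the leading whitespace character, while the odd-numbered groups hold only the token, so a replacement has to be built from the odd-numbered groups for the spaces to disappear.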

lfedak_splunk
Splunk Employee

Hey @Tim_1, if they solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!

Tim_1
Path Finder

Hi @Ifedak, will do so once I've found a solution. Thanks 🙂

DalJeanis
Legend

I assume that you mean you want to eliminate all spaces, or all white space, from the account field at index time?

Try something like this in transforms.conf

[stanzaname]
SOURCE_KEY = account
REGEX = ^([^\s]+)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*(.*)$
DEST_KEY = account
FORMAT = $1$3$5$7$9$11

Repeat the pattern ([^\s]*)(\s+)* once for each additional space you want to eliminate, and add the next odd-numbered group to FORMAT. I'm not sure what the highest possible group number is.
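The group numbering in that REGEX can be checked offline with a short Python sketch. This is only an emulation: it assumes the regex is applied to the account value alone, uses Python's re module rather than Splunk, and the function name apply_format is hypothetical.

```python
import re

# The REGEX from the stanza above, copied unchanged into a Python raw string.
ACCOUNT_RE = re.compile(
    r'^([^\s]+)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*(.*)$'
)

def apply_format(account_value):
    """Emulate FORMAT = $1$3$5$7$9$11 by joining the odd-numbered groups."""
    match = ACCOUNT_RE.match(account_value)
    if match is None:
        # For example an empty value, or one that starts with whitespace.
        return account_value
    return "".join(match.group(i) or "" for i in (1, 3, 5, 7, 9, 11))

if __name__ == "__main__":
    for value in ("Account", "dev 1", "uat test acc", "a b c d e f g"):
        print(repr(value), "->", repr(apply_format(value)))
```

With five whitespace groups, any whitespace after the sixth token falls into the final (.*) group and is kept, which is why the pattern has to grow with the number of spaces.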

Tim_1
Path Finder

Hi @DalJeanis,

Thanks for the answer. Is there a way to do it without having to change it depending on the number of spaces? I would prefer not to have to create multiple stanzas, one for each possible number of spaces.

Also, my question wasn't 100% clear about the data I want to reformat. I've updated the question to be more in line with what the data should be.

Tim_1
Path Finder

Hi @cpetterborg,

Thanks for the answer. My question wasn't 100% clear with the examples, so I've updated it to be more in line with what the data should be.

The data won't be integers, but strings.

cpetterborg
SplunkTrust

This should still work with strings of multiple characters.

Tim_1
Path Finder

Yes, got it half working so far.
Thanks for the help. 🙂
Will accept when fully complete.

sbbadri
Motivator

Try this:

| makeresults
| eval test="{\"account\": \"Account 1 2\", \"justification\": \"TEST 1\", \"value\": \"50\"}"
| rex field=test "(?P<t1>{\"account\":\s+)(?P<t2>\"Account\s+\S+.*\")(?P<t3>\,\s+\"justification\":\s+\"TEST\s+\d+\"\,\s+\"value\":\s+\"\d+\"})"
| rex field=t2 mode=sed "s/ //g"
| eval t4=t1+t2
| eval t5=t4+t3
| rename t5 as test
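The same split, strip and rejoin steps can be sketched in Python for testing outside Splunk. The regex here is a simplified stand-in for the rex pattern above, not a copy of it, and the names EVENT_PARTS and remove_account_spaces are hypothetical.

```python
import re

# Three parts: everything up to the account value, the quoted value itself,
# and the rest of the event (simplified compared with the rex pattern above).
EVENT_PARTS = re.compile(r'^(\{"account":\s+)("[^"]*")(,.*)$')

def remove_account_spaces(raw_event):
    """Strip spaces from the quoted account value and rejoin the event."""
    match = EVENT_PARTS.match(raw_event)
    if match is None:
        return raw_event  # leave events with an unexpected layout untouched
    head, account, tail = match.groups()
    # Equivalent of the sed expression s/ //g applied to the middle part only.
    return head + account.replace(" ", "") + tail

if __name__ == "__main__":
    event = '{"account": "Account 1 2", "justification": "TEST 1", "value": "50"}'
    print(remove_account_spaces(event))
```

Because only the middle part is rewritten, the spaces in the justification value survive.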

Tim_1
Path Finder

Hi @sbbadri,

Thanks for the answer, but I am looking at doing this at index time and not at search time.
