Splunk Search

Create a new field and populate it with normalized data?

dbcase
Motivator

Hi,

I have a field in my existing data set called mso. Within that field are company names

Example

CompanyA
CompanyAA1
CompanyABB1
CoB
CoBBBB1
CompC
CompC2

What I would like to do is create a new field called normalized_mso with normalized company names. so that it looks like the below

CompanyA
CoB
CompC

How could I accomplish this?

0 Karma

DalJeanis
Legend

This works against your examples.

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/(^[A-Z]+?[a-z]+)([A-Z]*?[a-z]*?[a-zA-Z]*?$)/\1/g"

You may need to post any examples you have of companies that have numbers, special characters, or all lower case for testing. This works against a few more scenarios...

 | makeresults
 | eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron,blahblahTheblah,Protection1Green,327weirdness Inc,Iceberg Titanic Company"
 | makemv mso delim="," 
 | mvexpand mso
 | eval mso2=mso
 | rex mode=sed field=mso2 "s/^([^a-z]*)([a-z0-9]+)([^a-z0-9\n]*)([a-zA-Z0-9 ]*)$/\1\2/g"
0 Karma

niketn
Legend

@dbcase, if you can use the upper case character in second string as the pattern to end field extraction, you can use the rex command | rex field=<YourFieldNameGoesHere> "(?<mso_new>[A-Z]{1}[^A-Z]+)".
Following is a run anywhere search for you to try out:

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| rex field=mso "(?<mso_new>[A-Z]{1}[^A-Z]+)"

PS: The regular expression means start with Capital letter and extract all characters before find next Upper Case character. You can test the same on regex101.com as well.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

dbcase
Motivator

Hi Niketnilay,

You are a busy man!!! Sadly the real data doesn't necessarily have an uppercase character.

0 Karma

niketn
Legend

@dbcase 😄 LOL I am on leave today, travelling to hometown 🙂

If there is no pattern your option would be to use lookup with wildcard. Refer to this answer https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

woodcock
Esteemed Legend

You need to give us the rules for equation. For example:
All names start with and end with a single capital letter with all-lowercase between.
Everything after the 2nd (ending) capital letter is trash.

0 Karma

dbcase
Motivator

Well the names are kinda all over the map, thats why I made the examples variable.

Right now I'm matching using "like" and while it mostly works, what I really need is a new field in the existing data set.

Here is a better example

Company names (made up)

ExxonUSA
ExxonEurope
ExxonAPAC
ShellUSA
ShellSouthAmerica
ShellNZ
ChevronCalifornia
ChevronIceland
Chevron

I'd like to end up with
Exxon
Shell
Chevron

0 Karma

woodcock
Esteemed Legend

I get the general idea but I need some rules of structure against which to code.

0 Karma
Get Updates on the Splunk Community!

OpenTelemetry for Legacy Apps? Yes, You Can!

This article is a follow-up to my previous article posted on the OpenTelemetry Blog, "Your Critical Legacy App ...

UCC Framework: Discover Developer Toolkit for Building Technology Add-ons

The Next-Gen Toolkit for Splunk Technology Add-on Development The Universal Configuration Console (UCC) ...

.conf25 Community Recap

Hello Splunkers, And just like that, .conf25 is in the books! What an incredible few days — full of learning, ...