Splunk Search

Create a new field and populate it with normalized data?

dbcase
Motivator

Hi,

I have a field in my existing data set called mso. Within that field are company names

Example

CompanyA
CompanyAA1
CompanyABB1
CoB
CoBBBB1
CompC
CompC2

What I would like to do is create a new field called normalized_mso with normalized company names. so that it looks like the below

CompanyA
CoB
CompC

How could I accomplish this?

0 Karma

DalJeanis
Legend

This works against your examples.

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/(^[A-Z]+?[a-z]+)([A-Z]*?[a-z]*?[a-zA-Z]*?$)/\1/g"

You may need to post any examples you have of companies that have numbers, special characters, or all lower case for testing. This works against a few more scenarios...

 | makeresults
 | eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron,blahblahTheblah,Protection1Green,327weirdness Inc,Iceberg Titanic Company"
 | makemv mso delim="," 
 | mvexpand mso
 | eval mso2=mso
 | rex mode=sed field=mso2 "s/^([^a-z]*)([a-z0-9]+)([^a-z0-9\n]*)([a-zA-Z0-9 ]*)$/\1\2/g"
0 Karma

niketn
Legend

@dbcase, if you can use the upper case character in second string as the pattern to end field extraction, you can use the rex command | rex field=<YourFieldNameGoesHere> "(?<mso_new>[A-Z]{1}[^A-Z]+)".
Following is a run anywhere search for you to try out:

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| rex field=mso "(?<mso_new>[A-Z]{1}[^A-Z]+)"

PS: The regular expression means start with Capital letter and extract all characters before find next Upper Case character. You can test the same on regex101.com as well.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

dbcase
Motivator

Hi Niketnilay,

You are a busy man!!! Sadly the real data doesn't necessarily have an uppercase character.

0 Karma

niketn
Legend

@dbcase 😄 LOL I am on leave today, travelling to hometown 🙂

If there is no pattern your option would be to use lookup with wildcard. Refer to this answer https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

woodcock
Esteemed Legend

You need to give us the rules for equation. For example:
All names start with and end with a single capital letter with all-lowercase between.
Everything after the 2nd (ending) capital letter is trash.

0 Karma

dbcase
Motivator

Well the names are kinda all over the map, thats why I made the examples variable.

Right now I'm matching using "like" and while it mostly works, what I really need is a new field in the existing data set.

Here is a better example

Company names (made up)

ExxonUSA
ExxonEurope
ExxonAPAC
ShellUSA
ShellSouthAmerica
ShellNZ
ChevronCalifornia
ChevronIceland
Chevron

I'd like to end up with
Exxon
Shell
Chevron

0 Karma

woodcock
Esteemed Legend

I get the general idea but I need some rules of structure against which to code.

0 Karma
Get Updates on the Splunk Community!

Upcoming Webinar: Unmasking Insider Threats with Slunk Enterprise Security’s UEBA

Join us on Wed, Dec 10. at 10AM PST / 1PM EST for a live webinar and demo with Splunk experts! Discover how ...

.conf25 technical session recap of Observability for Gen AI: Monitoring LLM ...

If you’re unfamiliar, .conf is Splunk’s premier event where the Splunk community, customers, partners, and ...

A Season of Skills: New Splunk Courses to Light Up Your Learning Journey

There’s something special about this time of year—maybe it’s the glow of the holidays, maybe it’s the ...