<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Create a new field and populate it with normalized data? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331311#M164492</link>
    <description>&lt;P&gt;@dbcase, if the second uppercase letter in each value can mark the end of the company name, you can extract the name with the rex command &lt;CODE&gt;| rex field=&amp;lt;YourFieldNameGoesHere&amp;gt; "(?&amp;lt;mso_new&amp;gt;[A-Z]{1}[^A-Z]+)"&lt;/CODE&gt;.&lt;BR /&gt;
The following is a run-anywhere search for you to try out:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| rex field=mso "(?&amp;lt;mso_new&amp;gt;[A-Z]{1}[^A-Z]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;PS: The regular expression matches one capital letter followed by every character up to, but not including, the next uppercase letter. You can test it on regex101.com as well.&lt;/P&gt;</description>
    <pubDate>Fri, 08 Dec 2017 02:55:04 GMT</pubDate>
    <dc:creator>niketn</dc:creator>
    <dc:date>2017-12-08T02:55:04Z</dc:date>
    <item>
      <title>Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331308#M164489</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have a field in my existing data set called mso.  Within that field are company names&lt;/P&gt;

&lt;P&gt;Example&lt;/P&gt;

&lt;P&gt;CompanyA&lt;BR /&gt;
CompanyAA1&lt;BR /&gt;
CompanyABB1&lt;BR /&gt;
CoB&lt;BR /&gt;
CoBBBB1&lt;BR /&gt;
CompC&lt;BR /&gt;
CompC2&lt;/P&gt;

&lt;P&gt;What I would like to do is create a new field called normalized_mso with normalized company names, so that it looks like the following:&lt;/P&gt;

&lt;P&gt;CompanyA&lt;BR /&gt;
CoB&lt;BR /&gt;
CompC&lt;/P&gt;

&lt;P&gt;How could I accomplish this?&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 00:47:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331308#M164489</guid>
      <dc:creator>dbcase</dc:creator>
      <dc:date>2017-12-08T00:47:35Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331309#M164490</link>
      <description>&lt;P&gt;You need to give us the rules for treating two names as equal.  For example:&lt;BR /&gt;
All names start with and end with a single capital letter with all-lowercase between.&lt;BR /&gt;
Everything after the 2nd (ending) capital letter is trash.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 00:57:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331309#M164490</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-12-08T00:57:44Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331310#M164491</link>
      <description>&lt;P&gt;Well, the names are kinda all over the map; that's why I made the examples variable.&lt;/P&gt;

&lt;P&gt;Right now I'm matching using "like" and while it mostly works, what I really need is a new field in the existing data set.&lt;/P&gt;

&lt;P&gt;Here is a better example&lt;/P&gt;

&lt;P&gt;Company names (made up)&lt;/P&gt;

&lt;P&gt;ExxonUSA&lt;BR /&gt;
ExxonEurope&lt;BR /&gt;
ExxonAPAC&lt;BR /&gt;
ShellUSA&lt;BR /&gt;
ShellSouthAmerica&lt;BR /&gt;
ShellNZ&lt;BR /&gt;
ChevronCalifornia&lt;BR /&gt;
ChevronIceland&lt;BR /&gt;
Chevron&lt;/P&gt;

&lt;P&gt;I'd like to end up with &lt;BR /&gt;
Exxon&lt;BR /&gt;
Shell&lt;BR /&gt;
Chevron&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 01:07:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331310#M164491</guid>
      <dc:creator>dbcase</dc:creator>
      <dc:date>2017-12-08T01:07:18Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331311#M164492</link>
      <description>&lt;P&gt;@dbcase, if the second uppercase letter in each value can mark the end of the company name, you can extract the name with the rex command &lt;CODE&gt;| rex field=&amp;lt;YourFieldNameGoesHere&amp;gt; "(?&amp;lt;mso_new&amp;gt;[A-Z]{1}[^A-Z]+)"&lt;/CODE&gt;.&lt;BR /&gt;
The following is a run-anywhere search for you to try out:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| rex field=mso "(?&amp;lt;mso_new&amp;gt;[A-Z]{1}[^A-Z]+)"
&lt;/CODE&gt;&lt;/PRE&gt;
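
&lt;P&gt;A minimal sketch of how the extraction could feed the &lt;CODE&gt;normalized_mso&lt;/CODE&gt; field named in the original question. It assumes that a value with no uppercase letter (the made-up &lt;CODE&gt;exxonusa&lt;/CODE&gt; below) should keep its original text, since &lt;CODE&gt;rex&lt;/CODE&gt; leaves &lt;CODE&gt;mso_new&lt;/CODE&gt; empty when the pattern does not match:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ShellNZ,Chevron,exxonusa"
| makemv mso delim=","
| mvexpand mso
| rex field=mso "(?&amp;lt;mso_new&amp;gt;[A-Z]{1}[^A-Z]+)"
| eval normalized_mso=coalesce(mso_new, mso)
| table mso mso_new normalized_mso
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With this sample, ExxonUSA and ShellNZ should come out as Exxon and Shell, while Chevron and exxonusa stay unchanged.&lt;/P&gt;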

&lt;P&gt;PS: The regular expression matches one capital letter followed by every character up to, but not including, the next uppercase letter. You can test it on regex101.com as well.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 02:55:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331311#M164492</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2017-12-08T02:55:04Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331312#M164493</link>
      <description>&lt;P&gt;Hi Niketnilay,&lt;/P&gt;

&lt;P&gt;You are a busy man!!!  Sadly the real data doesn't necessarily have an uppercase character.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 04:11:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331312#M164493</guid>
      <dc:creator>dbcase</dc:creator>
      <dc:date>2017-12-08T04:11:38Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331313#M164494</link>
      <description>&lt;P&gt;@dbcase &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt; LOL I am on leave today, travelling to hometown &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;If there is no pattern, your option would be to use a lookup with wildcards. Refer to this answer: &lt;A href="https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html"&gt;https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 04:22:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331313#M164494</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2017-12-08T04:22:07Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331314#M164495</link>
      <description>&lt;P&gt;I get the general idea but I need some rules of structure against which to code.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Dec 2017 06:09:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331314#M164495</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-12-08T06:09:58Z</dc:date>
    </item>
    <item>
      <title>Re: Create a new field and populate it with normalized data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331315#M164496</link>
      <description>&lt;P&gt;This works against your examples.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/(^[A-Z]+?[a-z]+)([A-Z]*?[a-z]*?[a-zA-Z]*?$)/\1/g"
&lt;/CODE&gt;&lt;/PRE&gt;
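
&lt;P&gt;One way to check the result is to group the original names by the normalized value. This is only a sketch: it reuses a shortened version of the sample data, and the output field name &lt;CODE&gt;variants&lt;/CODE&gt; is made up for illustration:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ExxonEurope,ShellSouthAmerica,ShellNZ,Chevron"
| makemv mso delim=","
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/(^[A-Z]+?[a-z]+)([A-Z]*?[a-z]*?[a-zA-Z]*?$)/\1/g"
| stats values(mso) AS variants BY mso2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Each row should then show one normalized name in &lt;CODE&gt;mso2&lt;/CODE&gt; alongside the original values that collapsed into it.&lt;/P&gt;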

&lt;HR /&gt;

&lt;P&gt;You may need to post any examples you have of companies that have numbers, special characters, or all lower case for testing. This works against a few more scenarios...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron,blahblahTheblah,Protection1Green,327weirdness Inc,Iceberg Titanic Company"
| makemv mso delim="," 
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/^([^a-z]*)([a-z0-9]+)([^a-z0-9\n]*)([a-zA-Z0-9 ]*)$/\1\2/g"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 08 Dec 2017 18:06:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Create-a-new-field-and-populate-it-with-normalized-data/m-p/331315#M164496</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-12-08T18:06:02Z</dc:date>
    </item>
  </channel>
</rss>

