<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Remove common words from two fields and keep unique values in Splunk Dev</title>
    <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320384#M4447</link>
    <description>&lt;P&gt;Thanks @knielsen, that was really helpful&lt;/P&gt;</description>
    <pubDate>Thu, 20 Jul 2017 13:50:41 GMT</pubDate>
    <dc:creator>samvijay</dc:creator>
    <dc:date>2017-07-20T13:50:41Z</dc:date>
    <item>
      <title>Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320380#M4443</link>
      <description>&lt;P&gt;Here is an interesting problem. I tried different approaches using regex, mvdedup, coalesce, etc., but none of them worked. I need guidance from the experts.&lt;/P&gt;

&lt;P&gt;I have two fields, field1 and field2, from the same event. field1 has the value "I want to buy a &lt;STRONG&gt;book&lt;/STRONG&gt;" and field2 has the value "I want to buy a &lt;STRONG&gt;phone&lt;/STRONG&gt;".&lt;/P&gt;

&lt;P&gt;As you can see, the contents of both fields are the same except for the words &lt;STRONG&gt;book&lt;/STRONG&gt; and &lt;STRONG&gt;phone&lt;/STRONG&gt;. I want a result like the one below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;field1     field2
book      phone
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I am simplifying the problem. In reality each field can contain a whole paragraph, but only a few words will be unique to each field, and those are the words I want to extract.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jul 2017 04:54:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320380#M4443</guid>
      <dc:creator>samvijay</dc:creator>
      <dc:date>2017-07-20T04:54:23Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320381#M4444</link>
      <description>&lt;P&gt;Now that's a fun challenge! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;I only came up with, well, not very elegant solutions. I am sure there will be a better answer, but anyway...&lt;/P&gt;

&lt;P&gt;Are the strings guaranteed to have their words in the same order, apart from the ones that differ? If so, this is one approach:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval field1="I want to buy a book now" | eval field2="I want to buy a phone now" 
| makemv delim=" " field1 |makemv delim=" " field2| eval comb=mvzip(field1, field2) | mvexpand comb | rex field=comb "(?&amp;lt;field1&amp;gt;[^,]+),(?&amp;lt;field2&amp;gt;.+)" | where NOT field1=field2 | table field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If the words might appear in a different order, this is an approach:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval field1="I want to a book buy now" | eval field2="I want to buy a phone now" 
| makemv delim=" " field1 |makemv delim=" " field2 | eval field1sav=field1| mvexpand field1 | eval n=if(match(field2,field1),1,0) | where n=0 | mvexpand field2 | eval n=if(match(field1sav,field2),1,0) | where n=0 | table field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;
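
&lt;P&gt;A possible variation, offered as an untested sketch and not as part of the original answer: on Splunk 8.0 or later, where the eval function mvmap is available, the same per-word membership check can stay inside a single event instead of going through mvexpand. The field names w1, w2, only1 and only2 are made up for illustration. Because mvfind takes a regular expression, this assumes the words contain no regex metacharacters; otherwise they would need escaping first.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval field1="I want to a book buy now", field2="I want to buy a phone now"
| eval w1=split(field1, " "), w2=split(field2, " ")
| eval only1=mvmap(w1, if(isnull(mvfind(w2, "^".w1."$")), w1, null()))
| eval only2=mvmap(w2, if(isnull(mvfind(w1, "^".w2."$")), w2, null()))
| eval field1=mvjoin(only1, ","), field2=mvjoin(only2, ",")
| table field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Each mvmap call keeps a word only when mvfind finds no exact match for it in the other field, so the comparison depends on neither word order nor word count.&lt;/P&gt;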

&lt;P&gt;The mvexpand-based queries above are still not stable when either string contains extra words, but maybe they help in finding a better solution. I am sure the regulars will jump in later. &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jul 2017 10:04:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320381#M4444</guid>
      <dc:creator>knielsen</dc:creator>
      <dc:date>2017-07-20T10:04:51Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320382#M4445</link>
      <description>&lt;P&gt;Thanks @knielsen. The strings are guaranteed to have the same order, but there can be several mismatches, as in the example below. I need to get them in the same row, separated by commas.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval field1="I want to buy a book may be now" | eval field2="I want to buy a phone may be tomorrow" 
 | makemv delim=" " field1 |makemv delim=" " field2| eval comb=mvzip(field1, field2) | mvexpand comb | rex field=comb "(?&amp;lt;field1&amp;gt;[^,]+),(?&amp;lt;field2&amp;gt;.+)" | where NOT field1=field2 | table field1 field2


field1                                   field2
book, now                          phone, tomorrow
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Jul 2017 12:07:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320382#M4445</guid>
      <dc:creator>samvijay</dc:creator>
      <dc:date>2017-07-20T12:07:10Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320383#M4446</link>
      <description>&lt;P&gt;I'll probably feel stupid when someone posts a pretty solution, but this gives the correct result:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval field1="I want to buy a book may be now" | eval field2="I want to buy a phone may be tomorrow" 
  | makemv delim=" " field1 |makemv delim=" " field2| eval comb=mvzip(field1, field2) | mvexpand comb | rex field=comb "(?&amp;lt;field1&amp;gt;[^,]+),(?&amp;lt;field2&amp;gt;.+)" | where NOT field1=field2 | stats list(field1) as field1 list(field2) as field2 | eval field1=mvjoin(field1, ",") | eval field2=mvjoin(field2, ",")
&lt;/CODE&gt;&lt;/PRE&gt;
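
&lt;P&gt;One thing to watch, assuming the real search returns more than one event: stats list(...) without a by clause collects the mismatched words from all events into a single row. A minimal sketch of one way around that is to number the events before mvexpand and group by that number afterwards. The names row, word1 and word2 and the second sample sentence are made up for illustration.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=2
| streamstats count as row
| eval field1=if(row=1, "I want to buy a book may be now", "Please send the red shirt today")
| eval field2=if(row=1, "I want to buy a phone may be tomorrow", "Please send the blue shirt tonight")
| eval comb=mvzip(split(field1, " "), split(field2, " "))
| mvexpand comb
| rex field=comb "(?&amp;lt;word1&amp;gt;[^,]+),(?&amp;lt;word2&amp;gt;.+)"
| where word1!=word2
| stats list(word1) as field1 list(word2) as field2 by row
| eval field1=mvjoin(field1, ","), field2=mvjoin(field2, ",")
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Like the query above, this relies on mvzip's default comma delimiter, so it assumes the individual words contain no commas.&lt;/P&gt;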

&lt;P&gt;&lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jul 2017 12:27:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320383#M4446</guid>
      <dc:creator>knielsen</dc:creator>
      <dc:date>2017-07-20T12:27:46Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320384#M4447</link>
      <description>&lt;P&gt;Thanks @knielsen, that was really helpful&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jul 2017 13:50:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320384#M4447</guid>
      <dc:creator>samvijay</dc:creator>
      <dc:date>2017-07-20T13:50:41Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320385#M4448</link>
      <description>&lt;P&gt;@samvijay, this is answered already. I am just throwing in another option:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval field1="I want to buy a book by today" 
| eval field2="I want to buy a phone by tomorrow"
| eval arrField1=split(field1," ")
| eval arrField2=split(field2," ")
| eval combined=mvzip(arrField1, arrField2)
| table combined
| mvexpand combined
| eval field1=replace(combined,"([^,]+),(.+)","\1")
| eval field2=replace(combined,"([^,]+),(.+)","\2")
| table field1 field2
| where field1!=field2
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Jul 2017 15:12:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320385#M4448</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2017-07-20T15:12:29Z</dc:date>
    </item>
    <item>
      <title>Re: Remove common words from two fields and keep unique values</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320386#M4449</link>
      <description>&lt;P&gt;Thanks @niketnilay&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jul 2017 06:49:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Remove-common-words-from-two-fields-and-keep-unique-values/m-p/320386#M4449</guid>
      <dc:creator>samvijay</dc:creator>
      <dc:date>2017-07-21T06:49:29Z</dc:date>
    </item>
  </channel>
</rss>

