<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Removing duplicate substrings from a multivalue field? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154001#M43274</link>
    <description>&lt;P&gt;I updated the original post to clarify. Thank you.&lt;/P&gt;</description>
    <pubDate>Thu, 11 Jun 2015 14:49:01 GMT</pubDate>
    <dc:creator>smlrwd</dc:creator>
    <dc:date>2015-06-11T14:49:01Z</dc:date>
    <item>
      <title>Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/153999#M43272</link>
      <description>&lt;P&gt;Hello everyone,&lt;/P&gt;

&lt;P&gt;I am creating a custom asset inventory and am combining data from multiple sources. These sources don't return the same OS version and a multivalue field is created in the process. I tabled out the OS version to take a look at the different multivalue fields. Below is an example of one of the multivalue entries in the OS field. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Win XP
Win XP Pro
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;How would I go about matching the smaller string to the larger one and removing it?&lt;/P&gt;

&lt;P&gt;Edit:&lt;BR /&gt;
To clarify, I want to remove the less descriptive OS version if it matches another version already in the list. In the example above I want to remove "Win XP".&lt;/P&gt;

&lt;P&gt;A more difficult example is:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Win XP
Win SRV 2008
Win XP Pro
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With an output of &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Win XP Pro
Win SRV 2008
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Jun 2015 14:11:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/153999#M43272</guid>
      <dc:creator>smlrwd</dc:creator>
      <dc:date>2015-06-11T14:11:32Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154000#M43273</link>
      <description>&lt;P&gt;Please help me understand: in your table above, which value do you want to remove?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jun 2015 14:28:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154000#M43273</guid>
      <dc:creator>stephanefotso</dc:creator>
      <dc:date>2015-06-11T14:28:20Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154001#M43274</link>
      <description>&lt;P&gt;I updated the original post to clarify. Thank you.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jun 2015 14:49:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154001#M43274</guid>
      <dc:creator>smlrwd</dc:creator>
      <dc:date>2015-06-11T14:49:01Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154002#M43275</link>
      <description>&lt;P&gt;If we assume that the shortest is always wrong, and that your unique field is &lt;CODE&gt;host&lt;/CODE&gt; and your OS field is &lt;CODE&gt;OS&lt;/CODE&gt;, then you can do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | mvexpand OS | eval lenOS=len(OS) | eventstats max(lenOS) AS maxOSlen by host | eval bestOSvalue=if((lenOS==maxOSlen),OS,null()) | stats values(*) AS * by host,bestOSvalue | fields - OS | rename bestOSvalue AS "OS"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Jun 2015 15:06:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154002#M43275</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-11T15:06:19Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154003#M43276</link>
      <description>&lt;P&gt;This works perfectly for workstations. The only problem is that this method doesn't work well for the servers that report multiple OSes. Any ideas on how this could be modified to work with:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Win SRV 2008
Win XP
Win XP Pro
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Jun 2015 15:22:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154003#M43276</guid>
      <dc:creator>smlrwd</dc:creator>
      <dc:date>2015-06-11T15:22:53Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154004#M43277</link>
      <description>&lt;P&gt;You forgot to mention your desired output but I assume it would be:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Win SRV 2008
Win XP Pro
&lt;/CODE&gt;&lt;/PRE&gt;
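
&lt;P&gt;As a rough illustration of the rule that produces this output (drop any value that is contained in another, longer value in the same list), here is a minimal sketch in Python rather than SPL. The function name drop_covered_values is hypothetical and is not part of Splunk; it only shows the logic.&lt;/P&gt;

```python
def drop_covered_values(values):
    """Keep only values that are not contained in another value of the list."""
    # De-duplicate exact repeats while keeping first-seen order.
    unique = list(dict.fromkeys(values))
    return [
        v for v in unique
        # Drop v when some different value contains it as a substring.
        if not any(v != other and v in other for other in unique)
    ]


print(drop_covered_values(["Win SRV 2008", "Win XP", "Win XP Pro"]))
# prints ['Win SRV 2008', 'Win XP Pro']
```

&lt;P&gt;The comparison is quadratic in the number of values, which should be fine for the handful of OS strings a single host reports.&lt;/P&gt;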

&lt;P&gt;Is it legitimate for a single server to report multiple (host) OSes? That doesn't make sense to me! This is quite a bit trickier...&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jun 2015 15:28:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154004#M43277</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-11T15:28:03Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154005#M43278</link>
      <description>&lt;P&gt;Yes, you assumed correctly. I have been trying to fix this off and on since yesterday and I just can't seem to get it quite right. Maybe regex would be helpful. I tried matching groups but it's over my head. Edit: I updated the original question to include this example. &lt;/P&gt;</description>
      <pubDate>Thu, 11 Jun 2015 15:32:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154005#M43278</guid>
      <dc:creator>smlrwd</dc:creator>
      <dc:date>2015-06-11T15:32:18Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicate substrings from a multivalue field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154006#M43279</link>
      <description>&lt;P&gt;I worked out something thanks to woodcock. It may not be the prettiest, but it seems to work. (Ignore the % in place of spaces; the spaces were getting in the way earlier, so I switched them out.) I broke the list of OSes into 4 categories: Win XP, Win 7, Win Server, and Other. I created a macro that runs the search woodcock provided to single out one OS from each of the 3 Win categories, then appended them all together. From what I can tell, it works as I intended.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | mvexpand OS | eval os1=case(match(OS, "XP"), OS) | eval os2=case(match(OS, "SRV"), OS) | eval os3=case(match(OS, "Win%7"), OS) | eval os4=case( (NOT match(OS, "XP")) AND (NOT match(OS, "SRV")) AND (NOT match(OS, "Win%7")), OS) | stats values(*) as * by nt_host | fillnull value="" os1, os2, os3, os4 | `return_longest_from_mv( os1, nt_host )` | `return_longest_from_mv( os2, nt_host )` | `return_longest_from_mv( os3, nt_host )` | eval OS=mvappend(os1, os2, os3, os4)
&lt;/CODE&gt;&lt;/PRE&gt;
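
&lt;P&gt;For anyone who wants to see the idea outside of SPL, here is a minimal Python sketch of the same approach: sort each value into a family, keep the longest value per family, and pass everything else through. It assumes the macro simply keeps the longest value, uses plain spaces instead of %, and assigns each value to the first family it matches, so treat it as an approximation. The name longest_per_category is made up for this example.&lt;/P&gt;

```python
def longest_per_category(values):
    """Group OS strings by family and keep the longest string in each family."""
    # Assumed family markers, mirroring os1, os2 and os3 in the search above.
    families = ["XP", "SRV", "Win 7"]
    longest = {}
    other = []
    for value in values:
        # First family whose marker appears in the value, or None.
        family = next((f for f in families if f in value), None)
        if family is None:
            # Like os4: values that match no family are kept unchanged.
            other.append(value)
        else:
            # Keep whichever candidate is longer; ties keep the earlier one.
            longest[family] = max(longest.get(family, ""), value, key=len)
    return [longest[f] for f in families if f in longest] + other


print(longest_per_category(["Win SRV 2008", "Win XP", "Win XP Pro"]))
# prints ['Win XP Pro', 'Win SRV 2008']
```

&lt;P&gt;Unlike a pure "longest value wins" rule, this keeps one entry per family, which is what lets a server that reports both a server OS and a workstation OS keep both.&lt;/P&gt;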

&lt;P&gt;Thanks again woodcock.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jun 2015 17:44:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Removing-duplicate-substrings-from-a-multivalue-field/m-p/154006#M43279</guid>
      <dc:creator>smlrwd</dc:creator>
      <dc:date>2015-06-11T17:44:14Z</dc:date>
    </item>
  </channel>
</rss>

