Splunk Search

Removing duplicate substrings from a multivalue field?

smlrwd
Explorer

Hello everyone,

I am creating a custom asset inventory and am combining data from multiple sources. These sources don't return the same OS version and a multivalue field is created in the process. I tabled out the OS version to take a look at the different multivalue fields. Below is an example of one of the multivalue entries in the OS field.

Win XP
Win XP Pro

How would I go about matching the smaller string to the larger one and removing it?

Edit:
To clarify, I want to remove the less descriptive OS version if it matches another version already in the list. In the example above I want to remove "Win XP".

A more difficult example is:

Win XP
Win SRV 2008
Win XP Pro

With an output of

Win XP Pro
Win SRV 2008
1 Solution

smlrwd
Explorer

I worked out something thanks to woodcock. It may not be the prettiest but it seems to work. (Ignore the % instead of spaces, the spaces were getting in the way earlier so i switched them out) I broke the list of OSes into 4 categories: Win XP, Win 7, Win Server, and Other. I created a macro to perform the code woodcock provided to single out an OS from the 3 Win categories, then appended them all together. From what I can tell it works as I intended.

... | mvexpand OS | eval os1=case(match(OS, "XP"), OS) | eval os2=case(match(OS, "SRV"), OS) | eval os3=case(match(OS, "Win%7"), OS) | eval os4=case( (NOT match(OS, "XP")) AND (NOT match(OS, "SRV")) AND (NOT match(OS, "Win%7")), OS) | stats values(*) as * by nt_host | fillnull value="" os1, os2, os3, os4 | `return_longest_from_mv( os1, nt_host )` | `return_longest_from_mv( os2, nt_host )` | `return_longest_from_mv( os3, nt_host )` | eval OS=mvappend(os1, os2, os3, os4)

Thanks again woodcock.

View solution in original post

0 Karma

smlrwd
Explorer

I worked out something thanks to woodcock. It may not be the prettiest but it seems to work. (Ignore the % instead of spaces, the spaces were getting in the way earlier so i switched them out) I broke the list of OSes into 4 categories: Win XP, Win 7, Win Server, and Other. I created a macro to perform the code woodcock provided to single out an OS from the 3 Win categories, then appended them all together. From what I can tell it works as I intended.

... | mvexpand OS | eval os1=case(match(OS, "XP"), OS) | eval os2=case(match(OS, "SRV"), OS) | eval os3=case(match(OS, "Win%7"), OS) | eval os4=case( (NOT match(OS, "XP")) AND (NOT match(OS, "SRV")) AND (NOT match(OS, "Win%7")), OS) | stats values(*) as * by nt_host | fillnull value="" os1, os2, os3, os4 | `return_longest_from_mv( os1, nt_host )` | `return_longest_from_mv( os2, nt_host )` | `return_longest_from_mv( os3, nt_host )` | eval OS=mvappend(os1, os2, os3, os4)

Thanks again woodcock.

0 Karma

woodcock
Esteemed Legend

If we assume that the shortest is always wrong, and that your unique field is host and your OS field is OS, then you can do this:

... | mvexpand OS | eval lenOS=length(OS) | eventstats max(lenOS) AS maxOSlen by host | eval bestOSvalue=if((lenOS==maxOSlen),OS,null()) | stats value(*) AS * by host,bestOSvalue | fields - OS | rename bestOSvalue AS "OS"

smlrwd
Explorer

This works perfectly for workstations. The only problem is that this method doesn't work well for the servers that report multiple OSes. Any ideas on how this could be modified to work with:

Win SRV 2008
Win XP
Win XP Pro
0 Karma

woodcock
Esteemed Legend

You forgot to mention your desired output but I assume it would be:

Win SRV 2008
Win XP Pro

It is legitimate for a single server to report multiple (host) OS; that doesn't make sense to me! This is quite a but trickier...

0 Karma

smlrwd
Explorer

Yes, you assumed correctly. I have been trying to fix this off and on since yesterday and I just can't seem to get it quite right. Maybe regex would be helpful. I tried matching groups but it's over my head. Edit: I updated the original question to include this example.

0 Karma

stephanefotso
Motivator

Please let me understand. In your table above which value do you want to remove?

SGF
0 Karma

smlrwd
Explorer

I updated the original post to clarify. Thank you.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...