Splunk Search

Removing duplicate substrings from a multivalue field?

smlrwd
Explorer

Hello everyone,

I am creating a custom asset inventory and am combining data from multiple sources. These sources don't return the same OS version and a multivalue field is created in the process. I tabled out the OS version to take a look at the different multivalue fields. Below is an example of one of the multivalue entries in the OS field.

Win XP
Win XP Pro

How would I go about matching the smaller string to the larger one and removing it?

Edit:
To clarify, I want to remove the less descriptive OS version if it matches another version already in the list. In the example above I want to remove "Win XP".

A more difficult example is:

Win XP
Win SRV 2008
Win XP Pro

With an output of

Win XP Pro
Win SRV 2008
1 Solution

smlrwd
Explorer

I worked out something thanks to woodcock. It may not be the prettiest but it seems to work. (Ignore the % instead of spaces, the spaces were getting in the way earlier so i switched them out) I broke the list of OSes into 4 categories: Win XP, Win 7, Win Server, and Other. I created a macro to perform the code woodcock provided to single out an OS from the 3 Win categories, then appended them all together. From what I can tell it works as I intended.

... | mvexpand OS | eval os1=case(match(OS, "XP"), OS) | eval os2=case(match(OS, "SRV"), OS) | eval os3=case(match(OS, "Win%7"), OS) | eval os4=case( (NOT match(OS, "XP")) AND (NOT match(OS, "SRV")) AND (NOT match(OS, "Win%7")), OS) | stats values(*) as * by nt_host | fillnull value="" os1, os2, os3, os4 | `return_longest_from_mv( os1, nt_host )` | `return_longest_from_mv( os2, nt_host )` | `return_longest_from_mv( os3, nt_host )` | eval OS=mvappend(os1, os2, os3, os4)

Thanks again woodcock.

View solution in original post

0 Karma

smlrwd
Explorer

I worked out something thanks to woodcock. It may not be the prettiest but it seems to work. (Ignore the % instead of spaces, the spaces were getting in the way earlier so i switched them out) I broke the list of OSes into 4 categories: Win XP, Win 7, Win Server, and Other. I created a macro to perform the code woodcock provided to single out an OS from the 3 Win categories, then appended them all together. From what I can tell it works as I intended.

... | mvexpand OS | eval os1=case(match(OS, "XP"), OS) | eval os2=case(match(OS, "SRV"), OS) | eval os3=case(match(OS, "Win%7"), OS) | eval os4=case( (NOT match(OS, "XP")) AND (NOT match(OS, "SRV")) AND (NOT match(OS, "Win%7")), OS) | stats values(*) as * by nt_host | fillnull value="" os1, os2, os3, os4 | `return_longest_from_mv( os1, nt_host )` | `return_longest_from_mv( os2, nt_host )` | `return_longest_from_mv( os3, nt_host )` | eval OS=mvappend(os1, os2, os3, os4)

Thanks again woodcock.

0 Karma

woodcock
Esteemed Legend

If we assume that the shortest is always wrong, and that your unique field is host and your OS field is OS, then you can do this:

... | mvexpand OS | eval lenOS=length(OS) | eventstats max(lenOS) AS maxOSlen by host | eval bestOSvalue=if((lenOS==maxOSlen),OS,null()) | stats value(*) AS * by host,bestOSvalue | fields - OS | rename bestOSvalue AS "OS"

smlrwd
Explorer

This works perfectly for workstations. The only problem is that this method doesn't work well for the servers that report multiple OSes. Any ideas on how this could be modified to work with:

Win SRV 2008
Win XP
Win XP Pro
0 Karma

woodcock
Esteemed Legend

You forgot to mention your desired output but I assume it would be:

Win SRV 2008
Win XP Pro

It is legitimate for a single server to report multiple (host) OS; that doesn't make sense to me! This is quite a but trickier...

0 Karma

smlrwd
Explorer

Yes, you assumed correctly. I have been trying to fix this off and on since yesterday and I just can't seem to get it quite right. Maybe regex would be helpful. I tried matching groups but it's over my head. Edit: I updated the original question to include this example.

0 Karma

stephanefotso
Motivator

Please let me understand. In your table above which value do you want to remove?

SGF
0 Karma

smlrwd
Explorer

I updated the original post to clarify. Thank you.

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...