Based on this search, I suspect that the raw software field is not JSON. Regardless, @richgalloway's suggestion of mvexpand is sound. But you must give examples of your software values; additionall...
See more...
Based on this search, I suspect that the raw software field is not JSON. Regardless, @richgalloway's suggestion of mvexpand is sound. But you must give examples of your software values; additionally, you probably omitted max_match=0 from your real rex command. I say this because by reverse engineering (something you should not force volunteers to do for you), I see two distinct possible format that software field can take to give out the table you illustrated. Both possible formats require max_match=0, but each possible format requires a different approach to applying mvexpand. Let me illustrate. (You should have illustrated your data in this manner.) If you do | table hostname software, you probably see the following in Splunk's statistics table However, never use a screenshot to illustrate data. (Screenshot is only useful when illustrating visualization anomalies.) The same display could come from two fundamentally different values. 1. When software is a multivalue field with distinct values like "cpe:/a:vendor1:product1:version1", "cpe:/a:vendor2:product2:version2", and so on, but all in a single event. If this is the case, all you need is to apply mvexpand to software. ``` use when 'software' is multivalue ```
| mvexpand software
| rex field=software max_match=0 "cpe:\/a:(?<Vendor>[^:]+):(?<Product>[^:]+):(?<Version>.*)"
| table hostname, Vendor, Product, Version
| dedup hostname, Vendor, Product, Version 2. When software is single-value, but multiline, like cpe:/a:vendor1:product1:version1 cpe:/a:vendor2:product2:version2 cpe:/a:vendor3:product3:version3 cpe:/a:vendor4:product4:version4 In this case, you need to first split software into single-line, multivalue before mvexpand. Like this ``` use when 'software' is multiline ```
| eval software = split(software, "
")
| mvexpand software
| rex field=software max_match=0 "cpe:\/a:(?<Vendor>[^:]+):(?<Product>[^:]+):(?<Version>.*)"
| table hostname, Vendor, Product, Version
| dedup hostname, Vendor, Product, Version In both cases, you can get the exact result you illustrated in OP. But you must know which data format you have. Here are two data emulations for you to play with and compare with real data. You can attach them to their corresponding mvexpand method to see how they turn into the desired tabulation. 1. multivalue 'software' | makeresults format=csv data="hostname
hostname1"
| eval software = split("cpe:/a:vendor1:product1:version1
cpe:/a:vendor2:product2:version2
cpe:/a:vendor3:product3:version3
cpe:/a:vendor4:product4:version4","
")
| append
[makeresults format=csv data="hostname
hostname2"
| eval software = split("cpe:/a:vendor1:product2:version2
cpe:/a:vendor2:product4:version1
cpe:/a:vendor3:product3:version5
cpe:/a:vendor4:product6:version3","
")]
``` emulates multivalue 'software' ``` 2. multline 'software' | makeresults format=csv data="hostname
hostname1"
| eval software = "cpe:/a:vendor1:product1:version1
cpe:/a:vendor2:product2:version2
cpe:/a:vendor3:product3:version3
cpe:/a:vendor4:product4:version4"
| append
[makeresults format=csv data="hostname
hostname2"
| eval software = "cpe:/a:vendor1:product2:version2
cpe:/a:vendor2:product4:version1
cpe:/a:vendor3:product3:version5
cpe:/a:vendor4:product6:version3"]
``` emulates multiline 'software' ``` Of course, reverse engineering (aka mind-reading), though laborious and generally loathed by volunteers, are often incorrect. There could be some other data format that I haven't considered that will give you the undesired output after rex; it is even possible some format will give you that undesirable output without max_match=0. If so, only you can give us the real data format (anonymize as needed) to help yourself.