It looks like a join
will break multivalues. And I thought mvexpand
couldn't get any more dangerous or misleading than it already is.
This is the query:
index=wineventlog host=MYVEEAMHOST source="WinEventLog:Veeam Backup" EventCode=190 |
rex field=Message "(?i)^Backup job \'(?<snapshot>.+?)\'" | rex field=Message "(?i)^Backup job \'.*\' finished with (?<result>[^\.]+)" | rex field=snapshot "(?i)^(?<server>\d\S+)" | rex field=snapshot "(?i)^(?<datastore>ds-\S+)" |
join type=left datastore [search index=vmware-inv vm_name!="" datastores{}.name!="" earliest=-1d@d latest=@d | sort 0 - _time | dedup vm_name | mvexpand datastores{}.name | rename "datastores{}.name" AS datastore | stats values(vm_name) AS vms BY datastore] |
fillnull value="" | mvexpand vms | table server datastore vms _time snapshot Message
The subsearch inside the join, copied and pasted exactly into a separate search, works correctly and returns vms
as a mv field. The main search instead after the join returns vms
as a single string with whitespace between the values, e.g. 1ABC-ABC-DC1 1DEF-DEF-DC1 1DEF-DEF-DC2
Why...? I am not looking for a workaround as I can obviously solve with split
, I am looking for an answer.
RUN ANYWHERE CODE THAT HAS SAME ISSUE:
| makeresults | eval a="1" | join type=left a [| makeresults | eval a="1" | eval b="2" | append [| makeresults | eval a="1" | eval b="3"] | mvcombine b] | mvexpand b
So, first, if you want the left side of the join
to join to multiple records on the right side of the join, then you have to use the max=
parameter. Otherwise, the record on the left will join only to the first record that comes out of the right side.
Next, you are correct, a join will treat the multivalue records as if they are a single string.
| makeresults | eval a="1"
| join type=left a
[| makeresults | eval a="1" | eval b=mvappend("2","3") | eval binnercount=mvcount(b)]
| eval boutercount=mvcount(b)
In general, join
is one of the last join-type verbs that you go for. There are other verbs that do the same effect more efficiently. Here's an example of using stats
instead.
| makeresults | eval a="1", rectype="first"
| join type=left a
[| makeresults | eval a="1", rectype="second" | eval b=mvappend("2","3") | eval binnercount=mvcount(b) ]
| eval boutercount=mvcount(b)
| stats values(*) as * by a
Your resulting code would probably look something like this.
index=wineventlog host=MYVEEAMHOST source="WinEventLog:Veeam Backup" EventCode=190
| rex field=Message "(?i)^Backup job \'(?<snapshot>.+?)\'"
| rex field=Message "(?i)^Backup job \'.*\' finished with (?<result>[^\.]+)"
| rex field=snapshot "(?i)^(?<server>\d\S+)"
| rex field=snapshot "(?i)^(?<datastore>ds-\S+)"
| fields _time snapshot result server datastore Message
| append
[search index=vmware-inv vm_name!="" datastores{}.name!="" earliest=-1d@d latest=@d
| sort 0 - _time | dedup vm_name
| mvexpand datastores{}.name
| rename "datastores{}.name" AS datastore
| stats values(vm_name) AS vms BY datastore
| rename comment as "mark these records so we can kill them later"
| eval killme="killme"]
| rename comment as "roll the vms records over to the log messages then kill the vms records"
| eventstats values(*) as * by datastore
| where isnull(killme)
| rename comment as "expand teh vms and present the results."
| mvexpand vms
| table server datastore vms _time snapshot Message