Splunk Search

Multivalue fields with numeric values don't play nice with > and < in the search command.

sideview
SplunkTrust
SplunkTrust

We frequently have search results where for one or more numeric fields, each row might have only one value for the numeric field or the row might have a multivalued value for that numeric field.

Here is a made up search duplicating the basic situation. Paste this into your search bar.

| stats count as numeric | eval numeric=mvrange(3,8) | mvexpand numeric | eval categorical=case(numeric%3=0,"A",numeric%3=1,"B",numeric%3=2,"C") | stats values(numeric) as mvnumeric by categorical

This dummy search will give you results of

|    categorical   |    mvnumeric    |
|    A             |     3           |
|                  |     6           |                         
-------------------------------------- 
|    B             |     4           |
|                  |     7           |                         
--------------------------------------
|    C             |     5           |

Note in particular that the first two rows have multivalued values and the last row has only a regular single value.

Now say we add a search of | search mvnumeric>0 to the end to filter this set.

Since all the rows have a numeric value greater than zero, in fact since all values are greater than zero, I'd expect all rows to get returned.

However only the value with the single-value gets returned. Likewise with other terms you can see that the greater-than and less-than operators just don't work with any multivalue rows - the rows always fail to match.
Is this a known bug? Is there any magic to be worked in the search language?

mvexpand offers a sort of a workaround, but | mvexpand | search mvnumeric>4 doesn't work cause that'll throw away the other values and we need the whole picture on the final rows.

Which leaves this sort of thing and it's way too clunky to be of use here..

| mvexpand mvnumeric | eval matchesOurExpression=if(mvnumeric>N,1,0) | stats values(*) as * by <the id fields by which we were grouping before we mvexpanded> | search matchesOurExpression=1

jeffland
SplunkTrust
SplunkTrust

Just to clarify, if you have a row with multivalued field "mvnumeric" and the values of that multivalued field are 3 and 6 (just like in the first row of your example), and you do a search for "mvnumeric>3", do you want the entire row to stay because it contains a 6, or do you want to discard it because not all of your values in that row are >3?
This question is the reason you can't do a search like "search mvnumeric>3", because it can be answered either way, depending on what you want. It is not precise enough.

The good news is, you've probably heard of mvfilter (just realized who asked the question :)) It might help you, see here:

... | eval keep=mvfilter(mvnumeric>6) | where mvcount(keep)>=1

This will remove all results in which the highest available number is less than 6. Set that to 0, and you will filter out all rows which only have negative values. Or do it like this:

| eval keep=mvfilter(mvnumeric>3) | where mvcount(mvnumeric)=mvcount(keep)

This will remove any row which contains numbers ❤️ (in your data, the second row).

sideview
SplunkTrust
SplunkTrust

Yes - I realize there's remaining ambiguity on how exactly it should work. I started trying to get into that too in the question. Ultimately I think it's a false problem though. It seems like any choice retains this ambiguity, but one of the two choices actually leads to a fully consistent and flexible system. I think.

To explain -- I think it should have the behavior where mvnumeric>N matches even if only one of the values is greater than N. Then for when the user needs the opposite behavior (matching only when all of the values are greater than N), they can do NOT mvnumeric<=N and everyone gets their cake and gets to eat it too. 😃
I wouldn't put it past the Search and Indexing team at Splunk to have thought through all this years ago and I was quite surprised to find neither way implemented on multivalue numeric fields!!

Whereas if they had it default the other way I don't think the language would retain the same flexibility. also..... mvnumeric>N matching even if only one of the values is greater than N happens to be what I need here. 😃

Thanks for the mvfilter solution. It's definitely less evil than mvexpanding/painting-with-eval/stats values(*) by id1 id2 id3/filtering. still not really much practical help to me in my customer case here but thanks very much.

martin_mueller
SplunkTrust
SplunkTrust

I agree with "should match if at least one mv member matches" - that would line up with the behaviour for | search mvfield="foo", it'll match if one member matches "foo".

For the generic case with as-is SPL I don't see a better way than | where mvcount(mvfilter(mvnumeric > 0)) > 0. Any kind of eventstats-powered thing is going to be expensive over large result sets, and anything involving mvexpand is going to be even more expensive.

0 Karma

somesoni2
Revered Legend

Just for your specific criteria (mvnumeric>0), try this

| stats count as numeric | eval numeric=mvrange(3,8) | mvexpand numeric | eval categorical=case(numeric%3=0,"A",numeric%3=1,"B",numeric%3=2,"C") | stats values(numeric) as mvnumeric by categorical 
| eventstats min(mvnumeric) as min by categorical | where min>0

OR

| stats count as numeric | eval numeric=mvrange(3,8) | mvexpand numeric | eval categorical=case(numeric%3=0,"A",numeric%3=1,"B",numeric%3=2,"C") | stats values(numeric) as mvnumeric by categorical 
| where isnotnull(mvfilter(mvnumeric>0))

sideview
SplunkTrust
SplunkTrust

Actually this will almost work. We're already in a stats <> by id1 id2 id3 situation here, so with that extra assumption in hand, I can just do

| eventstats min(mvnumeric) as tempMin by id1 id2 id3 | where tempMin>0 and be done.

0 Karma
Get Updates on the Splunk Community!

Detecting Remote Code Executions With the Splunk Threat Research Team

WATCH NOWRemote code execution (RCE) vulnerabilities pose a significant risk to organizations. If exploited, ...

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...