Splunk Search

tonumber for numerical rather than lexigraphical sort in a multivalue field

chirsf
Explorer

I'm looking for a way to numerically sort a multivalue field without expanding the field, sorting and then recombining. Essentially something like the following with a dedup. Before I post an idea I wanted to see if anyone has any ideas on how to accomplish this.

Essentially I'm trying to find a way to run conversion functions from here against a multivalue field, ideally on the same eval.

 

| makeresults 
| eval line_list=split("4,3,2,1,7,2,21,9,12,19,7",",")
| eval line_list=mvsort(mvdedup(tonumber(line_list)))

 

 

Instead I have to do this:

 

| makeresults 
| eval line_list=split("4,3,2,1,7,2,21,9,12,19,7",",") 
| mvexpand line_list 
| eval line_list=tonumber(line_list)
| sort line_list 
| dedup line_list 
| mvcombine line_list 

 

 

Labels (1)

dhirendra761
Contributor

I think you already have a good solution.

Just optimized the same:

| makeresults 
| eval line_list=split("4,3,2,1,7,2,21,9,12,19,7",",") 
| mvexpand line_list 
| dedup line_list sortby line_list 
| mvcombine line_list 
| table line_list
0 Karma

TheWoodRanger
Explorer

You have to understand the behavior of a multivalue field; the MV field itself is essentially always treated as an array of strings:  ["val1", "val2", "val3"] so even in the example you noted where you mvexpand it out, running mvsort on the final product after you condense it again will sort it back to lexographic ordering.

Given that, you have to build a solution with lexographic sorting in mind.

First, I want to note that the more densely packed you get with nested evals, the more likely it is to break over time as the underlying search core engine may change. I've seen this happen, so better to split it apart into multiple evals than aim to run it all at once, since it allows Splunk to handle memory transition better that way anyways.

 

That said, here's your solution:

| makeresults | fields - _time
| eval number_list = split("4,3,2,1,7,2,21,9,12,19,7",",")
| eval number_list_ordered = mvmap( mvsort(mvmap(mvdedup(number_list), len(number_list) . "-" . number_list)), substr(number_list, 3) )

 

 

 

This will work for up to 9 character length values in a MV, and uses TWO cascading mvmap() functions to attach a number representing the length of the given value to the value itself, which makes the lexographic sorting work as we want it to.To illustrate the steps:

| makeresults | fields - _time
| eval number_list = split("4,3,2,1,7,2,21,9,12,19,7",",") 
| eval number_list_2 = mvdedup(number_list)
| eval number_list_3 = mvmap(number_list_2, len(number_list_2) . "-" . number_list_2)
| eval number_list_4 = mvsort(number_list_3)
| eval number_list_final = mvmap(number_list_4, substr(number_list_4, 3))

 

 

 

 

Make sure to upvote.

 

ITWhisperer
SplunkTrust
SplunkTrust

You could left-space-pad before sorting, then join and split to remove the space (if required) - you still end up with strings rather than numbers but it might be sufficient for you - note that I have assumed a maximum of 3 digits so you may need to adjust for your case.

 

| makeresults 
| eval line_list=split("4,3,2,1,7,2,21,9,123,12,19,7",",")
| eval line_list=split(mvjoin(mvsort(mvmap(line_list,printf("%*s%d",3-(floor(log(line_list,10)))," ",tonumber(line_list))))," ")," ")

 

Inspired by family @woodcock - check out their YouTube post https://www.youtube.com/watch?v=d0XtQcPa5zw 

0 Karma
Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...