Splunk Search

Is it possible to sort or reorder a multivalue field?

Lowell
Super Champion

Does anyone have any thoughts on how to reorder a multivalued field? Ideally I'd like to be able to do a "sort"; in my specific use case, a "reverse" would be perfect.

Say you have the following search:

my search | stats list(myfield) as myfields by id

The list() stats operator keeps all values of "myfield" from the events and preserves their order, which is what I want. However, I'd really like to see the values of "myfield" in time order (not reverse time order). I know I can stick a | reverse in there, but I'm trying to figure out whether there's a better approach that modifies only the "myfields" field and doesn't require screwing with event order.

(In my non-trivial version of this search, I'm using a transaction command as well, and it has issues when you start messing with time-order. That's just one example of why re-ordering the events is not ideal.)

1 Solution

Paolo_Prigione
Builder

Hi Lowell, I implemented the deduplication and sorting functionality in a custom command. Since your experience is far greater than mine, you won't have any problem removing the deduplication logic (and maybe suggesting improvements 😉).

Syntax:

| mvdedup [[+|-]fieldname ]*
  • with no parameter: will dedup all the multivalued fields retaining their order
  • with one or more fieldnames: will dedup those fields retaining their order
  • with one or more fieldnames prepended by a +|- (no empty space there!): will dedup and sort ascending/descending

    | mvdedup -id
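The argument convention above can be sketched in isolation. This is a minimal illustration of how the +|- prefix maps to a per-field sort order; the helper name parse_field_args is hypothetical and is not part of the command itself:

```python
def parse_field_args(args):
    """Map each field name to '+', '-' or None (None means dedup only)."""
    fields = {}
    for arg in args:
        if arg[:1] in ('+', '-'):
            # the prefix is the sort order, the rest is the field name
            fields[arg[1:]] = arg[:1]
        else:
            fields[arg] = None
    return fields

# Equivalent of: | mvdedup -id +host user
print(parse_field_args(['-id', '+host', 'user']))
```

An empty argument list yields an empty mapping here; the actual script treats that case as "dedup every field".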

Here are the configs:

commands.conf

[mvdedup]
type = python
streaming = true
maxinputs = 500000
run_in_preview = true
enableheader = true
retainsevents = true
generating = false
generates_timeorder = false
supports_multivalues = true
supports_getinfo = true

mvdedup.py

import sys
import splunk.Intersplunk as si

def uniqfy(seq, sortorder=None):
    # Drop repeated values, keeping the first occurrence of each in place.
    seen = set()
    result = []
    for item in seq:
        if item in seen:
            continue
        seen.add(item)
        result.append(item)
    # Optionally sort the unique values: '+' ascending, '-' descending.
    if sortorder == '+':
        result.sort()
    elif sortorder == '-':
        result.sort(reverse=True)
    return result

(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)

if isgetinfo:
    #outputInfo(streaming, generating, retevs, reqsop, preop, timeorder=False):
    si.outputInfo(True, False, True, False, None, False)
    sys.exit(0)


results = si.readResults(None, None, True)

fields={}

if len(sys.argv) > 1:
    for a in sys.argv[1:]:
        a=str(a)
        if a[:1] in ['+','-']:
            # set sorting order
            fields[a[1:]] = a[:1]
        else:
            # no sorting!
            fields[a] = None
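else:
    # dedup on all the fields in the data; collecting keys from every result
    # also avoids an IndexError when the result set is empty
    for r in results:
        for k in r.keys():
            fields[k] = None

for result in results:
    for key, value in result.items():
        # multivalued fields arrive as lists; single values are left untouched
        if isinstance(value, list) and key in fields:
            result[key] = uniqfy(value, fields[key])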

si.outputResults(results)
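For the original question (a plain reverse or sort with no deduplication), the per-field logic could be swapped for something like the sketch below. The function name reorder and the 'reverse' keyword are assumptions made for illustration; they are not part of the script above.

```python
def reorder(values, order):
    """Return a reordered copy of a multivalue field without deduplicating.

    order: 'reverse' flips the existing order, '+' sorts ascending,
    '-' sorts descending; anything else returns the values unchanged.
    String values are compared as strings, not as numbers.
    """
    if order == 'reverse':
        return values[::-1]
    if order in ('+', '-'):
        return sorted(values, reverse=(order == '-'))
    return list(values)

print(reorder(['c', 'a', 'b', 'a'], 'reverse'))
```

Calling reorder in place of uniqfy in the final loop would leave event order alone and change only the listed fields.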

Any suggestions are more than welcome.

Lowell
Super Champion

For anyone following along at home who hasn't looked in the docs first: there is now a builtin mvdedup eval function.

sideview
SplunkTrust

Both mvdedup and mvsort were added as evaluation functions (i.e. you can use them in eval and where) in 6.2.

| eval mvfield=mvsort(mvfield)

and

| eval mvfield=mvdedup(mvfield)
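One thing to watch when sorting: as I understand the docs, mvsort orders values lexicographically, so numeric-looking values may not come out in numeric order. The Python snippet below is only an analogy for that behaviour (it is not how Splunk implements mvsort), using made-up sample values:

```python
values = ["10", "9", "100", "2"]

# String comparison, character by character: "10" sorts before "2"
lexical = sorted(values)

# Numeric comparison, for contrast
numeric = sorted(values, key=int)

print(lexical)
print(numeric)
```

The same caveat applies to the result.sort() call in the custom command above, since it compares the field values as strings.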

sideview
SplunkTrust

I have the same need, and if I solve it by adding the "reverse" command before the stats, it reduces my search performance by almost 40%. I'm eager for another answer here that doesn't involve a custom Python command.

rizzo75
Path Finder

Thanks for this answer. It helped me answer my question - http://answers.splunk.com/answers/115137/joining-across-field-matrix
