Is it possible to sort or reorder a multivalue field?

Does anyone have thoughts on how to reorder a multivalued field? Ideally I'd like to be able to "sort" it, though for my specific use case a "reverse" would be perfect.

Say you have the following search:

my search | stats list(myfield) as myfields by id

The list() stats function keeps every value of "myfield" from the events and preserves their order, which is what I want. However, I'd really like to see the values of "myfield" in time order (not reverse time order). I know I can stick a | reverse in there, but I was hoping for a better approach that modifies only the "myfields" field and doesn't require changing the event order.

(In my non-trivial version of this search, I'm using a transaction command as well, and it has issues when you start messing with time-order. That's just one example of why re-ordering the events is not ideal.)

1 Solution

Hi Lowell, I implemented the deduplication and sorting functionality in a custom command. Since your experience is far greater than mine, you should have no trouble removing the deduplication logic (and maybe suggesting improvements 😉).

Syntax:

| mvdedup [[+|-]fieldname ]*
  • with no parameters: dedups all multivalued fields, retaining their order
  • with one or more fieldnames: dedups those fields, retaining their order
  • with one or more fieldnames prefixed by + or - (no space in between!): dedups and sorts ascending/descending

    | mvdedup -id
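The argument convention above can be sketched in isolation. This is only an illustration of how a leading + or - might be split from the field name; the helper name parse_field_args is hypothetical and not part of any Splunk API.

```python
def parse_field_args(args):
    """Map each field name to its sort order: '+', '-', or None (keep order)."""
    fields = {}
    for arg in args:
        if arg[:1] in ('+', '-'):
            # Leading sign selects ascending/descending sort for this field.
            fields[arg[1:]] = arg[:1]
        else:
            # No sign: dedup only, keep the existing order.
            fields[arg] = None
    return fields


print(parse_field_args(['-id', 'host', '+user']))
# {'id': '-', 'host': None, 'user': '+'}
```

An empty argument list yields an empty mapping, which a caller can treat as "apply to every multivalued field".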

Here are the configs:

commands.conf

[mvdedup]
type = python
streaming = true
maxinputs = 500000
run_in_preview = true
enableheader = true
retainsevents = true
generating = false
generates_timeorder = false
supports_multivalues = true
supports_getinfo = true

mvdedup.py

import sys
import splunk.Intersplunk as si


def uniqfy(seq, sortorder=None):
    """Drop duplicates from seq, keeping the first occurrence of each value.

    sortorder '+' sorts ascending, '-' sorts descending, None keeps order.
    """
    seen = set()
    result = []
    for item in seq:
        if item in seen:
            continue
        seen.add(item)
        result.append(item)
    if sortorder == '+':
        result.sort()
    elif sortorder == '-':
        result.sort(reverse=True)
    return result


(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)

if isgetinfo:
    # outputInfo(streaming, generating, retevs, reqsop, preop, timeorder=False)
    si.outputInfo(True, False, True, False, None, False)
    sys.exit(0)

results = si.readResults(None, None, True)

# Map each requested field name to its sort order ('+', '-' or None).
fields = {}
for a in sys.argv[1:]:
    a = str(a)
    if a[:1] in ('+', '-'):
        # set sorting order
        fields[a[1:]] = a[:1]
    else:
        # no sorting!
        fields[a] = None

# With no arguments, dedup every multivalued field in every result.
# (Reading the field names from results[0] failed on empty result sets
# and missed fields that are absent from the first result.)
dedup_all = not fields

for result in results:
    for key, value in list(result.items()):
        if isinstance(value, list) and (dedup_all or key in fields):
            result[key] = uniqfy(value, fields.get(key))

si.outputResults(results)
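Since the original question asked for a reverse or sort without deduplication, here is a rough sketch of what a reorder-only helper could look like. The name reorder and the 'reverse' mode are assumptions made for illustration; they are not part of the command above.

```python
def reorder(values, order):
    """Return a reordered copy of a multivalue field, keeping duplicates.

    order: 'reverse' flips the existing order, '+' sorts ascending,
    '-' sorts descending; anything else returns the values unchanged.
    """
    if order == 'reverse':
        return list(reversed(values))
    if order == '+':
        return sorted(values)
    if order == '-':
        return sorted(values, reverse=True)
    return list(values)


myfields = ['c', 'a', 'b', 'a']
print(reorder(myfields, 'reverse'))  # ['a', 'b', 'a', 'c']
print(reorder(myfields, '+'))        # ['a', 'a', 'b', 'c']
```

Because the helper returns a new list, it touches only the one field and leaves the event order alone, which is what the question was after.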

Any suggestions are more than welcome.

For anyone following along at home who hasn't looked in the docs first:

There is now a built-in mvdedup eval function.

Both mvdedup and mvsort were added as evaluation functions (i.e., you can use them in eval and where) in Splunk 6.2.

| eval mvfield=mvsort(mvfield)

and

| eval mvfield=mvdedup(mvfield)
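For readers who think in Python, the following is a loose model of what the two functions do to a list of string values. It assumes mvsort orders values lexicographically, as the Splunk documentation describes; the function names mvsort_like and mvdedup_like are made up for this sketch and are not Splunk APIs.

```python
def mvsort_like(values):
    """Sort string values lexicographically (character by character)."""
    return sorted(values)


def mvdedup_like(values):
    """Remove duplicates while keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))


values = ['9', '10', '9', '2']
print(mvsort_like(values))   # ['10', '2', '9', '9']
print(mvdedup_like(values))  # ['9', '10', '2']
```

Note that under a lexicographic sort, "10" lands before "2" and "9", so numeric-looking values may not come out in numeric order.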

I have the same need, and if I solve it by adding the "reverse" command before the stats, it reduces my search performance by almost 40%. I'm eager for another answer here that doesn't involve a custom Python command.

Thanks for this answer. It helped me answer my question - http://answers.splunk.com/answers/115137/joining-across-field-matrix
