Is it possible to assign an index value when using rex to extract multiple values for a field?

jambajuice
Communicator

Let's say I'm using rex to extract a multivalue field from an event that looks like this:

script_id(10),vuln_id(23435,123,4567,3456)

If I use rex to extract those numbers into a multivalued field and display them in a table, I'll see the vuln_id numbers listed in the order in which they were extracted. How can I assign something like an index number to each of those values, so that a table like the following is produced:

10,23435,1
10,123,2
10,4567,3
10,3456,4

Thanks.

Craig

Ron_Naken
Splunk Employee

Here's a quick "hack" to make it work. Paolo wrote a Python script that deduplicates multivalued fields and can also sort them. You can find his reference here:

http://answers.splunk.com/questions/11394/is-it-possible-to-sort-or-reorder-a-multivalue-field/11434

Use his instructions to install the mvdedup command, and use the following "hack" as your mvdedup.py to have it add an index value (as ",[idx]") to each of the multi-valued items.

NOTE: This version of mvdedup will preserve the absolute index, even if you choose to sort the multivalued field.
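
To illustrate that note outside Splunk, here is a minimal standalone sketch of the tagging step. The helper name tag_with_index is made up for this example and is not part of mvdedup; it only mirrors the idea of numbering the distinct values first and sorting afterwards. Note that the sort compares strings, not numbers.

```python
def tag_with_index(values, sortorder=None):
    """Append ",<position>" to each distinct value, then optionally sort.

    Hypothetical helper for illustration only; positions are 1-based and
    assigned before sorting, so they survive any reordering.
    """
    seen = set()
    tagged = []
    for value in values:
        if value in seen:
            continue  # skip duplicates, keep the first occurrence
        seen.add(value)
        tagged.append("%s,%d" % (value, len(tagged) + 1))
    if sortorder == "+":
        tagged.sort()  # ascending, compared as strings
    elif sortorder == "-":
        tagged.sort(reverse=True)  # descending, compared as strings
    return tagged


vuln_ids = ["23435", "123", "4567", "3456"]
print(tag_with_index(vuln_ids))       # extraction order
print(tag_with_index(vuln_ids, "+"))  # sorted, original index still attached
```

With the sample values, the unsorted call keeps 23435 as item 1, and the sorted call moves it to second place while still carrying the ",1" suffix.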

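Run the following search command against your sample data. The + prefix on vuln_id asks mvdedup for an ascending sort; pass plain vuln_id instead to keep the values in extraction order, as you listed them: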

sourcetype="mvindex"  | rex field=_raw "script_id\((?<script_id>[^\)]+)\),vuln_id\((?<vuln_id>[^\)]+)\)" | makemv delim="," vuln_id | mvdedup +vuln_id | mvexpand vuln_id | eval tmp = tostring(script_id) + "," + tostring(vuln_id) | table tmp
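
If you want to check the expected rows without a Splunk instance, the search can be roughly mimicked in plain Python. This is only an approximation under a few assumptions: the function name expand_rows is hypothetical, Python's re module needs (?P<name>...) for named groups where rex accepts (?<name>...), and this sketch neither deduplicates nor sorts.

```python
import re

# Sample event from the question.
raw = "script_id(10),vuln_id(23435,123,4567,3456)"

# Same capture groups as the rex expression, in Python named-group syntax.
pattern = re.compile(
    r"script_id\((?P<script_id>[^)]+)\),vuln_id\((?P<vuln_id>[^)]+)\)"
)


def expand_rows(line):
    """Return one "script_id,vuln_id,index" string per extracted vuln_id."""
    match = pattern.search(line)
    if match is None:
        return []
    script_id = match.group("script_id")
    # Rough equivalent of makemv delim=","
    vuln_ids = match.group("vuln_id").split(",")
    # Rough equivalent of tagging each value and then running mvexpand
    return [
        "%s,%s,%d" % (script_id, vuln_id, index)
        for index, vuln_id in enumerate(vuln_ids, start=1)
    ]


for row in expand_rows(raw):
    print(row)
```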

The modified version of Paolo's mvdedup.py:

import sys
import splunk.Intersplunk as si


def uniqfy(seq, sortorder=None):
    # Drop duplicates and append ",<index>" to each kept value, where <index>
    # is the value's 1-based position among the distinct values.
    seen = {}
    result = []
    i = 1
    for item in seq:
        if item in seen:
            continue
        seen[item] = 1
        result.append(item + "," + str(i))
        i += 1
    # Sorting happens after tagging, so each value keeps its original index.
    if sortorder == '+':
        result.sort()
    elif sortorder == '-':
        result.sort(reverse=True)
    return result


(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)

if isgetinfo:
    # outputInfo(streaming, generating, retevs, reqsop, preop, timeorder=False):
    si.outputInfo(True, False, True, False, None, False)
    sys.exit(0)

results = si.readResults(None, None, True)

# Map each requested field name to its sort order ('+', '-', or None).
fields = {}

if len(sys.argv) > 1:
    for a in sys.argv[1:]:
        a = str(a)
        if a[:1] in ['+', '-']:
            # set sorting order
            fields[a[1:]] = a[:1]
        else:
            # no sorting!
            fields[a] = None
elif results:
    # dedup on all the fields in the data (skipped when there are no results)
    for k in results[0].keys():
        fields[k] = None

for i in range(len(results)):
    for key in results[i].keys():
        # Only list values (multivalued fields) are processed.
        if isinstance(results[i][key], list):
            if key in fields:
                results[i][key] = uniqfy(results[i][key], fields[key])

si.outputResults(results)