Splunk Search

How can I identify the longest string in a multivalued field?

responsys_cm
Builder

I'm trying to make the Linux audit daemon data play nice. One of the challenges is that a particular action can trigger anywhere from one event to half a dozen (all with the same event ID, but each with their own line).

After using transaction to group those events into a single transaction, I want to figure out which command the user actually ran. The cmd field might then contain something like this:

/bin/bash
/bin/bash service
/bin/bash service networking restart

If that was the content of a multivalued field, what is the best approach for filtering out the first two values?

Thanks.

Craig

gabriel_vasseur
Contributor

Two solutions.

The first one uses only commands that should be available in older versions of Splunk:

 

| makeresults 
| eval test=split("abc,defgh,a,asdfasdfasdfasdf,igasfasd", ",")
| eval other_important_field="blah"
| mvexpand test
| eval length=len(test)
| eventstats max(length) as max_length, min(length) as min_length
| eval longest=if(length==max_length, test, null() ),  shortest=if(length==min_length, test, null() )
| stats values(longest) as longest,  values(shortest) as shortest,  values(test) as test by _time other_important_field

 

 

If, like me, you don't like the idea of using mvexpand (for instance because your multivalue field can be empty in some cases), you can use this alternative. It uses the newish mvmap eval function to prefix each value with its zero-padded length, and then the min/max eval functions, which compare strings in alphabetical order. The zero-padding done by printf ensures that alphabetical order and numerical order agree.

 

| makeresults 
| eval test=split("abc,defgh,a,asdfasdfasdfasdf,igasfasd", ",")
| eval test2=mvmap(test, printf("%05d", len(test) ) . " - " . test)
| eval shortest=min(test2), longest=max(test2)
| eval shortest=replace(shortest, "^\d+ - ", "" ), longest=replace(longest, "^\d+ - ", "" )
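
Applied to the values from the question, the same zero-padding idea might look like the sketch below. It swaps min/max for mvsort and mvindex, which is a substitution for illustration rather than part of the search above, and the field names padded and longest_cmd are made up. In a real search, the makeresults and split lines would be replaced by the output of your transaction.

```spl
| makeresults
| eval cmd=split("/bin/bash,/bin/bash service,/bin/bash service networking restart", ",")
| eval padded=mvmap(cmd, printf("%05d", len(cmd)) . " - " . cmd)
| eval longest_cmd=replace(mvindex(mvsort(padded), -1), "^\d+ - ", "")
| table cmd longest_cmd
```

The three values are 9, 17 and 36 characters long, so after the lexical sort the last entry is the one prefixed with 00036, and longest_cmd should come out as /bin/bash service networking restart.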

Hope this helps.

kristian_kolb
Ultra Champion

You can do it without using transaction at all, by using the len() function of eval:

sourcetype=auditd | eval cmdsize=len(cmd) | sort -cmdsize | dedup eventID  | table eventID cmd uid _time whatever

I have not tested it (no Splunk in front of me right now), but it should work. First you calculate the length of the cmd field in each event, then sort the events in descending order of that length, then keep only the first event seen for each eventID (which should have the highest value of cmdsize for that eventID). Table the results as you need/want. Note that sort returns at most 10,000 results by default; use sort 0 -cmdsize if you have more events than that.
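
A variant that avoids sorting the whole result set is to compute the maximum length per eventID with eventstats and keep only the events that match it. This is an untested sketch that assumes the same eventID, cmd and uid fields as the search above; max_cmdsize is a made-up field name.

```spl
sourcetype=auditd
| eval cmdsize=len(cmd)
| eventstats max(cmdsize) as max_cmdsize by eventID
| where cmdsize == max_cmdsize
| dedup eventID
| table eventID cmd uid _time
```

The dedup is only there for ties: if two events for the same eventID carry equally long cmd values, it keeps one of them.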

Hope this helps,

K
