Splunk Search

breaking multiple mv fields into single events based on array index

pmdba
Builder

I have data that looks something like this, coming in as JSON:

time, application, feature, username, hostname

The problem is that username and hostname sit inside a nested array of objects, like this:

 

 

{
   application: app1
   feature: feature1
   timestamp: 01/29/2025 23:02:00 +0000
   users: [ 
     { 
       userhost: client1
       username: user1
     }
     { 
       userhost: client2
       username: user2
     }
   ]
}

 

 

When the event shows up in Splunk, userhost and username are converted to multivalue fields:

_time                application  feature   users{}.username  users{}.userhost
01/29/2025 23:02:00  app1         feature1  user1             client1
                                            user2             client2

 

I need an SPL method to convert these into individual events for the purposes of a search, so that I can perform LDAP lookups on each hostname. mvexpand only works on one field at a time and doesn't recognize users or users{} as valid input, so expanding each field separately loses the relationship between user1:client1 and user2:client2. How can I convert both arrays to individual events by array index, so that I preserve the relationship between username and hostname, like this:

_time                application  feature   users{}.username  users{}.userhost
01/29/2025 23:02:00  app1         feature1  user1             client1
01/29/2025 23:02:00  app1         feature1  user2             client2

yuanliu
SplunkTrust

First, when illustrating structured data, please post compliant raw text. In your case, compliant JSON would be:

 

{
   "application": "app1",
   "feature": "feature1",
   "timestamp": "01/29/2025 23:02:00 +0000",
   "users": [ 
     { 
       "userhost": "client1",
       "username": "user1"
     },
     { 
       "userhost": "client2",
       "username": "user2"
     }
   ]
}

 

The trick here is to ignore Splunk's default flattening of the array. Instead, reach into the JSON array with spath to pull out each element as a whole, run mvexpand on those elements, then parse each one:

 

| spath path=users{}
| mvexpand users{}
| spath input=users{}

 

Your sample data will give:

application  feature   timestamp                  userhost  username  users{}
app1         feature1  01/29/2025 23:02:00 +0000  client1   user1     { "userhost": "client1", "username": "user1" }
app1         feature1  01/29/2025 23:02:00 +0000  client2   user2     { "userhost": "client2", "username": "user2" }

Here is an emulation you can play with and compare with real data:

 

| makeresults
| eval _raw = "{
   \"application\": \"app1\",
   \"feature\": \"feature1\",
   \"timestamp\": \"01/29/2025 23:02:00 +0000\",
   \"users\": [ 
     { 
       \"userhost\": \"client1\",
       \"username\": \"user1\"
     },
     { 
       \"userhost\": \"client2\",
       \"username\": \"user2\"
     }
   ]
}"
| spath
``` data emulation above ```
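For convenience, here is a sketch of the emulation and the three-command solution joined into a single search that can be pasted into the search bar as-is. The JSON is collapsed onto one line only to keep it short. The final table command and its column order are my own addition to match the layout asked for in the question; they are not part of the original answer.

```spl
| makeresults
| eval _raw = "{\"application\": \"app1\", \"feature\": \"feature1\", \"timestamp\": \"01/29/2025 23:02:00 +0000\", \"users\": [{\"userhost\": \"client1\", \"username\": \"user1\"}, {\"userhost\": \"client2\", \"username\": \"user2\"}]}"
| spath
| spath path=users{} ``` pull each array element out as a JSON string (multivalue) ```
| mvexpand users{} ``` one event per array element ```
| spath input=users{} ``` parse userhost and username out of that single element ```
| table application feature timestamp username userhost ``` assumed output layout, adjust as needed ```
```

Because each expanded event carries exactly one array element, username and userhost come from the same object and cannot drift out of step.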

 

 


bowesmana
SplunkTrust

If you want to do this at search time, create a composite field and expand that, e.g.

| eval composite_field=mvzip('users{}.userhost', 'users{}.username', "###")
| fields - users{}*
| mvexpand composite_field
| rex field=composite_field "(?<userhost>.*)###(?<username>.*)"
| fields - composite_field

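Note that this only zips correctly if both MV fields contain exactly the same number of values. If any object in the array is missing userhost or username, the flattened fields have different lengths and the pairs no longer line up.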
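One possible guard, offered as a sketch rather than part of the original answer, is to compare the value counts with mvcount before zipping and keep only events where they match. It has to run before the users{}* fields are removed.

```spl
| where mvcount('users{}.userhost') = mvcount('users{}.username') ``` keep only events whose two MV fields have the same number of values ```
| eval composite_field=mvzip('users{}.userhost', 'users{}.username', "###")
```

Matching counts are a necessary check, not a proof of alignment: if one object lacks userhost and another lacks username, the counts still agree while the pairs are wrong. Where that can happen, expanding the array elements themselves with spath is the safer route.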

 
