Splunk Search

breaking multiple mv fields into single events based on array index

pmdba
Builder

I have data that looks something like this, coming in as JSON:

time, application, feature, username, hostname

The problem is that username and hostname are nested inside an array of objects, like this:

 

 

{
   application: app1
   feature: feature1
   timestamp: 01/29/2025 23:02:00 +0000
   users: [ 
     { 
       userhost: client1
       username: user1
     }
     { 
       userhost: client2
       username: user2
     }
   ]
}

 

 

When the event shows up in Splunk, userhost and username are converted to multivalue fields:

_time                application  feature   users{}.username  users{}.userhost
01/29/2025 23:02:00  app1         feature1  user1             client1
                                            user2             client2

 

I need an SPL method to convert these into individual events for the purposes of a search, so that I can perform LDAP lookups on each hostname. mvexpand only works on one field at a time and doesn't recognize users or users{} as valid input, so expanding the fields separately loses the relationship between user1:client1 and user2:client2. How can I convert both arrays to individual events by array index, so that I preserve the relationship between username and hostname, like this:

_time                application  feature   users{}.username  users{}.userhost
01/29/2025 23:02:00  app1         feature1  user1             client1
01/29/2025 23:02:00  app1         feature1  user2             client2

yuanliu
SplunkTrust

First, when illustrating structured data, please post compliant raw text. In your case, compliant JSON would be:

 

{
   "application": "app1",
   "feature": "feature1",
   "timestamp": "01/29/2025 23:02:00 +0000",
   "users": [ 
     { 
       "userhost": "client1",
       "username": "user1"
     },
     { 
       "userhost": "client2",
       "username": "user2"
     }
   ]
}

 

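The trick is to ignore Splunk's default flattening of the array and reach into the JSON array itself: extract each array element as a whole object with spath, run mvexpand on those objects, then run spath again on each expanded object so that userhost and username stay paired.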

 

| spath path=users{}
| mvexpand users{}
| spath input=users{}

 

With your sample data, this gives:

application  feature   timestamp                  userhost  username  users{}
app1         feature1  01/29/2025 23:02:00 +0000  client1   user1     { "userhost": "client1", "username": "user1" }
app1         feature1  01/29/2025 23:02:00 +0000  client2   user2     { "userhost": "client2", "username": "user2" }

Here is an emulation for you to play with and compare with real data:

 

| makeresults
| eval _raw = "{
   \"application\": \"app1\",
   \"feature\": \"feature1\",
   \"timestamp\": \"01/29/2025 23:02:00 +0000\",
   \"users\": [ 
     { 
       \"userhost\": \"client1\",
       \"username\": \"user1\"
     },
     { 
       \"userhost\": \"client2\",
       \"username\": \"user2\"
     }
   ]
}"
| spath
``` data emulation above ```
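To try the approach end to end, the three commands can be appended to the emulation. This is a minimal sketch that assumes the sample event shown earlier. The closing table command is an addition for display only and is not part of the technique. It should produce the same two rows as the result table in this answer, one per element of the users array.

```spl
| makeresults
| eval _raw = "{
   \"application\": \"app1\",
   \"feature\": \"feature1\",
   \"timestamp\": \"01/29/2025 23:02:00 +0000\",
   \"users\": [
     { \"userhost\": \"client1\", \"username\": \"user1\" },
     { \"userhost\": \"client2\", \"username\": \"user2\" }
   ]
}"
| spath
| spath path=users{}
| mvexpand users{}
| spath input=users{}
| table application feature timestamp username userhost
```

Because each array element is expanded as a whole JSON object before its fields are extracted, username and userhost come from the same object and cannot drift out of alignment.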

 

 


bowesmana
SplunkTrust

If you want to do this at search time, create a composite field and expand that, e.g.

| eval composite_field=mvzip('users{}.userhost', 'users{}.username', "###")
| fields - users{}*
| mvexpand composite_field
| rex field=composite_field "(?<userhost>.*)###(?<username>.*)"
| fields - composite_field

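This only zips correctly if every MV field has exactly the same number of elements. mvzip pairs values by position, so if one array element lacks a field, later values shift and pair with the wrong partner.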
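A variant of the same idea expands on an explicit array index instead of a zipped string. This is a minimal sketch assuming the sample data from this thread. The field name idx and the output names username and userhost are illustrative choices, and the makeresults preamble is only there to emulate the event.

```spl
| makeresults
| eval _raw = "{\"application\": \"app1\", \"feature\": \"feature1\", \"users\": [{\"userhost\": \"client1\", \"username\": \"user1\"}, {\"userhost\": \"client2\", \"username\": \"user2\"}]}"
| spath
| eval idx = mvrange(0, mvcount('users{}.username'))
| mvexpand idx
| eval username = mvindex('users{}.username', idx), userhost = mvindex('users{}.userhost', idx)
| table application feature username userhost
```

mvrange builds one zero-based index per username, mvexpand creates one event per index, and mvindex picks the value at that position from each multivalue field. This avoids choosing a delimiter and a rex, but it carries the same caveat as mvzip: the pairing is purely positional, so it is only reliable when both fields have the same number of values.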

 
