Splunk Enterprise

Splitting a large multivalue field into multiple multivalue fields based on character count?

mjones414
Contributor

I've run into a scenario where, when running stats over an index, it's possible I can generate a multivalue field with over 11K unique 38-character GUID values, but it can also be as small as a single 38-character GUID. I need to pass those resulting GUIDs as a string into something that has a character length limit of 999 characters.
Is there a way, incrementing by 38 characters at a time, to split the field into multiple fields of up to 950 characters each (which should be 25 GUIDs per field), dynamically, since I won't know how many GUIDs are going to come in at any given time?


bowesmana
SplunkTrust

Interesting ...

I'm not sure there's a simple way to do this - but someone will no doubt come up with a one-liner.

Here's a stab at an example

| makeresults count=10
``` Generate a random number of 38 character random guids ```
| eval joint_id=random()
| eval count=mvrange(1, random() % 500, 1)
| mvexpand count
| eval guid=printf("%08x-%08x-%08x-%08x%03x", random(), random(), random(), random(), random() % 2048)
``` Now collect these guids back to their common owner ```
| stats values(guid) as guids by joint_id
| eval guid_count=mvcount(guids), len=guid_count * 38
``` Now a nested foreach will run the inner loop up to 21 times, once
    per group of 25 guids, and create a block_X field, where X is the
    incrementing group number ```
| eval s=0, e=24, b=0
| foreach guids [ foreach 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 [ eval block_{b}=mvindex(<<FIELD>>, s, e), b=b+1, s=e+1, e=e+25 ] ]
``` Now combine all the block_* fields into a single blocks field where
    each 38*25 sequence is joined together as a string ```
| foreach block_* [ eval blocks=mvappend(blocks, mvjoin(<<FIELD>>,"")) ]
| fields - block_*

If you wanted to cater for 11K GUIDs, the inner foreach would need 11000/25 = 440 iterations.

 

The other possible alternative I can see is to use streamstats count and then make 'groups' of these guids, 25 per group. Depending on how you are collecting these guids into their multivalue equivalent, this may work for you. Assuming you already have the aggregated guids, you would do

| makeresults count=10
``` Generate a random number of 38 character random guids ```
| eval joint_id=random()
| eval count=mvrange(1, random() % 10000, 1)
| mvexpand count
| eval guid=printf("%08x-%08x-%08x-%08x%03x", random(), random(), random(), random(), random() % 2048)
| streamstats count by joint_id
| eval group=floor((count-1)/25)
``` Now collect these guids back to their common owner ```
| stats values(guid) as guids by joint_id group
| eval guid_count=mvcount(guids)

In your case, you would need to mvexpand the guids, use streamstats to count by some common ID, create the group as above, then re-stats to aggregate again by group. Each row then has up to 25 guids.
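Putting those steps together, here's a rough sketch, assuming your aggregated multivalue field is called guids and your common ID is joint_id (adjust the field names to match your data):

| mvexpand guids
| streamstats count by joint_id
| eval group=floor((count-1)/25)
``` Each group now holds at most 25 guids ```
| stats values(guids) as guids by joint_id group
``` Join each group into a single string of at most 25 * 38 = 950 characters ```
| eval guid_string=mvjoin(guids, "")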

Hopefully this gets you somewhere
