Splunk Search

Accumulate values for a multi value field by key

unbelievable_ma
Explorer

Hi,

Let's say I can get this table using some Splunk query.

idstages
1

key1,100

key2,200

key3,300 

2

key1,50

key2,150

key3,250

3 

key1,150

key2,250

key3,350

 

Given this data I want the result, that is I want to reduce (average) over the keys.

keyavg
key1100
key2200
key3300

 

I tried to use mvexpand for this but Splunk runs out of memory and the results get truncated. So I want something more like a reduce function that can accumulate this mv field by key. Is this possible to do through a splunk query?

Here is what I have tried:

 

`data_source`
    | fields id, stages{}.name as stage_name, somejson{}.duration as stage_duration
    | eval stages=mvzip(stage_name, stage_duration)
    | eval stages=mvfilter(match(stages, "key*"))
    | mvexpand stages
    | eval stages=split(stages, ",")
    | eval stage_name=mvindex(stages,0)
    | eval stage_duration=mvindex(stages,1)
    | stats avg(stage_duration) by stage_name

 

I want to do something more efficient than `mvexpand stages` that helps me do the reduction without blowing up memory.

Labels (3)
Tags (2)
0 Karma
1 Solution

unbelievable_ma
Explorer

Unfortunately I can't do that. I don't have a lot of different keys. Around 10-12.
I ended up solving this by splitting the key manually into separate columns and then aggregating without using mvexpand. Something like:

| fields name, value
| eval stages=mvzip(name, value)
-- the sort helps here to make sure the keys appear always in the same order
| eval stages=mvsort(mvfilter(match(stages, "key-*"))
| eval key0=mvindex(stages,0)
| rex field=key0 "(?<name>.+),(?<key0>.+)"
| eval key1=mvindex(stages,0)
| rex field=key1 "(?<name>.+),(?<key1>.+)"
-- similarly for all different keys
| timechart span=1h
    median(key1) as key1
    median(key2) as key2
    -- other keys here

This expands to exactly the same amount of data as before but no memory issues.

View solution in original post

0 Karma

thambisetty
Super Champion

| eventstats count

| mvexpand stages

| rex field=stages “(?<key>[^,]+),(?<value>\d+)”

| stats values(count) as count sum(value) as value by key

| eval avg=value/count

| table key avg

Note: replace double quotes in rex command with double quotes from your keyboard.

not tested also.

————————————
If this helps, give a like below.
0 Karma

unbelievable_ma
Explorer

`mvexpand stages` blows up memory unfortunately.

0 Karma

thambisetty
Super Champion

Yes. If you are applying on raw events. Try to apply after transforming commands like top,chart, stats, timechart. This will reduce number of events.

————————————
If this helps, give a like below.
0 Karma

unbelievable_ma
Explorer

Unfortunately I can't do that. I don't have a lot of different keys. Around 10-12.
I ended up solving this by splitting the key manually into separate columns and then aggregating without using mvexpand. Something like:

| fields name, value
| eval stages=mvzip(name, value)
-- the sort helps here to make sure the keys appear always in the same order
| eval stages=mvsort(mvfilter(match(stages, "key-*"))
| eval key0=mvindex(stages,0)
| rex field=key0 "(?<name>.+),(?<key0>.+)"
| eval key1=mvindex(stages,0)
| rex field=key1 "(?<name>.+),(?<key1>.+)"
-- similarly for all different keys
| timechart span=1h
    median(key1) as key1
    median(key2) as key2
    -- other keys here

This expands to exactly the same amount of data as before but no memory issues.

0 Karma
Get Updates on the Splunk Community!

Security Highlights: September 2022 Newsletter

 September 2022 The Splunk App for Fraud Analytics (SFA) is now Splunk SupportedUse your existing Splunk ...

Platform Highlights | September 2022 Newsletter

 September 2022 What’s New in 9.0 and How to UpgradeGet a walk through of what is new Splunk Enterprise 9.0 ...

Observability Highlights | September 2022 Newsletter

 September 2022 Splunk Observability SuiteAccess to "Classic" SignalFx Interface Will be Removed on Sept 30, ...