Splunk Search

Accumulate values for a multi value field by key

unbelievable_ma
Explorer

Hi,

Let's say I can get this table using some Splunk query.

idstages
1

key1,100

key2,200

key3,300 

2

key1,50

key2,150

key3,250

3 

key1,150

key2,250

key3,350

 

Given this data I want the result, that is I want to reduce (average) over the keys.

keyavg
key1100
key2200
key3300

 

I tried to use mvexpand for this but Splunk runs out of memory and the results get truncated. So I want something more like a reduce function that can accumulate this mv field by key. Is this possible to do through a splunk query?

Here is what I have tried:

 

`data_source`
    | fields id, stages{}.name as stage_name, somejson{}.duration as stage_duration
    | eval stages=mvzip(stage_name, stage_duration)
    | eval stages=mvfilter(match(stages, "key*"))
    | mvexpand stages
    | eval stages=split(stages, ",")
    | eval stage_name=mvindex(stages,0)
    | eval stage_duration=mvindex(stages,1)
    | stats avg(stage_duration) by stage_name

 

I want to do something more efficient than `mvexpand stages` that helps me do the reduction without blowing up memory.

Labels (3)
Tags (2)
0 Karma
1 Solution

unbelievable_ma
Explorer

Unfortunately I can't do that. I don't have a lot of different keys. Around 10-12.
I ended up solving this by splitting the key manually into separate columns and then aggregating without using mvexpand. Something like:

| fields name, value
| eval stages=mvzip(name, value)
-- the sort helps here to make sure the keys appear always in the same order
| eval stages=mvsort(mvfilter(match(stages, "key-*"))
| eval key0=mvindex(stages,0)
| rex field=key0 "(?<name>.+),(?<key0>.+)"
| eval key1=mvindex(stages,0)
| rex field=key1 "(?<name>.+),(?<key1>.+)"
-- similarly for all different keys
| timechart span=1h
    median(key1) as key1
    median(key2) as key2
    -- other keys here

This expands to exactly the same amount of data as before but no memory issues.

View solution in original post

0 Karma

thambisetty
Super Champion

| eventstats count

| mvexpand stages

| rex field=stages “(?<key>[^,]+),(?<value>\d+)”

| stats values(count) as count sum(value) as value by key

| eval avg=value/count

| table key avg

Note: replace double quotes in rex command with double quotes from your keyboard.

not tested also.

————————————
If this helps, give a like below.
0 Karma

unbelievable_ma
Explorer

`mvexpand stages` blows up memory unfortunately.

0 Karma

thambisetty
Super Champion

Yes. If you are applying on raw events. Try to apply after transforming commands like top,chart, stats, timechart. This will reduce number of events.

————————————
If this helps, give a like below.
0 Karma

unbelievable_ma
Explorer

Unfortunately I can't do that. I don't have a lot of different keys. Around 10-12.
I ended up solving this by splitting the key manually into separate columns and then aggregating without using mvexpand. Something like:

| fields name, value
| eval stages=mvzip(name, value)
-- the sort helps here to make sure the keys appear always in the same order
| eval stages=mvsort(mvfilter(match(stages, "key-*"))
| eval key0=mvindex(stages,0)
| rex field=key0 "(?<name>.+),(?<key0>.+)"
| eval key1=mvindex(stages,0)
| rex field=key1 "(?<name>.+),(?<key1>.+)"
-- similarly for all different keys
| timechart span=1h
    median(key1) as key1
    median(key2) as key2
    -- other keys here

This expands to exactly the same amount of data as before but no memory issues.

View solution in original post

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!