Activity Feed
- Posted Re: Accumulate values for a multi value field by key on Splunk Search. 08-27-2020 01:41 PM
- Posted Re: Accumulate values for a multi value field by key on Splunk Search. 08-26-2020 01:07 PM
- Posted Accumulate values for a multi value field by key on Splunk Search. 08-26-2020 10:30 AM
- Tagged Accumulate values for a multi value field by key on Splunk Search. 08-26-2020 10:30 AM
- Posted Re: mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-26-2020 09:33 AM
- Posted Re: mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-26-2020 04:44 AM
- Posted Re: mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-25-2020 11:52 PM
- Tagged mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-25-2020 02:33 PM
- Tagged mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-25-2020 02:31 PM
- Posted mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-25-2020 02:29 PM
- Tagged mvfilter before using mvexpand to reduce memory usage on Splunk Search. 08-25-2020 02:29 PM
Topics I've Started
08-27-2020
01:41 PM
Unfortunately I can't do that. I don't have a lot of different keys, around 10-12. I ended up solving this by splitting the keys manually into separate columns and then aggregating without using mvexpand. Something like:

```
| fields name, value
| eval stages=mvzip(name, value)
-- the sort helps here to make sure the keys always appear in the same order
| eval stages=mvsort(mvfilter(match(stages, "key-*")))
| eval key0=mvindex(stages,0)
| rex field=key0 "(?<name>.+),(?<key0>.+)"
| eval key1=mvindex(stages,1)
| rex field=key1 "(?<name>.+),(?<key1>.+)"
-- similarly for all the different keys
| timechart span=1h median(key1) as key1 median(key2) as key2
-- other keys here
```

(The `--` lines are annotations, not SPL.) This works on exactly the same amount of data as before, but with no memory issues.
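For what it's worth, here is the same split-then-aggregate idea sketched in plain Python (not SPL), just to show the logic; the names `split_by_key`, `names`, and `values` are invented for the sketch:

```python
def split_by_key(names, values):
    """One column per key, like the mvzip/mvsort/mvindex chain above.

    Sorting the zipped pairs guarantees each key always lands at the
    same index, so index-based extraction is safe.
    """
    stages = sorted(zip(names, values))            # mvzip + mvsort
    return {f"key{i}": v for i, (_, v) in enumerate(stages)}

row = split_by_key(["key1", "key0"], [200, 100])
# row == {"key0": 100, "key1": 200}
```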
08-26-2020
01:07 PM
`mvexpand stages` blows up memory unfortunately.
08-26-2020
10:30 AM
Hi,

Let's say I can get this table using some Splunk query:

```
id  stages
1   key1,100 key2,200 key3,300
2   key1,50  key2,150 key3,250
3   key1,150 key2,250 key3,350
```

Given this data, I want to reduce (average) over the keys and get:

```
key   avg
key1  100
key2  200
key3  300
```

I tried to use mvexpand for this, but Splunk runs out of memory and the results get truncated. So I want something more like a reduce function that can accumulate this multivalue field by key. Is this possible with a Splunk query? Here is what I have tried (note: `fields` can't rename, so the rename is a separate step):

```
`data_source`
| fields id, stages{}.name, somejson{}.duration
| rename stages{}.name as stage_name, somejson{}.duration as stage_duration
| eval stages=mvzip(stage_name, stage_duration)
| eval stages=mvfilter(match(stages, "key*"))
| mvexpand stages
| eval stages=split(stages, ",")
| eval stage_name=mvindex(stages,0)
| eval stage_duration=mvindex(stages,1)
| stats avg(stage_duration) by stage_name
```

I want to do something more efficient than `mvexpand stages` that does the reduction without blowing up memory.
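The reduce I'm describing looks roughly like this in plain Python (not SPL): a single pass that accumulates sum and count per key, never materialising one row per multivalue entry. The data is the sample table above; `totals` and `avgs` are names invented for the sketch:

```python
from collections import defaultdict

# One multivalue "stages" field per row, as in the sample table.
rows = [
    ["key1,100", "key2,200", "key3,300"],
    ["key1,50",  "key2,150", "key3,250"],
    ["key1,150", "key2,250", "key3,350"],
]

totals = defaultdict(lambda: [0.0, 0])   # key -> [running sum, count]
for stages in rows:
    for entry in stages:
        key, value = entry.split(",")
        totals[key][0] += float(value)
        totals[key][1] += 1

avgs = {k: s / n for k, (s, n) in totals.items()}
# avgs == {"key1": 100.0, "key2": 200.0, "key3": 300.0}
```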
- Tags:
- efficiency
- mvexpand
Labels:
- field extraction
- fields
- stats
08-26-2020
09:33 AM
No, _time is not unique, because multiple values exist within the same event (hence the mvexpand), so the results are not correct. Let me give another example. Let's say I can get this table:

```
id  mv_field
1   key1,100 key2,200 key3,300
2   key1,100 key2,200 key3,300
3   key1,100 key2,200 key3,300
```

Given this, I want the result:

```
key   sum
key1  300
key2  600
key3  900
```

The important part here is that the second column is a multivalue field. mvexpand blows up the memory usage there, so I need some other way to accumulate the results. Maybe I will post this as a separate question, because it is perhaps simpler to explain.

Update: mvfilter didn't help with the memory. I found a solution that I added here: https://community.splunk.com/t5/Splunk-Search/Accumulate-values-for-a-multi-value-field-by-key/m-p/516577#M145195
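In plain Python (not SPL), the accumulation I mean is just this one-pass sum by key over the sample table above; `sums` is a name invented for the sketch:

```python
from collections import defaultdict

# Three identical rows, each carrying one multivalue field.
rows = [["key1,100", "key2,200", "key3,300"]] * 3

sums = defaultdict(float)                # key -> running sum
for mv_field in rows:
    for entry in mv_field:
        key, value = entry.split(",")
        sums[key] += float(value)
# sums == {"key1": 300.0, "key2": 600.0, "key3": 900.0}
```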
08-26-2020
04:44 AM
Well, it doesn't really do what I stated in the problem. Perhaps you could explain this part?

> stats values(_time) as _time by stages
08-25-2020
11:52 PM
Sorry I don't get it. Could you expand on this a bit? Thanks!
08-25-2020
02:29 PM
Hi, I have some documents that look like this:

```
{
  "document_id": "some-id",
  "status": "some-status",
  "fields": "some values",
  "stages": [
    {
      "duration": 0.031,
      "name": "my_name",
      "more_fields": "more_values",
      "array_field": [...],
    },
    ...
  ]
}
```

The stages array can be quite large. I would like to calculate the avg or median duration for each type of stage, but not for all stage types. Here is what I have initially:

```
data_source
| fields status, stages{}.name, stages{}.duration
| eval stage_fields=mvzip('stages{}.name', 'stages{}.duration')
| where status in ("some-status")
| mvexpand stage_fields
| fields stage_fields
| rex field=stage_fields "(?<stage_name>.+),(?<stage_duration>.+)"
| where stage_name in ("my_name", "other_name")
| timechart span=1h median(stage_duration) as "Median Stage Duration" by stage_name
| rename stage_name as "Stage Name"
```

This starts truncating results because mvexpand expands into a huge number of rows and Splunk complains about memory limits. I tried to put an mvfilter before it so that it only expands the stages I am interested in, but I clearly didn't use it correctly, so that ended up as a no-op. So the question is: how can I make this query more efficient? Thanks!
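The filter-before-expand idea I was aiming at, sketched in plain Python (not SPL): drop the uninteresting stages *before* exploding to one record per stage, so the expanded list stays small. The stage names come from the query above; the event data and the `wanted` name are invented for the sketch:

```python
# Stages whose durations we actually want to aggregate.
wanted = {"my_name", "other_name"}

event = {"stages": [
    {"name": "my_name",    "duration": 0.031},
    {"name": "noise",      "duration": 9.999},
    {"name": "other_name", "duration": 0.012},
]}

# Filter first (the mvfilter step), then expand (the mvexpand step):
# only the surviving stages become separate records.
expanded = [(s["name"], s["duration"])
            for s in event["stages"] if s["name"] in wanted]
# expanded == [("my_name", 0.031), ("other_name", 0.012)]
```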