Splunk Search

mvfilter before using mvexpand to reduce memory usage

unbelievable_ma
Explorer

Hi,
I have some documents that looks like this:

 

 

{
  "document_id": "some-id",
  "status": "some-status",
  "fields": "some values",
  "stages": [
    {
      "duration": 0.031,
      "name": "my_name",
      "more_fields": "more_values",
      "array_field": [...],
    },
    ...
  ]
}

 

 

The length of the stages field can be quite large. I would like to calculate the avg or median duration for each type of stage but not for all stage types. Here is what I have initially:

 

 

data_source
    | fields status, stages{}.name as sname, stages{}.duration
    | eval stage_fields=mvzip('stages{}.name', 'stages{}.duration')
    | where job_result in ("some-status")
    | mvexpand stage_fields
    | fields stage_fields
    | rex field=stage_fields "(?<stage_name>.+),(?<stage_duration>.+)"
    | where stage_name in ("my_name", "other_name")
    | timechart span=1h median(stage_duration) as "Median Stage Duration" by stage_name
    | rename stage_name as "Stage Name"

 

 

 This obviously starts truncating results because mvexpand starts expanding into a huge number of fields and complains about memory limits. I tried to put an mvfilter before it so that it only expands those stages that I am interested in but obviously I didn't know how to use it so that ended up as a no op.

So the question is how can I make this query more efficient?
Thanks!

0 Karma

to4kawa
Ultra Champion
 
0 Karma

unbelievable_ma
Explorer

Sorry I don't get it. Could you expand on this a bit?

Thanks!

0 Karma

to4kawa
Ultra Champion
 
0 Karma

unbelievable_ma
Explorer

Well it doesn't really do what I stated in the problem.

Perhaps you could say explains this part?
> stats values(_time) as _time by stages

0 Karma

to4kawa
Ultra Champion
 
0 Karma

unbelievable_ma
Explorer

No _time is not unique because multiple values exist within the same event (hence mvexpand) hence the results are not correct.
I can try to give another example. Lets say I can get this table

idmv_field
1

key1,100

key2,200

key3,300 

2

key1,100

key2,200

key3,300 

3 

key1,100

key2,200

key3,300 

 

Given this I want the result:

keysum
key1300
key2600
key3900

 

The important part here is that the second column is an mv field. mvexpand breaks the memory usage there so I need some other way to accumulate the results. Maybe I will post this as a separate question cause this is perhaps simpler to explain.

Update: mvfilter didn't help with the memory. I found a solution to this that I added here:
https://community.splunk.com/t5/Splunk-Search/Accumulate-values-for-a-multi-value-field-by-key/m-p/5...

0 Karma
Get Updates on the Splunk Community!

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...