Splunk Search

Alternatives to mvexpand for decreasing memory usage

arosenwinkel
Observer

Hello! I have some JSON events that each look something like this:

{
  "id": 12345,
  "steps": [
    {
      "stepName": "A",
      "stepDuration": 0.5
    },
    {
      "stepName": "B",
      "stepDuration": 0.17
    }
  ]
}

My existing searches run mvexpand on the steps field so that each step becomes its own event, which I can then manipulate. This works great for small numbers of events, but when I am processing thousands of events with 100+ steps each, I quickly run into the default memory limits imposed on the mvexpand command.
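(For context, the limit being hit is mvexpand's output buffer, which is governed by max_mem_usage_mb in limits.conf. One workaround - at the cost of search-head memory - is raising that cap; a sketch, with a purely illustrative value:)

```
# limits.conf on the search head (restart required)
# the value is in MB and chosen here only as an example
[default]
max_mem_usage_mb = 1000
```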

Is there an alternative command that I am missing that I can use to compute summary statistics such as "the average duration of step A is X.XX" and "X% of events hit step B"?

If not, is there a better way to structure the events themselves to support this? My constraint is that I need to allow for arbitrary numbers of steps occurring in an arbitrary order that needs to be preserved.

Thanks in advance!


arosenwinkel
Observer

So I feel like an idiot - my solution ended up being as simple as adding a

| fields x | fields - _raw

just before the mvexpand x.
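Against the sample event in my question, the shape of the fix looks something like this (the spath extractions are an assumption about how the steps array gets into a field; adjust to your own search):

```
| spath steps{} output=steps
| fields steps
| fields - _raw
| mvexpand steps
| spath input=steps
| stats avg(stepDuration) as average_duration count by stepName
```

Dropping _raw and every unneeded field before mvexpand is what keeps each expanded copy small enough to stay under the memory limit.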

Appreciate the help, though, @ITWhisperer ! I'll keep tinkering with your solution because it is very weird that it was only grabbing some of the steps.


ITWhisperer
Ultra Champion

A bit convoluted and I don't know if it solves the memory issue, but try something like this:

| makeresults | eval _raw="{
  \"id\": 12345,
  \"steps\": [
    {
      \"stepName\": \"A\",
      \"stepDuration\": 0.5
    },
    {
      \"stepName\": \"B\",
      \"stepDuration\": 0.17
    }
  ]
}|{
  \"id\": 23456,
  \"steps\": [
    {
      \"stepName\": \"A\",
      \"stepDuration\": 0.7
    },
    {
      \"stepName\": \"C\",
      \"stepDuration\": 0.17
    }
  ]
}|{
  \"id\": 34567,
  \"steps\": [
    {
      \"stepName\": \"A\",
      \"stepDuration\": 0.9
    },
    {
      \"stepName\": \"B\",
      \"stepDuration\": 0.15
    },
    {
      \"stepName\": \"C\",
      \"stepDuration\": 0.19
    }
  ]
}"
| eval events=split(_raw,"|")
| mvexpand events
| eval _raw=events
| fields - _time events
| spath steps{} output=names
| fields - _raw
| eval durations=names
| rex field=names mode=sed "s/[\S\s]+\"stepName\":\s\"([^\"]+)[\S\s]+/\1/g"
| rex field=durations mode=sed "s/[\S\s]+\"stepDuration\":\s([\d\.]+)[\S\s]+/\1/g"
| streamstats count as row
| eval steps=mvcount(names)
| streamstats sum(steps) as toprow
| eval maxrow=toprow
| makecontinuous toprow
| reverse
| filldown
| eval toprow=if(row=1,1,toprow)
| makecontinuous toprow
| filldown
| eval names=mvindex(names,maxrow-toprow)
| eval durations=mvindex(durations,maxrow-toprow)
| fields - maxrow toprow row steps
| stats avg(durations) as average_duration count by names


arosenwinkel
Observer

Very cool - I will give this a try. My understanding is that this is basically doing the dirty work of mvexpand, but in a way that Splunk can hopefully do without blowing up every event at the same time?

Thanks!


ITWhisperer
Ultra Champion

Essentially, it works out how many rows are required by each multi-valued set, then adds that many additional empty rows. The order is then reversed so that filldown copies the missing values into each row. This doesn't work properly for the first row if it has more than one value in its multi-value set, so the search sets that row's toprow to 1, adds the missing rows, and fills them in. Then, because each row now holds all the multi-values from its original row, mvindex selects just one of those values for each row. The final part is an example stats across the expanded set of events.

I don't know if mvexpand works internally like this (I doubt it), but this seems to simulate the effect of mvexpand, hopefully in a less memory hungry manner.
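To see the mechanics in isolation, here is the same trick on a toy pair of events, one with two steps and one with three (step names are arbitrary stand-ins):

```
| makeresults count=2
| streamstats count as row
| eval names=split(if(row=1, "A,B", "C,D,E"), ",")
| eval steps=mvcount(names)
| streamstats sum(steps) as toprow
| eval maxrow=toprow
| makecontinuous toprow
| reverse
| filldown
| eval toprow=if(row=1,1,toprow)
| makecontinuous toprow
| filldown
| eval name=mvindex(names, maxrow-toprow)
| table name
```

Each of A through E should land on its own row, which is the mvexpand effect being simulated.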

I would be interested in your experience as I have not tried this at scale.


arosenwinkel
Observer

So I think this answer is verrrrry close - it is reporting the correct average durations (without triggering any memory usage warnings), but only for three of the steps!

Looking at the search, it is very unclear why this would be the case. I'll keep playing around with it because there has to be something dumb that I'm doing.
