Splunk Search

mvexpand limits

ITWhisperer
Ultra Champion

There are a couple of issues which often come up with the limits of mvexpand, one of these is the memory limit, the other is that it only applies to one field. I have developed a set of macros which go some way to solving both these issues.

First, I will share the single field version of the macro; then, I will try to explain what each step is doing; finally, I will share the double field version which can easily be expanded to multiple fields. Please excuse the long post.

 

fillnull value="$originull$"
| eval steps=mvcount($field1$)
| streamstats sum(steps) as toprow
| eval maxrow=toprow
| append [| makeresults | eval toprow=1 | fields - _time]
| eventstats min(maxrow) as firsttop
| where isnotnull(maxrow) or toprow != firsttop
| makecontinuous toprow
| reverse
| filldown
| reverse
| eval $field1$=mvindex($field1$,(steps - 1) - (maxrow - toprow))
| fields - maxrow toprow firsttop steps
| foreach * [eval <<FIELD>>=if('<<FIELD>>'="$originull$",null,'<<FIELD>>')]

 

 So, what's going on?

 

fillnull value="$originull$"

 

The first thing to do is fill in the blanks. The macro takes a parameter for this since you would have to decide on a suitable string which doesn't clash with any data you want to keep, because this string is reset to null at the end. The reason for doing this is that filldown is used later and if there are any nulls in the original data, they will get overwritten with whatever happens to be above them.

 

| eval steps=mvcount($field1$)

 

Next, we count how many values are in the multivalue field for each event, i.e. how many events will we need in the expansion. Note that the field name (steps) is arbitrary and if it clashes with a field already in the data, you will lose that field, so change the name used in the macro. This goes for all the fields created by the macro.

 

| streamstats sum(steps) as toprow

 

Now we determine what will be the highest row number for each event.

 

| eval maxrow=toprow

 

We take a copy of this so it is kept for all the expanded rows.

 

| append [| makeresults | eval toprow=1 | fields - _time]

 

We add an additional event with toprow set to 1. This is because makecontinuous used later needs a place to start.

 

| eventstats min(maxrow) as firsttop

 

Now we find out what the lowest maxrow was. Note that the appended event we just added does not have a value for maxrow, so does not interfere with this calculation.

 

| where isnotnull(maxrow) or toprow != firsttop

 

If the original first event had only one row, its maxrow would be 1 and its toprow would also be 1 and the firsttop would also be 1, so, in this instance, we want to remove the event we just added.

 

| makecontinuous toprow

 

Now for the magic - we create events using toprow (which is guaranteed to start at 1) and end with the total number of rows required. Note that at this point, you could run into the rows limit, and you will need to modify your configuration to get around this.

 

| reverse

 

Reverse the order of events (they will have been put into toprow order by the makecontinuous). This is so that filldown copies the right values into the extra events.

 

| filldown

 

Copy all the fields from the events into the new events - you may hit a memory issue here as this is where most memory is taken up because all the multivalue fields are copied into their respective events.

 

| reverse

 

Return the events to original order (possibly not strictly necessary).

 

| eval $field1$=mvindex($field1$,(steps - 1) - (maxrow - toprow))

 

Select one of the multi-values.

 

| fields - maxrow toprow firsttop steps

 

Remove all the extra fields.

 

| foreach * [eval <<FIELD>>=if('<<FIELD>>'="$originull$",null,'<<FIELD>>')]

 

Restore the nulls.

For the multi-mvexpand, steps is evaluated to the maximum number of multivalue fields, and each field has the corresponding entry extracted in the expanded event. This can be extended by adding more fields.

 

fillnull value="$originull$"
| eval steps=max(mvcount($field1$),mvcount($field2$))
| streamstats sum(steps) as toprow
| eval maxrow=toprow
| append [| makeresults | eval toprow=1 | fields - _time]
| eventstats min(maxrow) as firsttop
| where isnotnull(maxrow) or toprow != firsttop
| makecontinuous toprow
| reverse
| filldown
| reverse
| eval $field1$=mvindex($field1$,(steps - 1) - (maxrow - toprow))
| eval $field2$=mvindex($field2$,(steps - 1) - (maxrow - toprow))
| fields - maxrow toprow firsttop steps
| foreach * [eval <<FIELD>>=if('<<FIELD>>'="$originull$",null,'<<FIELD>>')]

 

 

Labels (2)
0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!