Splunk Search

Find value in ordered multivalue range from lookup

Motivator

I have a lookup file containing this sort of data

FieldIntervalScores
FieldName0,15,30,60,300,36005,4,3,2,1,0

 

What I am doing is to look at a field value 'MedDelta' and working out in which range of Interval values it sits between and assigning it the corresponding Score value from the Score field like this

 

| lookup scores.csv Field
| eval Interval=split(Interval,","), Scores=split(Scores,",")
| eval ind=mvmap(Interval, if(MedDelta<=Interval,1,0)) 
| eval ScoreCount=mvcount(Scores) - 1
| eval t=mvfind(ind,"1"), Score=if(isnull(t), mvindex(Scores, ScoreCount, ScoreCount), mvindex(Scores,t-1,t-1))

 

So, if I have MedDelta=10, then Score is 5 (range 0-15), MedDelta is 93, then Score is 2 (range 60-300).

However, mvmap is Splunk 8 and I need to deploy this to Splunk 7.4.

I cannot use mvexpand

Anyone know some cool Splunk trickery to do the same thing without mvmap?

 

Labels (2)
Tags (3)
1 Solution

Communicator

You could use mvsort. In order for this to work you need to add leading zeros to your intervals and medDelta. Try something like

 

| makeresults count=1
| eval interval="0,15,30,60,300,3600"
| eval scores="5,4,3,2,1,0"
| eval meddelta=93
/* add a number of leading zeroes (you may need more depending on your real data) */
| rex mode=sed field=interval "s/(\d+)(,|)/0000\1\2/g"
/* take the last n (5) digits */
| rex mode=sed field=interval "s/(\d*)(\d{5})(,|)/\2\3/g"
| eval interval=split(interval,","), scores=split(scores,",")
/* repeat for meddelta */
| rex mode=sed field=meddelta "s/(\d+)(,|)/0000\1\2/g"
| rex mode=sed field=meddelta "s/(\d*)(\d{5})(,|)/\2\3/g"
/* add meddelta to interval mv field */
| eval interval=mvappend(interval,meddelta)
/* sort (lexicographically) */
| eval interval=mvsort(interval)
/* find out where meddelta ending up */
| eval index=mvfind(interval,meddelta)
/* adjust index for interval boundaries */
| eval index=if(mvindex(interval,index + 1)=meddelta, index, index -1)
/* get corresponding score */
| eval score=mvindex(scores, index)

 

View solution in original post

Communicator

You could use mvsort. In order for this to work you need to add leading zeros to your intervals and medDelta. Try something like

 

| makeresults count=1
| eval interval="0,15,30,60,300,3600"
| eval scores="5,4,3,2,1,0"
| eval meddelta=93
/* add a number of leading zeroes (you may need more depending on your real data) */
| rex mode=sed field=interval "s/(\d+)(,|)/0000\1\2/g"
/* take the last n (5) digits */
| rex mode=sed field=interval "s/(\d*)(\d{5})(,|)/\2\3/g"
| eval interval=split(interval,","), scores=split(scores,",")
/* repeat for meddelta */
| rex mode=sed field=meddelta "s/(\d+)(,|)/0000\1\2/g"
| rex mode=sed field=meddelta "s/(\d*)(\d{5})(,|)/\2\3/g"
/* add meddelta to interval mv field */
| eval interval=mvappend(interval,meddelta)
/* sort (lexicographically) */
| eval interval=mvsort(interval)
/* find out where meddelta ending up */
| eval index=mvfind(interval,meddelta)
/* adjust index for interval boundaries */
| eval index=if(mvindex(interval,index + 1)=meddelta, index, index -1)
/* get corresponding score */
| eval score=mvindex(scores, index)

 

View solution in original post

Motivator

Thanks @ITWhisperer  - I like that approach, as it applies a simple data solution to the problem, not just some clever Splunk technique. I had tried to think a way of using sort/find, but the simplicity of just adding in the actual value to the list and then 'finding', escaped me.

For me, I don't need the final index if statement - index=index-1 works, as the end range is inclusive, so it falls into the higher score value. Also, I can ensure that the interval ranges are pre-formatted, so won't need those rex statements.

Kudos - thanks. I also really like @jeffland solution - that has opened up a massive new world of possibility with the realisation that foreach can be used with numbers, not just existing fields, to create a loop. I never knew it could be used that way - Oh my - that's exciting 🤣

I will accept this solution for its command simplicity, although it has more steps , as the foreach solution, although really rather cool, is using a command that most Splunkers never even get to use.

Champion

hope your csv contains less number of records, hence used join.

| makeresults | eval Field="RAM" , MedDelta=93
| join type=left Field max=0 [| inputlookup test-bes.csv 
| eval Interval=split(Interval,","), Scores=split(Scores,",")
| eval newField=mvzip(Interval,Scores)
| fields - Interval,Scores
| mvexpand newField
| makemv newField delim=","          
| eval Interval=mvindex(newField, 0),Scores=mvindex(newField, 1)
| fields - newField
| streamstats last(Interval) as Interval_Start last(Scores) as last_score by Field current=false
| fields - Scores
| where isnotnull(Interval_Start)
| rename last_score as Scores]
| where  MedDelta>=Interval_Start AND MedDelta<=Interval
————————————
If this helps, give a like below.

Champion

I struggled with the absence of the mvmap function in Splunk before version 8 as well. This is not exactly pretty, but then again, what is.

| makeresults | eval Interval = "0,15,30,60,300,3600", Scores = "5,4,3,2,1,0", MedDelta = "100"
| eval Interval=split(Interval,","), Scores=split(Scores,",")

| eval Score = 0
| foreach 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 [| eval debug_Interval_<<FIELD>> = mvindex(Interval, <<FIELD>>), debug_Scores_<<FIELD>> = mvindex(Scores, <<FIELD>>) | eval Score = if(MedDelta > debug_Interval_<<FIELD>>, debug_Scores_<<FIELD>>, Score)]
| fields - debug_*

What this does is basically create a column for each of your possible intervals and scores, and simply compare the value in MedDelta to each one.

Depending on the maximum number of Intervals, you might have to add more numbers to the foreach loop. 

Motivator

@jeffland You don't realise the possibilities you have opened up with realising that you can basically create a for loop using non existing fields to simulate numbers. I have always used foreach with existing fields  and never realised you could just create the construct you have there.

That is really cool. I've accepted @ITWhisperer solution as it's more flexible, as it doesn't need to know the loop count up front.

I will look at the performance of both options. For a small loop size, in this case I will only have 6 ranges, it will be interesting to see which is more performant.

Using foreach always results in quite hard to understand SPL and it's always a pain in dashboards with its &lt;&lt; construct, but it's macroable, so can be used like this

`calc_score("0 1 2 3 4 5 6")`

where the loop range is passed as a parameter anyway with the macro doing

| foreach $loop$

Kudos to this cool technique.