Reporting

Calculated the percent difference between two values

Builder

This is a tricky one (or is it?)...

I have indexed Splunk data that looks like this (using multikv):

device_name     host    list(Select)    list(Disk)      difference
vpath0        xyz     19072176        fscsi2/hdisk28  13409 
                        19058767        fscsi3/hdisk56

I'd like to be alerted if the difference between the two "list(Select)" fields is greater than 5%. I know how to do the math on paper, but I can't figure out how to apply it to Splunk. Essentially, I need to divide the "difference" by value of the higher "list(Select)", then multiply by 100.

I've been dabbling with this, but now I'm wondering if it's even possible.

Thanks!

1 Solution

Motivator

You can use the eval command:

your search |  eval percent_difference=(difference/max(list(Select))*100)

Then set up a custom alert condition that hits when percent_difference > 5.

If this doesn't work try renaming your list(Select) to a more friendly name (without parenthesis).

[Edit:] Ok, playing around with your query and my data I noticed that the list(Select) didn't return a numerical value list for me, which caused the calculation of percent_difference to fail. Try adding a tonumber() (added in 4.1.4 or 4.1.5, so you need a recent version for this to work) into your query as such:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eval percent_difference=((difference/max(tonumber(listSelect)))*100) 

If that doesn't work, use eventstats instead of regular stats as such:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | eventstats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eval percent_difference=((difference/max(tonumber(listSelect)))*100) 

[Edit2] What about this:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eventstats max(listSelect) as maxSelect | eval percent_difference=((difference/maxSelect)*100)

[EDIT 3:] how about streamstats?

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | streamstats max(listSelect) as maxSelect window=1 | eval percent_difference=((difference/maxSelect)*100)

View solution in original post

Motivator

You can use the eval command:

your search |  eval percent_difference=(difference/max(list(Select))*100)

Then set up a custom alert condition that hits when percent_difference > 5.

If this doesn't work try renaming your list(Select) to a more friendly name (without parenthesis).

[Edit:] Ok, playing around with your query and my data I noticed that the list(Select) didn't return a numerical value list for me, which caused the calculation of percent_difference to fail. Try adding a tonumber() (added in 4.1.4 or 4.1.5, so you need a recent version for this to work) into your query as such:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eval percent_difference=((difference/max(tonumber(listSelect)))*100) 

If that doesn't work, use eventstats instead of regular stats as such:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | eventstats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eval percent_difference=((difference/max(tonumber(listSelect)))*100) 

[Edit2] What about this:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | eventstats max(listSelect) as maxSelect | eval percent_difference=((difference/maxSelect)*100)

[EDIT 3:] how about streamstats?

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by device_name, host | streamstats max(listSelect) as maxSelect window=1 | eval percent_difference=((difference/maxSelect)*100)

View solution in original post

Builder

PERFECT!!!!!!!
Thanks AGAIN!

Motivator

hmm looking at this we might be able to use streamstats. http://answers.splunk.com/questions/4142/weirdness-using-max-and-min-in-eval-operating-on-numeric-mu... . lemme update my answer yet again haha

0 Karma

Builder

Hmm... My way works but it's sloppy. I like your Edit2 so much better. Can you think of a way to get maxSelect to report the max of just the two listSelects, and not of all the listSelects for the entire search? I'm playing around but I can't get it... If you have any suggestions, I'm all ears. Still chopping away at it...

0 Karma

Motivator

Glad to help 🙂

0 Karma

Builder

I love how your Edit2 formats it though... If I can get it to grab the correct max(listSelect) setting, that would be golden. I'll post my results here when I figure it out. Thanks again!

0 Karma

Builder

Your third edit above worked sorta... the only problem is that it took the max(listSelect) of all the indexed results, not just the listSelect for that one entry. (I hope that made sense).
Regardless, I think I have the problem fixed now (see comment above). Thanks again, I couldn't have done this with your help and patience.

0 Karma

Builder

GOT IT!
I looked closer at the list of fields... and there was a "Select(n)" field there which corresponded to the values in listSelect. I don't know why both were listed, but I don't particularly care at this point. 🙂
Anyways, I plugged in "Select" instead of "listSelect" and it worked beautifully!
| eval percent_difference=((difference/(max(Select)))*100)

Thank you SO much for your help! 🙂

0 Karma

Motivator

Did another answer edit. What about calculating the max(listSelect) via eventstats before the eval? works in my test..

0 Karma

Motivator

I think part of the problem is the list() command. It looks like max just doesnt work on its output...

0 Karma

Motivator

Forwarder version should not matter.

0 Karma

Builder

And I think part of the problem too is that listSelect isn't a number; it doesn't have that 'n' next to the field. I know we tried wrapping it around that tonumber() function, but for some reason it's not taking.
(My indexer is running 4.1.5, but the forwarder that is capturing this data is running 4.1.4. The forwarder version shouldn't matter, right?)

0 Karma

Motivator

@Branden yes I believe that is pretty safe to say. What about switching max and tonumber? as in eval percent_difference=((difference/tonumber(max(listSelect)))*100

0 Karma

Builder

Okay I think I know what the problem is. I changed the calculation to this for testing purposes:
"| eval percent_difference=(max(listSelect))"
I wanted to see if it could at least pick out the higher of the two listSelects. It can't. Instead, it just lists both listSelects values. It's as if the two listSelects are from two separate searches or something...

0 Karma

Builder

Thank you again for the help.
Unfortunately, neither solution works. "tonumber()" didn't make a difference (I'm running 4.1.5).. and eventstats destroys the formatting of the output, and it still doesn't show percent_difference.
Is it safe to say that the reason it's not working is because it can't do the calculation?

0 Karma

Motivator

@ Branden I edited the answer a bit based on your query since this is rather long...

0 Karma

Builder

If it helps, here's my exact query:

sourcetype="datapath-device" earliest=-2h| rex "^DEV#:\s+(?\d+)\s+DEVICE NAME:\s+(?\S+)\s+TYPE:\s+(?\d+)\s+POLICY:\s+(?\S+)" | rex "SERIAL:\s+(?\S+)" | multikv | stats list(Select) as listSelect, list(Disk), range(Select) as difference by devicename, host | eval percentdifference=((difference/max(listSelect))*100)

0 Karma

Builder

Hmmm.... not showing any results when I do that. I wonder if it's not calculating right...

0 Karma

Motivator

Oh yeah and oops, not sure why i put a - in there instead of a /. Brain fart I guess 🙂

0 Karma

Motivator

@Branden, try adding a | table percent_difference at the end of your search. If it calculates it it should show.

0 Karma