Solved: How to calculate time base difference on data?

splunkkid · ‎09-21-2020

Hello,

I am currently struggling with some SPL search command..

I want to show on table about resource's usage data.

The current spl command is like below.

MY_SEARCH_COMMAND 
| eval resource_a_total = a_val * threshold
| eval resource_a_total = b_val * threshold
| stats sum(resource_a_used) as a_used, sum(resource_a_total) as a_total, sum(resource_b_used) as b_used, sum(resource_b_total) as b_total by cluster
| eval a_usage =round(a_used / a_total * 100, 2)
| eval b_usage =round(a_used / a_total * 100, 2)
| table name a_usage b_usage

As you can see, to get usage data, have to calculate each hosts total / used data value and then do aggregation based on same cluster name.

In this situation, I want to add difference of usage data from yesterday to today. If yesterday's resource_a usage is bigger than today, then table should write resource_a_diff like below.

name | a_usage | a_diff | b_usage | b_diff

clusterA | 80 | -5%p | 70 | 5%p

How do I write the statement as efficient way?

ITWhisperer · ‎09-22-2020

You probably need to change the order of the fields you are grouping by in the stats command. You should also remove the first for each cluster. Also, should b_usage be based on a?

MY_SEARCH_COMMAND | bin _time span=1d
| eval resource_a_total = a_val * threshold
| eval resource_a_total = b_val * threshold
| stats sum(resource_a_used) as a_used, sum(resource_a_total) as a_total, sum(resource_b_used) as b_used, sum(resource_b_total) as b_total by cluster _time
| eval a_usage =round(a_used / a_total * 100, 2)
| eval b_usage =round(b_used / b_total * 100, 2)
| autoregress a_usage p=1
| eventstats first(_time) as firsttime by cluster
| where _time != firsttime
| eval a_diff = a_usage - a_usage_p1
| table name a_usage a_diff b_usage

View solution in original post

ITWhisperer · ‎09-21-2020

Obviously, you need to add a time element to your stats with a span of 1d, then you could consider autoregress to get the previous day's stats and do your calculations using those values.

splunkkid · ‎09-21-2020

Hello @ITWhisperer

I am trying to get the results following your instruction.

And I'm pretty not sure if I'm doing it right. I don't get the difference result with below search command

Thanks!

MY_SEARCH_COMMAND | bin _time span=1d
| eval resource_a_total = a_val * threshold
| eval resource_a_total = b_val * threshold
| stats sum(resource_a_used) as a_used, sum(resource_a_total) as a_total, sum(resource_b_used) as b_used, sum(resource_b_total) as b_total by _time cluster
| eval a_usage =round(a_used / a_total * 100, 2)
| eval b_usage =round(a_used / a_total * 100, 2)
| autoregress a_usage p=1
| eval a_diff = a_usage - a_usage_1
| table name a_usage a_diff b_usage

splunkkid · ‎09-22-2020

Hello, @ITWhisperer

Maybe I'm not fully understand about the commands although I read documentation regarding that.

I changed the command as you specified and the result is not quite as I want.

The current result is like below.

name | a_usage | a_diff | b_usage | b_diff

clusterA | 80 | | 70 |

clusterB | 75 | -5%p | 65 | -5%p

Like this, the difference data is not based on time, but based on prior data that written right above of it.

(clusterB 's a_diff is calculated as (clusterB's a_uage - clusterA's a_usage))

Thank you.

ITWhisperer · ‎09-22-2020

You probably need to change the order of the fields you are grouping by in the stats command. You should also remove the first for each cluster. Also, should b_usage be based on a?

MY_SEARCH_COMMAND | bin _time span=1d
| eval resource_a_total = a_val * threshold
| eval resource_a_total = b_val * threshold
| stats sum(resource_a_used) as a_used, sum(resource_a_total) as a_total, sum(resource_b_used) as b_used, sum(resource_b_total) as b_total by cluster _time
| eval a_usage =round(a_used / a_total * 100, 2)
| eval b_usage =round(b_used / b_total * 100, 2)
| autoregress a_usage p=1
| eventstats first(_time) as firsttime by cluster
| where _time != firsttime
| eval a_diff = a_usage - a_usage_p1
| table name a_usage a_diff b_usage

ITWhisperer · ‎09-22-2020

The autoregress field name has a p in it

MY_SEARCH_COMMAND | bin _time span=1d
| eval resource_a_total = a_val * threshold
| eval resource_a_total = b_val * threshold
| stats sum(resource_a_used) as a_used, sum(resource_a_total) as a_total, sum(resource_b_used) as b_used, sum(resource_b_total) as b_total by _time cluster
| eval a_usage =round(a_used / a_total * 100, 2)
| eval b_usage =round(a_used / a_total * 100, 2)
| autoregress a_usage p=1
| eval a_diff = a_usage - a_usage_p1
| table name a_usage a_diff b_usage

How to calculate time base difference on data?

eval

table

ICYMI - Check out the latest releases of Splunk Edge Processor

Introducing the 2024 SplunkTrust!

Introducing the 2024 Splunk MVPs!