Solved: How to combine multiple stats searches and get a r...

splunkuserCA1 · ‎08-19-2020

I have two individual stats searches that return a single value each. How can I combine the two to get a ratio?

The index is basically a table of Transaction IDs. There can be multiple entries for an ID. For example

Transaction ID	Status
txn1	200
txn1	500
txn2	200
txn3	200

Search #1 tells me the number of transactions that ended in an error by looking at the last Status of a transaction ID:

baseSearch | stats latest(status) as lastTxnStatus by txn_id | where lastTxnStatus >= 500 | stats dc(txn_id)

Search #2 tells me the total number of transactions:

baseSearch | stats dc(txn_id)

I want to get a mathematical result of: 100 * Search #1 / Search #2. How can I do that? The trouble I'm having is with the "where" command in Search #1 - that complicates everything. Using the data in the table above, the result would 33.3333% (i.e. 100 * 1/3).

ITWhisperer · ‎08-23-2020

baseSearch | stats latest(status) as lastTxnStatus by txn_id | eval error=if(lastTxnStatus >= 500, 1, 0) | stats sum(error) as errors, dc(txn_id) as TotalTransactions | eval percentageFail = (100 * errors) / TotalTransactions

View solution in original post

splunkuserCA1 · ‎08-19-2020

Got the answer.

search1, modified to rename the column:

baseSearch | stats latest(status) as lastTxnStatus by txn_id | where lastTxnStatus >= 500 | stats dc(txn_id) as TotalFailures

search2, modified to rename the column:

baseSearch | stats dc(txn_id) as TotalValues

Combined:

search1 | append [ search search2] | stats values(TotalFailures) as S1, values(TotalValues) as S2 | eval ratio=round(100*S1/S2, 2)

* Need to use append to combine the searches. But after that, they are in 2 columns over 2 different rows.

* So I need to use "stats" one final time to combine them into a single row with 2 columns.

* Finally, I use eval to get the ratio.

I was hoping there might be a simpler way to do this!

bowesmana · ‎08-23-2020

Simplicity is derived from reducing the two searches to a single searches. There are often several ways to get the same result in Splunk - some more performant than others - which is useful in large data sets.

Here's a variant that uses eventstats to get the unique count of tx ids which before the where clause. It then uses values() to pass that total through to the final eval - it could also be done with 'by TotalTxIds' as there's only one value. As you have already split by txn_id, then count will give you the effective result of dc()

baseSearch 
| stats latest(status) as lastTxnStatus by txn_id 
| eventstats count as TotalTxIds
| where lastTxnStatus>=500
| stats values(TotalTxIds) as TotalTxIds count as TxEndingInError
| eval failureRate = round(TxEndingInError / TotalTxIds * 100)

splunkuserCA1 · ‎08-24-2020

This is a very clever use of stats values()!

I can see that both responses are functionally equivalent and are way better than my original query. I've accepted the first reply. For the record, I also found another efficient way of expressing the query using "top":

basesearch | stats latest(status) as lastStatus by lastTxnStatus
| top lastStatus limit=0
| search lastStatus >= 500
| stats sum(percent) as BadPercentage

bowesmana · ‎08-25-2020

@splunkuserCA1 Haha, yes, you've discovered the beauty of Splunk, in that there is always more than one way of doing a task. At some point in your Splunk journey, you may well start to think about which one performs better than the other and that you can get by looking at the job inspector.

There are definitely performance differences between different techniques and if you have large data sets, you'll start to hit Splunk limits with some techniques.

Happy Splunking!

isoutamo · ‎08-20-2020

Hi

can you post sample data as I suppose that there could be easier solution than those two queries....

r. Ismo

splunkuserCA1 · ‎08-20-2020

While I can't post the exact data, this table should contain enough. As you see, there are multiple entries for some transactions (like txn1), while other transactions have just 1 entry:

Date	Transaction ID	Status
2020-07-01 08:00:00	txn1	200
2020-07-01 08:20:00	txn1	500
2020-07-01 09:00:00	txn2	200
2020-07-01 10:00:00	txn3	200

ITWhisperer · ‎08-23-2020

baseSearch | stats latest(status) as lastTxnStatus by txn_id | eval error=if(lastTxnStatus >= 500, 1, 0) | stats sum(error) as errors, dc(txn_id) as TotalTransactions | eval percentageFail = (100 * errors) / TotalTransactions

How to combine multiple stats searches and get a result?

stats

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor

Introducing the 2024 SplunkTrust!