Splunk Search

How to group calculated unique values by another field without using a subsearch?

vitorvmiguel
Explorer

Hi folks,

I have a problem. I've done a search displayed below and I'm filtering some types of products (produto). After this, I'm counting Unique Clients (Clientes) and which of these Unique Clients had an error message (Clientes_Impactados). I want to make inside this counting another counting grouping by transaction (programa), and i'm wondering if there is a way of doing that without append or appendcols, to keep the search light.

index="raw_internet" produto="1" OR produto="L" OR produto="2" OR produto="W"  | eval Erros=if(match(tipo,"E"),1,0) | stats sum(Erros) AS ClientesErros by codigoAcesso | stats count AS Clientes, count(eval(ClientesErros>0)) AS Clientes_Impactados |

I hope someone can help me.
Tks

1 Solution

sideview
SplunkTrust
SplunkTrust

Generally in this situation the answer involves switching out a stats clause for an "eventstats" clause. Sometimes in related cases, switching out a stats for a streamstats. Often with some funky evals.

eventstats count sum(foo) by bar basically does the same work as stats count sum(foo) by bar, except that it neglects to also transform, ie group the rows, into the unique values of 'bar'. So instead eventstats gives back the same rows that went into it, just with "count" and "sum(foo)" fields added as appropriate.

I don't know the exact requirements of your programa grouping, but here is a guess as to the sort of thing you're looking for.

index="raw_internet" produto="1" OR produto="L" OR produto="2" OR produto="W"  
| eval Erros=if(match(tipo,"E"),1,0) 
| eventstats sum(Erros) as ClientesErros by codigoAcesso 
| eval impacted_client_name=if(ClientesErros>1,codigoAcesso,null()) 
| streamstats dc(impacted_client_name) as Clientes_Impactados 
| eventstats sum(Erros) as programaErros by programa 
| eval impacted_transactions=if(programaErros>1,programa,null())
| streamstats dc(impacted_transactions) as Programas_Impactados
| stats dc(codigoAcesso) as Clientes, max(Clientes_Impactados) as Clientes_Impactados max(Programas_Impactados) as Programas_Impactados

Here what I'm doing is a little tricky. For both clients and transactions, I'm using eventstats to do the work instead of stats. But then since we don't have transformed rows, we have to be a little clever in getting our Clientes_Impactados. We can't just count the rows anymore. So instead I create a new field that is equal to the client name only if the client has some errors. then I use a streamstats to count the distinct values of that field seen, then at the end we can take the max() of that number. Someone else might find a simpler way this at least illustrates the way of thinking. eventstats/streamstats and eval can often dance around this issue, so you can "group" without grouping and thus effectively group several different contradictory ways simultaneously.

View solution in original post

sideview
SplunkTrust
SplunkTrust

Generally in this situation the answer involves switching out a stats clause for an "eventstats" clause. Sometimes in related cases, switching out a stats for a streamstats. Often with some funky evals.

eventstats count sum(foo) by bar basically does the same work as stats count sum(foo) by bar, except that it neglects to also transform, ie group the rows, into the unique values of 'bar'. So instead eventstats gives back the same rows that went into it, just with "count" and "sum(foo)" fields added as appropriate.

I don't know the exact requirements of your programa grouping, but here is a guess as to the sort of thing you're looking for.

index="raw_internet" produto="1" OR produto="L" OR produto="2" OR produto="W"  
| eval Erros=if(match(tipo,"E"),1,0) 
| eventstats sum(Erros) as ClientesErros by codigoAcesso 
| eval impacted_client_name=if(ClientesErros>1,codigoAcesso,null()) 
| streamstats dc(impacted_client_name) as Clientes_Impactados 
| eventstats sum(Erros) as programaErros by programa 
| eval impacted_transactions=if(programaErros>1,programa,null())
| streamstats dc(impacted_transactions) as Programas_Impactados
| stats dc(codigoAcesso) as Clientes, max(Clientes_Impactados) as Clientes_Impactados max(Programas_Impactados) as Programas_Impactados

Here what I'm doing is a little tricky. For both clients and transactions, I'm using eventstats to do the work instead of stats. But then since we don't have transformed rows, we have to be a little clever in getting our Clientes_Impactados. We can't just count the rows anymore. So instead I create a new field that is equal to the client name only if the client has some errors. then I use a streamstats to count the distinct values of that field seen, then at the end we can take the max() of that number. Someone else might find a simpler way this at least illustrates the way of thinking. eventstats/streamstats and eval can often dance around this issue, so you can "group" without grouping and thus effectively group several different contradictory ways simultaneously.

vitorvmiguel
Explorer

Thank you for the detailed explanation.

0 Karma

chimell
Motivator

Hi vitorvmiguel
Instead of append or appendcols command you can use multisearch command
Like below

     | multisearch [search index="raw_internet" produto="1" OR produto="L" OR produto="2" OR produto="W" |stats count AS Clientes] [search index="raw_internet" produto="1" OR produto="L" OR produto="2" OR produto="W" | eval Erros=if(match(tipo,"E"),1,0) | stats sum(Erros) AS ClientesErros by codigoAcesso | count(eval(ClientesErros>0)) AS Clientes_Impactados]
0 Karma

vitorvmiguel
Explorer

Thank you for your answer chimell but, thats not what i'm looking for, i'm trying to avoid the subsearches. I was wondering if it's possible in the same stats count to count different things and return different results.

rgs

0 Karma
Get Updates on the Splunk Community!

Index This | A sphere has three, a circle has two, and a point has zero. What is it?

September 2023 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Build Scalable Security While Moving to Cloud - Guide From Clayton Homes

 Clayton Homes faced the increased challenge of strengthening their security posture as they went through ...

Mission Control | Explore the latest release of Splunk Mission Control (2.3)

We’re happy to announce the release of Mission Control 2.3 which includes several new and exciting features ...