Splunk Search

## How to use (event-)stats and calculations to predict how many unique customers do I have by product group

Engager

Hi everyone,

basically I am trying to count how many unique customers I had in a period and that worked well with dc(clientid).

Until I wanted to know "How many unique Customers do I have by product group"

``|stats dc(clientid) as Customers by year quarter product_toplevel product_lowerlevel``

which, when for example looking at only 1 specific client who purchased 3 products within one group can return entries like

 year quarter product_toplevel product_lowerlevel Customers 2022 1 Group A Article 1 1 2022 1 Group A Article 2 1 2022 1 Group A Article 3 1

Now Splunk is not wrong here. Looking at each product there is 1 unique customer. However if I were to sum up the Customers-column it would look like I have 3 Customers in total.

I would much rather have it return the value "0.3" for "Customers" in the above example so that I can export the table to excel and work with pivot-tables while retaining the 'correct' total of unique customers.

For that purpose I intend to use eventstats to create a sort of column-total and then divide the value of Customers by that column total.

But I can not figure out how to do it. Any ideas?

Labels (2)

• ### stats

1 Solution
SplunkTrust

Your approach seems reasonable. "Generalized" stats can give you only one overall "division" across your data. So you can't have two different numbers (for example - one "by product" and another "by customer" only using stats alone).

So counting with eventstats does seem the way to go but I don't quite understand why you would like to have 0.3 (I assume you meant 1/3=0.33(3)) customer per row.

The general idea would be something like that:

`| eventstats dc(Customers) as Customers count by year| eval Customers_per_total=Customers/count`

But as I said - I fail to see the point in such calculation.

Engager

Hey, thank you very much. Yes, you got it exactly right.

Also for other people who might stumble accros this thread:
When naming fields that are populated with numbers like counted fields it helps to use 1-word names like UniqueCustomers or use underscores like Unique_Customers. If you use "Unique Customers" like I did eval will treat the values of that fields as string even though it is a number and it will not do calculations.

Maybe I oversimplified the explanation of my report but in the full report I group my customers by age, gender, regions, zones within that region, Customer type and Sales Channel.  With all that added I fear that some groups of cusotmers might be over- or underrepresented and this calculation aims to compensate for that. Although maybe my thoughts take a wrong turn somewhere.

SplunkTrust

Your approach seems reasonable. "Generalized" stats can give you only one overall "division" across your data. So you can't have two different numbers (for example - one "by product" and another "by customer" only using stats alone).

So counting with eventstats does seem the way to go but I don't quite understand why you would like to have 0.3 (I assume you meant 1/3=0.33(3)) customer per row.

The general idea would be something like that:

`| eventstats dc(Customers) as Customers count by year| eval Customers_per_total=Customers/count`

But as I said - I fail to see the point in such calculation.

Get Updates on the Splunk Community!

#### Database Performance Sidebar Panel Now on APM Database Query Performance & Service ...

We’ve streamlined the troubleshooting experience for database-related service issues by adding a database ...

#### IM Landing Page Filter - Now Available

We’ve added the capability for you to filter across the summary details on the main Infrastructure Monitoring ...

#### Dynamic Links from Alerts to IM Navigators - New in Observability Cloud

Splunk continues to improve the troubleshooting experience in Observability Cloud with this latest enhancement ...