Hi everyone,
Basically, I am trying to count how many unique customers I had in a period, and that worked well with dc(clientid).
That was until I wanted to know "How many unique customers do I have per product group?":
|stats dc(clientid) as Customers by year quarter product_toplevel product_lowerlevel
which, when looking at, say, one specific client who purchased 3 products within one group, can return entries like
year | quarter | product_toplevel | product_lowerlevel | Customers |
2022 | 1 | Group A | Article 1 | 1 |
2022 | 1 | Group A | Article 2 | 1 |
2022 | 1 | Group A | Article 3 | 1 |
Now, Splunk is not wrong here: looking at each product individually, there is 1 unique customer. However, if I were to sum up the Customers column, it would look like I have 3 customers in total.
I would much rather have it return the value "0.3" for Customers in the above example, so that I can export the table to Excel and work with pivot tables while retaining the 'correct' total of unique customers.
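To illustrate, for the example above I would want the table to come out like this, so that the Customers column still sums to 1:
year | quarter | product_toplevel | product_lowerlevel | Customers |
2022 | 1 | Group A | Article 1 | 0.33 |
2022 | 1 | Group A | Article 2 | 0.33 |
2022 | 1 | Group A | Article 3 | 0.33 |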
For that purpose, I intend to use eventstats to create a sort of column total and then divide the value of Customers by that column total.
But I cannot figure out how to do it. Any ideas?

Your approach seems reasonable. "Generalized" stats can give you only one overall "division" across your data, so you can't have two different numbers (for example, one "by product" and another "by customer") using stats alone.
So counting with eventstats does seem the way to go, but I don't quite understand why you would like to have 0.3 (I assume you meant 1/3 = 0.33(3)) customers per row.
The general idea would be something like this:
| eventstats sum(Customers) as total by year
| eval Customers_per_total=Customers/total
But as I said, I fail to see the point of such a calculation.
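If you want the weights to stay exact even when a product row contains more than one customer, one option is to keep clientid through a first stats and weight each client's rows by how many product rows that client appears in. A rough sketch, assuming your raw events still carry clientid (rows_per_client and weight are just illustrative names):
| stats count by year quarter product_toplevel product_lowerlevel clientid
| eventstats count as rows_per_client by year quarter clientid
| eval weight=1/rows_per_client
| stats sum(weight) as Customers by year quarter product_toplevel product_lowerlevel
Each client contributes exactly 1 per year and quarter, split evenly across their product rows, so summing the Customers column gives back the true distinct-customer count.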

Hey, thank you very much. Yes, you got it exactly right.
Also, for other people who might stumble across this thread:
When naming fields that hold numbers, such as counted fields, it helps to use one-word names like UniqueCustomers or underscore-separated names like Unique_Customers. If you use "Unique Customers" like I did, eval will not pick the field up by a bare reference, so the calculation fails even though the value is a number (a field name containing spaces has to be wrapped in single quotes inside eval).
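For example (Half is just an illustrative field name):
| stats dc(clientid) as "Unique Customers" by year
| eval Half='Unique Customers'/2
Note the quoting: inside eval, single quotes refer to a field name, while double quotes create a string literal, so "Unique Customers" on the right-hand side of an eval would be the literal text rather than the field's value.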
Maybe I oversimplified the explanation of my report, but in the full report I group my customers by age, gender, region, zones within that region, customer type, and sales channel. With all of that added, I fear that some groups of customers might be over- or underrepresented, and this calculation aims to compensate for that. Although maybe my thoughts take a wrong turn somewhere.