- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hello guys,
I have the following syntax and data:
However, there is a discrepancy with the total count per category.
For example,
Password Reset should be 3106 (when I manually count it) but in the screenshot provided as you can see, the total count only has 3007 😞 am i missing something?
in the second screenshot, as you can see, it displays the correct count; however, it has duplicated value which I don't know how to resolve.
Sample data:
short_description, category, cluster_count
need help password reset, Password reset , 50
Internet access, Connection issue, 10
XXXX Installation, Installation request, 60
Reset my password, Password reset , 55
Thank you in advance.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@chinkeeparco as per your query you have multiple rows in your csv file for the same combination of category and cluster_count. Which means for the first query duplicates are not being considered because you have performed stats count by category, cluster_count
in the first query and used the same for subsequent query.
In your second query you are removing duplicates by stats count by category, cluster_count
but then you get the duplicate count back using cluster_count*count
as total.
In essence yes there will be discrepancy in your query but which one is correct depends on your use case. If you need duplicate count remove stats count by category, cluster_count
. If you dont need duplicates then your query is showing the correct results.
| makeresults | eval message= "Happy Splunking!!!"
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I suspect that when the cluster_count
value is 1
then there is no value for cluster_count
(so the 1
is implied). If so, then the correct answer should be given by this search:
index=<You should always specify an index> AND source="sap_cluster.csv"
| eval cluster_count = coalesce(cluster_count, 1)
| rename COMMENT AS "At this point, both of your solutions should give the same answer"
| stats sum(cluster_count) AS Total BY category
| eventstats sum(Count) AS Total
| eval perc = round((Count / Total) * 100)
| sort 0 - Count
| table category Count perc
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@woodcock hello!! thank you! this work as well!!
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@chinkeeparco as per your query you have multiple rows in your csv file for the same combination of category and cluster_count. Which means for the first query duplicates are not being considered because you have performed stats count by category, cluster_count
in the first query and used the same for subsequent query.
In your second query you are removing duplicates by stats count by category, cluster_count
but then you get the duplicate count back using cluster_count*count
as total.
In essence yes there will be discrepancy in your query but which one is correct depends on your use case. If you need duplicate count remove stats count by category, cluster_count
. If you dont need duplicates then your query is showing the correct results.
| makeresults | eval message= "Happy Splunking!!!"
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@niketnilay thank you so much for that!! It resolved my issue! I don't know how I can thank you enough.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@chinkeeparco I am glad the explanation worked for you to get the issue resolved. I have converted my comment to answer. Please accept the answers to mark this question as answered.
Well do think about Splunk Answers community when you run into issues. As far as thanking is concerned do actively participate on Splunk Answers and help others facing issues you have already resolved.
Happy Splunking!
| makeresults | eval message= "Happy Splunking!!!"
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I will! thanks 🙂
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

can you provide an example of the original csv?
Hope I was able to help you. If so, some karma would be appreciated.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@diogofgm hello, I edit the orginal post and added the sample data 🙂
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@chinkeeparco,
What's the significance of cluster_count here? If cluster_count is not required in your final result , just use stats count by category|eventstats sum(count) as Total
What goes around comes around. If it helps, hit it with Karma 🙂
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@renjith.nair hello! thank you for your answer. However, i need the cluster_count. I edited the original post for the sample data 🙂
