Splunk Search

Discrepany in total count

Explorer

Hello guys,

I have the following syntax and data:

alt text

However, there is a discrepancy with the total count per category.

For example,
Password Reset should be 3106 (when I manually count it) but in the screenshot provided as you can see, the total count only has 3007 😞 am i missing something?

alt text

in the second screenshot, as you can see, it displays the correct count; however, it has duplicated value which I don't know how to resolve.

Sample data:

short_description, category, cluster_count

need help password reset, Password reset , 50
Internet access, Connection issue, 10
XXXX Installation, Installation request, 60
Reset my password, Password reset , 55

Thank you in advance.

0 Karma
1 Solution

SplunkTrust
SplunkTrust

@chinkeeparco as per your query you have multiple rows in your csv file for the same combination of category and cluster_count. Which means for the first query duplicates are not being considered because you have performed stats count by category, cluster_count in the first query and used the same for subsequent query.

In your second query you are removing duplicates by stats count by category, cluster_count but then you get the duplicate count back using cluster_count*count as total.

In essence yes there will be discrepancy in your query but which one is correct depends on your use case. If you need duplicate count remove stats count by category, cluster_count. If you dont need duplicates then your query is showing the correct results.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

Esteemed Legend

I suspect that when the cluster_count value is 1 then there is no value for cluster_count (so the 1 is implied). If so, then the correct answer should be given by this search:

index=<You should always specify an index> AND source="sap_cluster.csv"
| eval cluster_count = coalesce(cluster_count, 1)
| rename COMMENT AS "At this point, both of your solutions should give the same answer"
| stats sum(cluster_count) AS Total BY category
| eventstats sum(Count) AS Total
| eval perc = round((Count / Total) * 100)
| sort 0 - Count
| table category Count perc

Explorer

@woodcock hello!! thank you! this work as well!!

0 Karma

SplunkTrust
SplunkTrust

@chinkeeparco as per your query you have multiple rows in your csv file for the same combination of category and cluster_count. Which means for the first query duplicates are not being considered because you have performed stats count by category, cluster_count in the first query and used the same for subsequent query.

In your second query you are removing duplicates by stats count by category, cluster_count but then you get the duplicate count back using cluster_count*count as total.

In essence yes there will be discrepancy in your query but which one is correct depends on your use case. If you need duplicate count remove stats count by category, cluster_count. If you dont need duplicates then your query is showing the correct results.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

Explorer

@niketnilay thank you so much for that!! It resolved my issue! I don't know how I can thank you enough.

0 Karma

SplunkTrust
SplunkTrust

@chinkeeparco I am glad the explanation worked for you to get the issue resolved. I have converted my comment to answer. Please accept the answers to mark this question as answered.

Well do think about Splunk Answers community when you run into issues. As far as thanking is concerned do actively participate on Splunk Answers and help others facing issues you have already resolved.

Happy Splunking!

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

Explorer

I will! thanks 🙂

0 Karma

Motivator

can you provide an example of the original csv?

------------
Hope I was able to help you. If so, an upvote would be appreciated.
0 Karma

Explorer

@diogofgm hello, I edit the orginal post and added the sample data 🙂

0 Karma

SplunkTrust
SplunkTrust

@chinkeeparco,
What's the significance of cluster_count here? If cluster_count is not required in your final result , just use stats count by category|eventstats sum(count) as Total

0 Karma

Explorer

@renjith.nair hello! thank you for your answer. However, i need the cluster_count. I edited the original post for the sample data 🙂

0 Karma