Splunk Search

Why is there a discrepancy when grouping by a variable vs searching?

psangli
Explorer

I want to see how many times a user has accessed a database in a given time period. I used

sourcetype= h1 | stats count by database, user | sort -count|stats list(count), values(database) by user

However, after running a search for specific users, the numbers that come up from the above line, do not match up with the number of search results I find by searching for specific users. Is there a reason for the discrepancy ?

Tags (1)
0 Karma
1 Solution

DalJeanis
Legend

A) I don't see anything obvious - you probably (?) don't have over 10K users / databases combinations, but there is a default maximum number of records returned from sort, so you should get in the habit of using sort 0 instead of plainsort.

B) Also, when trying to compare two versions, always snap to the beginning and end of a time period, so you have a fixed set of results between the two to compare on.

C) Always rename count, because occasionally it will bite you when a later stats-type command or timechart gets confused.


Try this and compare to see if it fixes your issue....

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user    
| sort 0 - countbydb
| stats sum(countbydb) as theTotalCount, list(countbydb) as theCounts, values(database) as theDBs by user
| where user=="myselecteduser"

earliest=-1d@d latest=@d sourcetype= h1 user="myselecteduser"
| stats count as theTotalCount, values(database) as theDBs by user

If not, then consider something like this

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user  
| eval DBandCount  = database." - ".countbydb  
| stats sum(countbydb) as theTotalCount, values(DBandCount) as theDBs by user

View solution in original post

0 Karma

DalJeanis
Legend

A) I don't see anything obvious - you probably (?) don't have over 10K users / databases combinations, but there is a default maximum number of records returned from sort, so you should get in the habit of using sort 0 instead of plainsort.

B) Also, when trying to compare two versions, always snap to the beginning and end of a time period, so you have a fixed set of results between the two to compare on.

C) Always rename count, because occasionally it will bite you when a later stats-type command or timechart gets confused.


Try this and compare to see if it fixes your issue....

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user    
| sort 0 - countbydb
| stats sum(countbydb) as theTotalCount, list(countbydb) as theCounts, values(database) as theDBs by user
| where user=="myselecteduser"

earliest=-1d@d latest=@d sourcetype= h1 user="myselecteduser"
| stats count as theTotalCount, values(database) as theDBs by user

If not, then consider something like this

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user  
| eval DBandCount  = database." - ".countbydb  
| stats sum(countbydb) as theTotalCount, values(DBandCount) as theDBs by user
0 Karma

psangli
Explorer

I believe the
sort -count
part is wrong. It sorts the number of databases, but does not move the name along with it. The names stay in the same order. How would I go about fixing that?

0 Karma

DalJeanis
Legend

Instead of values(database) use list(database) to retain the record order.

psangli
Explorer

That works! Thanks!

somesoni2
Revered Legend

Use the by clause with sort.

0 Karma

psangli
Explorer
sourcetype="h1" | stats count as countbydb by database, user |sort by countbydb |stats list(countbydb), values(database) by user 

is what I am using. It does not affect the user column. I tried placing the sort by countbydb at the end, but that does not work either.

0 Karma

DalJeanis
Legend

Please give an example of the other search (with the username obfuscated)

0 Karma

psangli
Explorer

sourcetype="h1" user = "user1" database = bdp

In the search, bdp appears 10 times. But in the table, it only appears once.

0 Karma

DalJeanis
Legend

In the table, it should appear only once, with a count of 10.

0 Karma
Get Updates on the Splunk Community!

Database Performance Sidebar Panel Now on APM Database Query Performance & Service ...

We’ve streamlined the troubleshooting experience for database-related service issues by adding a database ...

IM Landing Page Filter - Now Available

We’ve added the capability for you to filter across the summary details on the main Infrastructure Monitoring ...

Dynamic Links from Alerts to IM Navigators - New in Observability Cloud

Splunk continues to improve the troubleshooting experience in Observability Cloud with this latest enhancement ...