Splunk Search

Why is there a discrepancy when grouping by a variable vs searching?

psangli
Explorer

I want to see how many times a user has accessed a database in a given time period. I used

sourcetype= h1 | stats count by database, user | sort -count|stats list(count), values(database) by user

However, after running a search for specific users, the numbers that come up from the above line, do not match up with the number of search results I find by searching for specific users. Is there a reason for the discrepancy ?

Tags (1)
0 Karma
1 Solution

DalJeanis
Legend

A) I don't see anything obvious - you probably (?) don't have over 10K users / databases combinations, but there is a default maximum number of records returned from sort, so you should get in the habit of using sort 0 instead of plainsort.

B) Also, when trying to compare two versions, always snap to the beginning and end of a time period, so you have a fixed set of results between the two to compare on.

C) Always rename count, because occasionally it will bite you when a later stats-type command or timechart gets confused.


Try this and compare to see if it fixes your issue....

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user    
| sort 0 - countbydb
| stats sum(countbydb) as theTotalCount, list(countbydb) as theCounts, values(database) as theDBs by user
| where user=="myselecteduser"

earliest=-1d@d latest=@d sourcetype= h1 user="myselecteduser"
| stats count as theTotalCount, values(database) as theDBs by user

If not, then consider something like this

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user  
| eval DBandCount  = database." - ".countbydb  
| stats sum(countbydb) as theTotalCount, values(DBandCount) as theDBs by user

View solution in original post

0 Karma

DalJeanis
Legend

A) I don't see anything obvious - you probably (?) don't have over 10K users / databases combinations, but there is a default maximum number of records returned from sort, so you should get in the habit of using sort 0 instead of plainsort.

B) Also, when trying to compare two versions, always snap to the beginning and end of a time period, so you have a fixed set of results between the two to compare on.

C) Always rename count, because occasionally it will bite you when a later stats-type command or timechart gets confused.


Try this and compare to see if it fixes your issue....

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user    
| sort 0 - countbydb
| stats sum(countbydb) as theTotalCount, list(countbydb) as theCounts, values(database) as theDBs by user
| where user=="myselecteduser"

earliest=-1d@d latest=@d sourcetype= h1 user="myselecteduser"
| stats count as theTotalCount, values(database) as theDBs by user

If not, then consider something like this

earliest=-1d@d latest=@d sourcetype= h1 
| stats count as countbydb by database, user  
| eval DBandCount  = database." - ".countbydb  
| stats sum(countbydb) as theTotalCount, values(DBandCount) as theDBs by user
0 Karma

psangli
Explorer

I believe the
sort -count
part is wrong. It sorts the number of databases, but does not move the name along with it. The names stay in the same order. How would I go about fixing that?

0 Karma

DalJeanis
Legend

Instead of values(database) use list(database) to retain the record order.

psangli
Explorer

That works! Thanks!

somesoni2
Revered Legend

Use the by clause with sort.

0 Karma

psangli
Explorer
sourcetype="h1" | stats count as countbydb by database, user |sort by countbydb |stats list(countbydb), values(database) by user 

is what I am using. It does not affect the user column. I tried placing the sort by countbydb at the end, but that does not work either.

0 Karma

DalJeanis
Legend

Please give an example of the other search (with the username obfuscated)

0 Karma

psangli
Explorer

sourcetype="h1" user = "user1" database = bdp

In the search, bdp appears 10 times. But in the table, it only appears once.

0 Karma

DalJeanis
Legend

In the table, it should appear only once, with a count of 10.

0 Karma
Get Updates on the Splunk Community!

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk

Tuesday, May 14, 2024  |  11AM PT / 2PM ET Register to Attend Join us for this Tech Talk and learn how to ...

.conf24 | Personalize your .conf experience with Learning Paths!

Personalize your .conf24 Experience Learning paths allow you to level up your skill sets and dive deeper ...

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

WATCH NOWAs AI starts tackling low level alerts, it's more critical than ever to uplevel your threat hunting ...