Splunk Search

Why are what I think are similar queries returning slightly different values?

Engager

I believe the following two queries are essentially the same but when I run them the former returns results with 10s more events per week then the other. Anyone have an idea on why the small variations?

index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats sum(count) AS UniqueUsersSeen by _time

index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time

For example running the following...

index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time | appendcols [search index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats sum(count) AS UniqueUsersSeen2 by _time]

Gets me...

_time        UniqueUsersSeen        UniqueUsersSeen2
9/5/2013        9        9
9/12/2013        39475        39501
9/19/2013        45320        45360
9/26/2013        15049        15061
10/3/2013        7573        7580
10/10/2013        4055        4056
10/17/2013        3346        3352
10/24/2013        3169        3170
10/31/2013        2685        2687
11/7/2013        3146        3147
11/14/2013        3734        3737
11/21/2013        3023        3026
11/28/2013        2365        2368
12/5/2013        2450        2454
12/12/2013        5617        5617
12/19/2013        11907        11912
12/26/2013        10782        10782
1/2/2014        8163        8165
1/9/2014        5078        5080
1/16/2014        3014        3016
1/23/2014        2400        2402
1/30/2014        1965        1967
2/6/2014        1732        1748
2/13/2014        1587        1588
2/20/2014        1499        1542
2/27/2014        1574        1691
3/6/2014        1502        1588
3/13/2014        1599        1735
3/20/2014        2034        2239
3/27/2014        1211        1283
Tags (3)
0 Karma
1 Solution

Legend

These two searches are not identical, because the first one looks at the count of events by userid and time whereas the other looks at the distinct count of users by time. If all users occurred in only one event each this would be the same, but if any user occurs in more than one event it won't - and it seems likely this is what is happening here that makes you see a difference.

View solution in original post

Revered Legend

First query gives you NoOfUserLogins (not UniqueUsersSeen)

0 Karma

Legend

These two searches are not identical, because the first one looks at the count of events by userid and time whereas the other looks at the distinct count of users by time. If all users occurred in only one event each this would be the same, but if any user occurs in more than one event it won't - and it seems likely this is what is happening here that makes you see a difference.

View solution in original post

Engager

Oops yeah the following would be equivalent...

index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats count by _time | stats sum(count) AS UniqueUsersSeen by _time

...and...

index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time

...knew it was obvious once I stepped back from it!

Thanks folks

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!