Splunk Search

User Query count based on conditions

Path Finder

I've below line in my logs:

[2013-01-15 20:06:51:641 GMT+00:00] INFO #new# userid=1234 chair_count=1 table_count=1 sofaAvailable=true
[2013-02-15 21:06:51:642 GMT+00:00] INFO userid=1234 chair_count=1 table_count=0 sofaAvailable=false
[2013-03-15 22:06:51:643 GMT+00:00] INFO #new# userid=3452 chair_count=1 table_count=1 sofaAvailable=true
[2013-04-15 23:06:51:644 GMT+00:00] INFO #new# userid=1234 chair_count=2 table_count=3 sofaAvailable=false
[2013-05-01 10:06:51:645 GMT+00:00] INFO #new# userid=3564 chair_count=1 table_count=2 sofaAvailable=true
[2013-05-05 11:06:51:646 GMT+00:00] INFO #new# userid=2443 chair_count=1 table_count=1 sofaAvailable=true
[2013-05-07 12:06:51:647 GMT+00:00] INFO #new# userid=2265 chair_count=1 table_count=1 sofaAvailable=false
[2013-05-01 10:06:51:645 GMT+00:00] INFO #new# userid=3564 chair_count=1 table_count=0 sofaAvailable=true

Can any one confirm if below query is correct to get the count of all users who are having at least 1 chair & who are new users (by looking at tag #new#) & whose sofaAvailable is true?

Note that if there are multiple entries for the same user in the log, then I need to take the chair count & table count of the latest entry only.

#new# chair_count > 0 | stats count

Here, the output should be 3

Also, can you confirm if below query is correct to get count of all users who are having at least 1 chair & who have at least 1 table & who are new users (by looking at tag #new#) & whose sofaAvailable is true?

#new# chair_count > 0 table_count > 0 | stats count

Here, the output should be 2

Thanks!

Tags (3)
0 Karma
1 Solution

Builder

What about this search: #new# chaircount > 0 tablecount > 0 sofaAvailable=true |stats latest(chaircount) as chaircount latest(tablecount) as tablecount latest(sofaAvailable) as sofaAvailable latest(_raw) by userid |stats count

I think this will be fast as we are getting all results in first search not in second(after |)and also search result will be less as #new# chaircount > 0 tablecount > 0 sofaAvailable=true and specfic than only #new#, but please give suggestions then it will be good....

View solution in original post

0 Karma

Path Finder

Quick question: The 2nd part which is

...| stats latest(chair_count) as chair_count latest(table_count) as table_count latest(sofaAvailable) as sofaAvailable latest(_raw) by userid

here, why u need to stats all 3 together? Why can't we just do as shown below?

...| stats latest(chair_count) by userid | stats count

Builder

try this also

...| dedup userid | stats count

0 Karma

Builder

What about this search: #new# chaircount > 0 tablecount > 0 sofaAvailable=true |stats latest(chaircount) as chaircount latest(tablecount) as tablecount latest(sofaAvailable) as sofaAvailable latest(_raw) by userid |stats count

I think this will be fast as we are getting all results in first search not in second(after |)and also search result will be less as #new# chaircount > 0 tablecount > 0 sofaAvailable=true and specfic than only #new#, but please give suggestions then it will be good....

View solution in original post

0 Karma

Path Finder

Can any one answer my question?

0 Karma

Path Finder

Quick question: The 2nd part which is "...| stats latest(chaircount) as chaircount latest(tablecount) as tablecount latest(sofaAvailable) as sofaAvailable latest(raw) by userid", here, why u need to stats all 3 together? Why can't we just do "...| stats latest(chaircount) by userid | stats count"?

0 Karma

Motivator

For your first search, you need to search based on the latest information for each user.
You can adjust search terms order for optimization, but here is a sample search.

#new# | stats latest(chair_count) as chair_count latest(table_count) as table_count latest(sofaAvailable) as sofaAvailable latest(_raw) by userid | where chair_count>0 and sofaAvailable="true"

For your 2nd search,

#new# | stats latest(chair_count) as chair_count latest(table_count) as table_count latest(sofaAvailable) as sofaAvailable latest(_raw) by userid | where chair_count>0 and table_count>0 and sofaAvailable="true"

By adding "| stats" count after each earch, you can get the number of users with your criteria.

For example,

$ /opt/splunk/bin/splunk search '#new# | stats latest(chair_count) as chair_count latest(table_count) as table_count latest(sofaAvailable) as sofaAvailable latest(_raw) by userid | where chair_count>0 and sofaAvailable="true" | stats count'

count
-----
    3

Hope this helps.

0 Karma

Builder

yes same concept I have as all events are filterd out in begining in my search 🙂

0 Karma

Motivator

kml_uvce's search is faster as all unnecessary events are filtered out at the beginning. I was trying to show basic idea step by step in the search 🙂 also you can customize my search to do similar analysis.

0 Karma

Path Finder

Can any Splunk Expert comment about the performance among these 3 different answers & tell us which one is better?

0 Karma

Path Finder

Thanks for the answer. I would like to know if below query is also an alternate correct answer or not for the 2nd question:

#new# chair_count > 0 table_count > 0 sofaAvailable=true | stats first(table_count) as tcount, first(sofaAvailable) as sofa first(chair_count) as ccount by userid | where sofa = "true" and tcount > 0 and ccount > 0 | stats count

If above is also correct, then which one is best in terms of performance?

0 Karma