Splunk Search

Find records which both do and do not match user search criteria

pbarna
Explorer

I have a dataset that looks like: (id, foo, bar, user) that I want to show results for on a dashboard.
Given an input combination of values for foo and bar, I want to know which ids both
    a) have at least one row that has BOTH of those values; and
    b) have at least one row that has NEITHER of those values
and then count the number of such ids by user.

For example, a search on (foo=A, bar=1) for the data

idfoobaruser
1234A1admin
1234B2admin
1234C3other_user
abcdA1admin
abcdA2admin

 

would count 1234, but not abcd, and return

userids
admin1
other_user1

 

Each search parameter can be a single value or a comma-separated list. Empty values are permitted in up to one field at a time.

This is the closest I have been able to get:

index="data" [
    | tstats count where index="data" AND foo IN (A) AND bar IN (1) by id
    | fields id

] AND NOT (foo IN (A) OR bar IN (1))
| fields id, user

| stats dc(id) as ids by user

I believe the query does what I want it to, but unfortunately am constrained by the hard limit of 10,500 results for subsearches. Is there a way to get the data I want without an intermediate command limiting my results?

Labels (2)
0 Karma
1 Solution

PickleRick
SplunkTrust
SplunkTrust

Ok. That's a bit clearer now. Still, the solution will be similar if I understood you correctly this time

index=whatever
| eval matchstate=case(in(foo,1,2,3) AND in(bar,4,5,6),1,NOT in(foo,1,2,3) AND NOT in(foo,4,5,6),2,1=1,0)
| stats values(matchstate) as matchstate values(user) as users by id
| where matchstate=1 AND matchstate=2
| fields - matchstate
| mvexpand user
| state count by user

 

View solution in original post

PickleRick
SplunkTrust
SplunkTrust
index=whatever
| eval matchstate=case(in(foo,1,2,3) AND in(bar,4,5,6),1,NOT in(foo,1,2,3) AND NOT in(foo,4,5,6),2,1=1,0)
| stats values(matchstate) as matchstate by id user
| where matchstate=1 AND matchstate=2
| state count by user
0 Karma

pbarna
Explorer

I probably edited my original post after this response, but doing 

| stats ... by id user

won't work.

I tried 

| eventstats values(matchstate) as matchstate by id

but similarly to the other proposed solution, the job autofinalizes for a relatively small amount of data and ultimately returns no results.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

What do you mean by "won't work"?

Also, stats and eventstats are not the same.

0 Karma

pbarna
Explorer

Correct me if I'm wrong - using multiple in fields in by counts by unique value tuples of all those fields right? 
So, given the sample table I posted:

| eval matchstate=case(in(foo,1,2,3) AND in(bar,4,5,6),1,NOT in(foo,1,2,3) AND NOT in(foo,4,5,6),2,1=1,0
| stats values(matchstate) as matchstate by id user


will give back

idfoobarusermatchstate
1234A1admin1, 2
1234C3other_user2
abcdA1admin

1, 0

 

Then 

| where matchstate=1 AND matchstate=2

will drop the row for other_user, when I want my results to include that row.

I mentioned eventstats because I would still need the user column after doing the stats values, but can't include it using by.

 

0 Karma

PickleRick
SplunkTrust
SplunkTrust

The other_user doesn't match your specs.

Let me quote you. For each user you wanted

"which ids both
    a) have at least one row that has BOTH of those values; and
    b) have at least one row that has NEITHER of those values"

 

0 Karma

pbarna
Explorer

Ah sorry if I was unclear, I know this problem is a little convoluted.

I want to find the ids that fulfil the conditions *regardless of user*, THEN count the number of such ids by user.

VS count the ids that fulfil the conditions for a single user.

For example, if you removed the second row of my test data, the result would be the same (returning both admin and other_user).

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Ok. That's a bit clearer now. Still, the solution will be similar if I understood you correctly this time

index=whatever
| eval matchstate=case(in(foo,1,2,3) AND in(bar,4,5,6),1,NOT in(foo,1,2,3) AND NOT in(foo,4,5,6),2,1=1,0)
| stats values(matchstate) as matchstate values(user) as users by id
| where matchstate=1 AND matchstate=2
| fields - matchstate
| mvexpand user
| state count by user

 

johnhuang
Motivator

Give this a shot:

 

index="data"
| eval match_ct=IF(foo="A", 1, 0)
| eval match_ct=IF(bar="1", match_ct+1, match_ct)
| search match_ct IN (0, 2)
| stats dc(id) AS ids by user

pbarna
Explorer

This is close, but I only want the ids that have rows where match_ct=0 AND match_ct=2.

A thought would be to do

| stats dc(match_ct) as matches by id
| search matches=2

But that loses user info. I updated the original post to show why 

| stats dc(match_ct) as matches by id, user 

also wouldn't work.

0 Karma

johnhuang
Motivator
| eventstats dc(match_ct) as dc_matches by id
| search dc_matches=2

 

0 Karma

pbarna
Explorer

I think this does what I'm looking for, but the job autofinalizes for even a relatively small amount of data.

0 Karma
Get Updates on the Splunk Community!

New in Observability - Improvements to Custom Metrics SLOs, Log Observer Connect & ...

The latest enhancements to the Splunk observability portfolio deliver improved SLO management accuracy, better ...

Improve Data Pipelines Using Splunk Data Management

  Register Now   This Tech Talk will explore the pipeline management offerings Edge Processor and Ingest ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

Register Join this Tech Talk to learn how unique features like Service Centric Views, Tag Spotlight, and ...