Solved: Re: Comparing two data sets from the same timefram...

Tisiphone_1 · ‎06-18-2010

I am trying to compare the results of two searches that share a common timeframe and index, with a negation. The common resulting field may occur multiple times in both searches. As an example:

Search 1:

index="stuff" status="LoginSuccessful"

and Search 2 is:

index="stuff" status="LoginFailed"

If username is in an event in search 2 but NOT in an event in search 1, I want to see it. (So, show me all of the users that had a failed login and never had a successful login.)

The only way I have found to do this is to output one set of results to a csv and use lookup to search one. Even if this were two separate indices, I'm not clear on how you would negate the results of an entire join.

I figure there has to be a simpler way to do this...

gkanapathy · ‎06-19-2010

There are several easier ways to do this. An efficient way is:

 index=stuff status=loginsuccessful OR status=loginfailed | stats count(eval(status=="LoginSuccessful")) as c_success, count((eval(status=="LoginFailed"))) as c_failure by user | where c_failure > 0 AND c_success == 0

Similar (but somewhat less amenable to map-reduce treatment than the previous, if you have multiple indexer nodes) is:

 index=stuff status=loginsuccessful OR status=loginfailed | transaction user | where status=="LoginFailed" AND NOT status=="LoginSuccessful"

And the obvious way (which is just a concise version of what you did) is to use a subsearch:

 index=stuff status=loginfailed NOT [ search index=stuff status=loginsuccessful | dedup user | fields user ]

But this is relatively inefficient and is subject to limits on subsearch and format that must be increased in the system if you have a lot of users.

View solution in original post

gkanapathy · ‎06-19-2010

There are several easier ways to do this. An efficient way is:

 index=stuff status=loginsuccessful OR status=loginfailed | stats count(eval(status=="LoginSuccessful")) as c_success, count((eval(status=="LoginFailed"))) as c_failure by user | where c_failure > 0 AND c_success == 0

Similar (but somewhat less amenable to map-reduce treatment than the previous, if you have multiple indexer nodes) is:

 index=stuff status=loginsuccessful OR status=loginfailed | transaction user | where status=="LoginFailed" AND NOT status=="LoginSuccessful"

And the obvious way (which is just a concise version of what you did) is to use a subsearch:

 index=stuff status=loginfailed NOT [ search index=stuff status=loginsuccessful | dedup user | fields user ]

But this is relatively inefficient and is subject to limits on subsearch and format that must be increased in the system if you have a lot of users.

Tisiphone_1 · ‎06-19-2010

Thanks! I had my syntax mixed up on the last. Now it works!

Comparing two data sets from the same timeframe and index?

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Think Like an Architect: Introducing the Splunk Certified Cybersecurity Defense ...

Best Practices: Splunk auto adjust pipeline queue

Announcing Modern Navigation: A New Era of Splunk User Experience

Join the Conversation