Splunk Search

How to achieve a query for two events based on intersection of common value - without using join?

lmonahan
Path Finder

I want to run a query where:

1. Query1 returns resultset1containing myEvent1.uid

2. Query2 returns resultset2 containing myEvent2.uid which is a subset of the myEvent1uid values.

3. Filter myEvent1 events and discard any that don't have a matching myEvent2.uid.

This can be done easily with an inner join but the result2 dataset is larger than 50k so I cannot use a join.

What I want is to do an inner join without using join😀  (I'm also practicing not using join, in general, but I really can't use join in this case.)

Saw some other posts that use join and other tricks and tried different solutions with coalesce() and also creating a new fields but haven't figured out a way that worked.

Thanks in advance!

 

Labels (1)
0 Karma
1 Solution

PickleRick
SplunkTrust
SplunkTrust

The typical approach would be to get your events to contain the common field - let's call it "uid" and do stats values(*) as * by uid.

The trick is to perform field shuffling (rename, conditional evals) to make fields from different types of events not overlap each other.

The easiest way is to do

search1
| append [ search2 ]
| stats values(*) as * by common_field

But this is ugly since it uses a subsearch and append.

More sophisticated (but tricky solution) is to use a single search to find both result sets and conditionally "remodel" the fields.

(Conditions1) OR (conditions2)
| eval field1=if(...)
| ...
| stats values(*) as * by common_field

 

View solution in original post

PickleRick
SplunkTrust
SplunkTrust

The typical approach would be to get your events to contain the common field - let's call it "uid" and do stats values(*) as * by uid.

The trick is to perform field shuffling (rename, conditional evals) to make fields from different types of events not overlap each other.

The easiest way is to do

search1
| append [ search2 ]
| stats values(*) as * by common_field

But this is ugly since it uses a subsearch and append.

More sophisticated (but tricky solution) is to use a single search to find both result sets and conditionally "remodel" the fields.

(Conditions1) OR (conditions2)
| eval field1=if(...)
| ...
| stats values(*) as * by common_field

 

lmonahan
Path Finder

Thank you, you gave me a great brainstorm!

Using your advice I set out to create some new fields and to manipulate them. 

It made me notice that for Query1 I was inadvertently fetching multiple occurrences of events matching myEvent1.common_uid_field value rather than one representative event that is sufficient. I needed to constrain Query1 further.

By limiting Query1 to return only one event per common_uid_field I could now use one query for both event1 or event2 and count by common_uid_field. And only when count=2 does it mean the value is in both events.

My query is now of the form:

index=myindex (event1) OR (event2)
| stats values(field_in_event1) AS ImportantField count by common_uid_field
| where count=2
| stats count BY ImportantField
0 Karma
Get Updates on the Splunk Community!

Splunk Enterprise Security 8.0.2 Availability: On cloud and On-premise!

A few months ago, we released Splunk Enterprise Security 8.0 for our cloud customers. Today, we are excited to ...

Logs to Metrics

Logs and Metrics Logs are generally unstructured text or structured events emitted by applications and written ...

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...