Splunk Search

Using stats to join data instead of transaction

Kupo
Engager

I have two sources that have a common field (user) and am currently using transaction to join the user_a with the source_b_field. This query works fine.

 

index=index_a (sourcetype=source_a OR sourcetype=source_b)
| transaction startswith="string_start" endswith="string_end" maxspan=1s maxevents=2
| where (user_a = user_b)
| stats count by user_a, source_b_field

 

I figured it would be easy enough to use stats instead to increase execution efficiency  but I can't seem to get it quite right. The issue is that I need a left/inner join instead of a full join as I am only looking for users from source_a.

Here is the stats query that essentially just returns data from source_b as source_a is a subset of source_b.

 

index=index_a (sourcetype=source_a OR sourcetype=source_b)
| user_a=if(sourcetype=="source_b",user_b,user_a)
| stats count by user_a, source_b_field

 

Is there a way to join user_a with source_b_field via stats? I feel that I am missing something obvious.

Labels (2)
0 Karma
1 Solution

isoutamo
SplunkTrust
SplunkTrust

Kupo
Engager

This is amazing, I have been searching for something like this for awhile. I really appreciate it.

Here is the solution that worked for my use case

index=index_a (sourcetype=source_a OR sourcetype=source_b)
| user_a=if(sourcetype=="source_b",user_b,user_a)
| stats dc(sourcetype) as sourcetypeCount values(source_b_field) by user_a
| where sourcetypeCount = 2
0 Karma
Get Updates on the Splunk Community!

Automatic Discovery Part 1: What is Automatic Discovery in Splunk Observability Cloud ...

If you’ve ever deployed a new database cluster, spun up a caching layer, or added a load balancer, you know it ...

Real-Time Fraud Detection: How Splunk Dashboards Protect Financial Institutions

Financial fraud isn't slowing down. If anything, it's getting more sophisticated. Account takeovers, credit ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

 Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...