How do you get join functionality without using subsearch?

Path Finder

Hey Splunkers,

Here is my original query, where the subsearch is getting truncated at 50000 records.

index=abc sourcetype=abc_errors
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| fields receiverId
| join receiverId [search index=abc sourcetype=abc_temp | fields receiverId billingId]
| table receiverId billingId

I am trying to write a stats command for it so that I don't have to use join. Here is what I thought might work, but it doesn't.

index = abc (sourcetype=abc_errors OR sourcetype=abc_temp)
  | fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
  | rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId 
  | dedup receiverId sourcetype
  | stats count AS total by receiverId
  | where total>1
  | table receiverId

Can someone tell me what I might be doing wrong? I know there is something funky about the dedup, but I can't think of anything else right now.

Thanks,
Divyank

Re: How do you get join functionality without using subsearch?

Super Champion

Please try something like the following:

index=abc sourcetype=abc_temp [search index=abc sourcetype=abc_errors | rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId | stats count by receiverId | fields receiverId]
| fields receiverId billingId

Re: How do you get join functionality without using subsearch?

Path Finder

Doesn't this approach still use a subsearch? That is the problem we are facing: the subsearch is limited to 50000 rows.

Re: How do you get join functionality without using subsearch?

Super Champion

Please note that I'm doing a stats count by receiverId within the subsearch, so it returns one row per unique receiverId. Do you still expect the number of unique receiverId values to be greater than 50k?

Re: How do you get join functionality without using subsearch?

Path Finder

Yeah, it would be closer to a million.

Re: How do you get join functionality without using subsearch?

SplunkTrust

Try this:

index=abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| eval receiver_id = coalesce(receiverId, 'device.headwaters.watermark.core.DeviceInfo.receiverId.string')
| stats count as total by receiver_id
| where total>1
| table receiver_id

Hope it helps.

Re: How do you get join functionality without using subsearch?

Path Finder

This did not work; it didn't return any results. Also, aren't we failing to compare the receiverId across both sourcetypes? For example, if one sourcetype has more than one event with a given receiverId, that receiverId would still show up in the results. We want only the receiverId values common to both sourcetypes to show.
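
One way to express "only the receiverId values present in both sourcetypes" without a subsearch is to count distinct sourcetypes per receiverId instead of counting events. The following is a minimal sketch, not a tested answer. It assumes that abc_temp events carry receiverId and billingId directly and that abc_errors events carry only the dotted field, as in the original join query. The name sourcetype_count is made up for illustration.

```spl
index=abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| eval receiverId = coalesce(receiverId, 'device.headwaters.watermark.core.DeviceInfo.receiverId.string')
| stats dc(sourcetype) AS sourcetype_count values(billingId) AS billingId BY receiverId
| where sourcetype_count = 2
| table receiverId billingId
```

The single quotes matter inside eval: without them, the dots in the field name are read as concatenation operators. Because dc(sourcetype) counts distinct sourcetypes rather than events, repeated events from a single sourcetype cannot push a receiverId past the threshold. Note that values(billingId) becomes multivalued if a receiverId maps to more than one billingId.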

Re: How do you get join functionality without using subsearch?

Influencer

index=abc (sourcetype=abc_temp OR sourcetype=abc_errors)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| dedup sourcetype receiverId
| stats count(eval(sourcetype="abc_temp")) as temp, count(eval(sourcetype="abc_errors")) as errors by receiverId
| where temp=errors

Re: How do you get join functionality without using subsearch?

Path Finder

This did not work. It seems that neither the where nor the dedup command is working.

Re: How do you get join functionality without using subsearch?

Influencer

Try sorting on sourcetype and receiverId before the dedup. What output are you getting from the search above? You can test it by removing the where clause and checking the values of temp and errors for each receiverId.
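
The check described here could look like the following sketch: the earlier search with the where clause dropped, so that temp and errors are visible for each receiverId. The head 20 at the end is an addition to keep the output short and is not part of the original search.

```spl
index=abc (sourcetype=abc_temp OR sourcetype=abc_errors)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| dedup sourcetype receiverId
| stats count(eval(sourcetype="abc_temp")) AS temp count(eval(sourcetype="abc_errors")) AS errors BY receiverId
| head 20
```

If one of the two columns is 0 for every receiverId, it is worth checking whether receiverId is actually populated for that sourcetype after the rename.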
