Hey Splunkers,
Here is my original query, where the subsearch is getting truncated to 50000 records.
index = abc sourcetype=abc_errors
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| fields receiverId
| join receiverId[search index=abc sourcetype=abc_temp|fields receiverId billingId]
| table receiverId billingId
I am trying to write a stats command for it so that I don't have to use join. Here is what I thought might work but doesn't.
index = abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| dedup receiverId sourcetype
| stats count AS total by receiverId
| where total>1
| table receiverId
Can someone tell me what I might be doing wrong? I know there is something funky about the dedup, but I can't think of anything else right now.
Thanks,
Divyank
I figured out a way to do it using the coalesce idea from @adonio. Thank you for that. Here is the solution query:
index = abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS Receiver
| eval receiver_id = coalesce(Receiver, receiverId )
| dedup receiver_id sourcetype
| stats count(sourcetype) AS total BY receiver_id
| where total>1
| stats count(receiver_id) AS match
Thank you everyone for your input
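For anyone who also needs the billingId back, as the original join returned, here is a possible variation. It is only a sketch, and it assumes that abc_temp events carry receiverId and billingId as top-level fields while abc_errors events carry the long device.headwaters... field. It replaces dedup plus count with dc(sourcetype), so a receiver_id that appears many times in just one sourcetype is not counted as a match. The alias sourcetypes is a made-up name, and the single quotes are needed because eval treats an unquoted dot as an operator.

```
index=abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| eval receiver_id = coalesce(receiverId, 'device.headwaters.watermark.core.DeviceInfo.receiverId.string')
| stats dc(sourcetype) AS sourcetypes values(billingId) AS billingId BY receiver_id
| where sourcetypes=2
| table receiver_id billingId
```

Appending | stats count AS match instead of the final table should give the same kind of match count as the solution query above.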
@DalJeanis I am sorry for the direct tag, but you answered one of these questions for me perfectly, so I wanted to see if you can help me again.
index=abc (sourcetype=abc_temp OR sourcetype=abc_errors)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId
| dedup sourcetype receiverId
| stats count(eval(sourcetype="abc_temp")) as temp, count(eval(sourcetype="abc_errors")) as errors by receiverId
| where temp=errors
This did not work. It seems like neither the where nor the dedup command is working.
Try doing a sort on sourcetype receiverId before the dedup. What output are you getting from the search above? You can test it by removing the where clause and checking the values of temp and errors for each receiverId.
try this:
index = abc (sourcetype=abc_errors OR sourcetype=abc_temp)
| fields sourcetype receiverId billingId device.headwaters.watermark.core.DeviceInfo.receiverId.string
| eval receiver_id = coalesce(receiverId, 'device.headwaters.watermark.core.DeviceInfo.receiverId.string')
| stats count as total by receiver_id
| where total>1
| table receiver_id
hope it helps
This did not work; it wouldn't give any results. Also, aren't we skipping the comparison of receiverId between the two sourcetypes? For example, if one sourcetype has more than one event for a receiverId, would it still show up in the results? We want only the receiverId values common to both sourcetypes to show.
Please try like below
index=abc sourcetype=abc_temp [search index=abc sourcetype=abc_errors | rename device.headwaters.watermark.core.DeviceInfo.receiverId.string AS receiverId | stats count by receiverId | fields receiverId]
|fields receiverId billingId
This approach is using a subsearch? That is the problem we are facing: the subsearch is limited to 50000 rows.
Please note, I'm doing a stats count by receiverId within the second search. Do you still expect the number of unique receiverId values to be greater than 50k?
Yeah it would be closer to a million.