I am seraching as below but my join operation is not bringing results from the join for only couple of imei/records. I have 100 different imei number but only 10 of them are not returning any results.
index="etl_pipeline_data" environment=prd source=meta_data origin IN (device_properties, gsm_info,backend_transaction)
|fields _time,tenant,origin,imei,timestamp, timestamp_device,tz_offset_cest
|eval ts_device_epoch=strptime(timestamp_device,"%Y-%m-%dT%H:%M:%S.%3N")| eval ts_device=ts_device_epoch+tz_offset_cest |eval eventdate=strftime(ts_device,"%Y-%m-%d")
|stats latest by tenant,origin,imei,ts_device
|rename latest(*) as *
|stats values(*) by tenant, imei, eventdate
|join type=left imei [|loadjob savedsearch="xx:yyy:DEVICE_TRAINPASS_Report_db" ]
| where imei = 352369082111082
| table imei, eventdate,train2s
As a proof of record in second serach
I have tried to check the data type, there is no issues with that.
I also tried below method instead of join but it's not returning any records as well.
index="etl_pipeline_data" environment=prd source=meta_data origin IN (device_properties, gsm_info,backend_transaction)
|fields _time,tenant,origin,imei,timestamp, timestamp_device,tz_offset_cest
|eval ts_device_epoch=strptime(timestamp_device,"%Y-%m-%dT%H:%M:%S.%3N")| eval ts_device=ts_device_epoch+tz_offset_cest |eval eventdate=strftime(ts_device,"%Y-%m-%d")
|stats latest by tenant,origin,imei,ts_device
|rename latest(*) as *
|stats values(*) by tenant, imei, eventdate
|table imei,eventdate
|append [|loadjob savedsearch="xx:yyyy:DEVICE_TRAINPASS_Report_db" | fields imei,eventdate,trains ]
| where imei = 352369082111082
is there any limitation in Splunk ? Could you please help me to achive this merge operation ?
Hi sercankarvar
The join command has a 50k rows limit in the right-side dataset can be joined with the left-side dataset.
from the docs:
https://docs.splunk.com/Documentation/Splunk/8.2.5/SearchReference/Join
Depending on the 2nd search you are doing (the one you're doing the load job) you could try to use just stats. Can you post this search as well? There are plenty of examples where stats can be use instead of join.
Check this conf presentation from Nick, a Splunk Trust member, addressing this topic:
.conf19 https://conf.splunk.com/files/2019/slides/FNC2751.pdf
.conf20 https://conf.splunk.com/files/2020/slides/TRU1761C.pdf
Sorry but in my case I need use loadjob, how I can combine my query with OR and stats?
I have tried with stats as below but it didn't worked either.
index="etl_pipeline_data" environment=prd source=meta_data origin IN (device_properties, gsm_info,backend_transaction)
|fields _time,tenant,origin,imei,timestamp, timestamp_device,tz_offset_cest
|eval ts_device_epoch=strptime(timestamp_device,"%Y-%m-%dT%H:%M:%S.%3N")| eval ts_device=ts_device_epoch+tz_offset_cest |eval eventdate=strftime(ts_device,"%Y-%m-%d")
|stats latest by tenant,origin,imei,ts_device
|rename latest(*) as *
|stats values(*) by tenant, imei, eventdate
|table imei,eventdate
|append [|loadjob savedsearch="offboarded:konux_devices_and_features:DEVICE_TRAINPASS_Report_db" | fields imei,eventdate,trains | stats sum(trains) by imei eventdate ]
| where imei = 352369082111082
The stats options would be to use instead of the append or join. Both have limits that depending on you data and the expected number of results might not be enough for what you're trying to do. Is the load job a dbconnect query?
Yes it's bringing results from database connection.