Splunk Search

Depth of nested join and sub search

gregb
Explorer

I have an odd problem related to nested joins on Splunk 4.3.2. I am attempting to put together a report on latency across all components in an event-based architecture.

An event enters ComponentA -> ComponentB -> ComponentC -> ComponentD -> ... -> ComponentZ -> generate notification event (all with different correlationIds across each hand off).

I am running into a situation where the third nested join/subsearch fails in the complex query, while exactly the same query minus the innermost nested "[search ... ]", which reports on ComponentB -> ComponentC, works as expected.

Anyone have any idea?

The initial chain:
index=myindex eventtype="compA_start" compA_entry_name=bookTrade
| dedup request_id
| join request_id [search index=myindex eventtype=compA_end
| eval end_time=_time ]
| fields end_time, request_id
| eval compA_total_time = end_time-_time
| join request_id [search index=myindex eventtype=compB

| rename req_id as request_id
| dedup request_id
| eval compB_start_time = _time - all
| rename trannum_e as compC_tran_num
| rename all as compB_total_time
| rename source as compB_source
| join outer compC_tran_num [search index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId ]]

This works in isolation:

index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId
| join outer evtId [search index=myindex eventtype=compD_notification
| eval compD_key_time = _time ]

Adding that join as the third nested subsearch in the full query fails:

index=myindex eventtype="compA_start" compA_entry_name=bookTrade
| dedup request_id
| join request_id [search index=myindex eventtype=compA_end
| eval end_time=_time ]
| fields end_time, request_id
| eval compA_total_time = end_time-_time
| join request_id [search index=myindex eventtype=compB

| rename req_id as request_id
| dedup request_id
| eval compB_start_time = _time - all
| rename trannum_e as compC_tran_num
| rename all as compB_total_time
| rename source as compB_source
| join outer compC_tran_num [search index=myindex sourcetype=compC
| eval compC_time = _time | rename compC_sequence_number as evtId
| join outer evtId [search index=myindex eventtype=compD_notification
| eval compD_key_time = _time ]
]]

I have also validated that the evtId is populated in the reports for each.

cphair
Builder

You need to use join type=outer, not join outer. I would guess the second way looks for a field called outer, doesn't find it, and falls back to an inner join on compC_tran_num. I just tried both syntax versions on my own data, and they do return different results.
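
As a rough sketch of that syntax change, using the field names from the question, the two innermost joins might be written as below. Only the "join outer" spelling is changed to "join type=outer" (type=left is a synonym); everything else is copied from the original query, and it has not been run against the poster's data.

```
| join type=outer compC_tran_num [search index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId
| join type=outer evtId [search index=myindex eventtype=compD_notification
| eval compD_key_time = _time ]
]
```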

That said, three nested subsearches is going to be pretty slow. Have you looked into using variants on "transaction request_id"?

gregb
Explorer

Let me try type=outer, but I think I already tried with just "join". The problem with transaction is that it requires a unifying correlation id across all the events, and there isn't one. The chain needs to be stitched together from one log file to the next via differing correlation ids.
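
One possible way to do that stitching without deep nesting is to chain the joins at the top level, since each join brings in the key the next one needs (request_id, then compC_tran_num, then evtId). The following is only a sketch under assumptions: it reuses the field names from the original query, assumes end_time is meant to come from _time, uses type=outer for the last two hops, and is untested on 4.3.2.

```
index=myindex eventtype="compA_start" compA_entry_name=bookTrade
| dedup request_id
| join request_id [search index=myindex eventtype=compA_end
    | eval end_time=_time ]
| eval compA_total_time = end_time-_time
| join request_id [search index=myindex eventtype=compB
    | rename req_id as request_id
    | dedup request_id
    | eval compB_start_time = _time - all
    | rename trannum_e as compC_tran_num
    | rename all as compB_total_time
    | rename source as compB_source ]
| join type=outer compC_tran_num [search index=myindex sourcetype=compC
    | eval compC_time = _time
    | rename compC_sequence_number as evtId ]
| join type=outer evtId [search index=myindex eventtype=compD_notification
    | eval compD_key_time = _time ]
```

Each subsearch here is only one level deep, so any single hop can be removed or run on its own to check where rows stop matching.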
