Splunk Search

How do I combine two searches from 2 sourcetypes, each with a requestID field, without using join?

Cuyose
Builder

I'm still going through the myriad of answers relating to this, but as of yet, have not found my answer. I am doing something that I am sure almost everyone does. I have 2 searches from 2 sourcetypes, let's call them A and B. Each have a requestID field to join on. In the A type, I have nothing I want to report on, but that is where I want to put my conditions. B is where the reportable fields are. How can I avoid using the join, which is super slow?

index=app 
source=sourceA
host=hostA
sourcetype=A 
timeoutSvc=*
|rename requestId AS RequestId
|join RequestId [search  
index=app
host=hostB
source=sourceB
sourcetype=B
RequestId=*
]
|stats values(AgentUserName) 

sideview
SplunkTrust
SplunkTrust

It sounds like you want to use a subsearch to use your criteria on the B side to whittle down the total RequestId values being received. After that you want to just use stats to group things by the RequestId. Any other fields you need like AgentUserName, throw it in the stats (here I've just added it as an extra group by field because that generally makes the most sense for categorical fields. The final stats with the values(foo) clause might not be exactly what you want ultimately but it matches your question.

index=app host=hostB source=sourceB sourcetype=B
[  search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=*
  | rename requestId AS RequestId
  | table RequestId
]
| stats count by AgentUserName RequestId 
| stats values(AgentUserName) 

Now if you want to carry something through to the end from the A side, like the timeoutSvc values, that's pretty simple. Here we still use the A criteria in the subsearch to whittle down the RequestId's, but then we feed this into a disjunction amounting to A OR B.

index=app ( host=hostB source=sourceB sourcetype=B ) OR ( source=sourceA host=hostA sourcetype=A timeoutSvc=* )
[  search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=*
  | rename requestId AS RequestId
  | table RequestId
]
| stats count by AgentUserName RequestId timeoutSvc
| stats values(timeoutSvc) values(AgentUserName) 
0 Karma

woodcock
Esteemed Legend

You are not describing a join but rather using the results of one search to qualify another search. You do this with a subsearch that expands the search parameters for an outer search like this:

index=app host=hostB source=sourceB sourcetype=B [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId | fields RequestId ] | stats values(AgentUserName) 
0 Karma

woodcock
Esteemed Legend

Based on this clarification:

I need to get the requestID from a narrow conditioned search"B", then use that requestID to pull values from events from another search"A" where those values in A don't exist in B, and report as if I have access to all fields from both searches.

This should do it:

 index=app host=hostB source=sourceB sourcetype=B [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId | fields RequestId ] | stats values(AgentUserName)  | append [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId ] | stats values(*) AS * BY RequestId | stats values(AgentUserName) by timeoutSvc
0 Karma

Cuyose
Builder

What if I wanted to include the timeoutSvc from the "A" search along with the unique fields from the "A" search? anytime I try and do that it find no results.

like | stats values(AgentUserName) by timeoutSvc

0 Karma

woodcock
Esteemed Legend

In that case, you need a totally different kind of search so you need to ask a different question (and it is a join function that you desire). The most efficient way to do a join is to not do a join. You can do the join with stats values(*) AS * and then a different analysis with another stats like this:

index=app (host=hostB source=sourceB sourcetype=B) OR (source=sourceA host=hostA sourcetype=A) | eval normalizedRID=coalesce(requestId, RequestId) | stats values(*) AS * by normalizedRID | stats values(AgentUserName) by timeoutSvc
0 Karma

Cuyose
Builder

Unfortunately this will not work as both searches have a requestID, so the coalesce will always evaluate to a requestId making a search that should only return 10 results from the conditions placed on the "B" search, return all results from both searches.

Basically I want to :

Get the requestID from a narrow conditioned search"B", then use that requestID to pull values from events from another search"A" where those values in A don't exist in B, and report as if I have access to all fields from both searches.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...