Splunk Search

Proper use of join vs. append - correlating events by a common field

ft_kd02
Path Finder

Hi all,

I'm working to correlate a series of events. These events are all part of a logging process of a separate application. What seems to be common is a UUID. The following searches produce what I'd like individually:

for the first timestamp associated with the start of the process (there are multiple processes running for each UUID, but the ask is to extract the first / last timestamp)

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID

index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID

I'm trying to learn how to use join to connect these events by UUID. This SPL returns a table that has the earliest and latest of startTime, rather than the earliest(startTime) and latest(endTime). 

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID
| join UUID type=left
[ search index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID
]

Is join the appropriate function to use here? I'm reading in coalesce and append as well, but from my understanding append does not fit. Another piece is that the UUID is not a field extraction, but rather a regex, so I'm unsure how the join would be able to function if the subsearch has no knowledge of UUID until it runs and performs the rex. 

Labels (1)
0 Karma
1 Solution

somesoni2
Revered Legend

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

View solution in original post

somesoni2
Revered Legend

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

ft_kd02
Path Finder

Marking this as the solution because it is the correct answer to the question I asked. I misunderstood the problem and what I need to do is group the 4 events together for each process()-start and end. The 4 events, together, constitute a single api call. The issue I'm seeing is that there are no unique identifiers outside of _time() that can be used to group them together. Currently looking into the | transaction command as an alternative. 

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

.conf25 Global Broadcast: Don’t Miss a Moment

Hello Splunkers, .conf25 is only a click away.  Not able to make it to .conf25 in person? No worries, you can ...

Observe and Secure All Apps with Splunk

 Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...

What's New in Splunk Observability - August 2025

What's New We are excited to announce the latest enhancements to Splunk Observability Cloud as well as what is ...