Splunk Search

Proper use of join vs. append - correlating events by a common field

ft_kd02
Path Finder

Hi all,

I'm working to correlate a series of events. These events are all part of a logging process of a separate application. What seems to be common is a UUID. The following searches produce what I'd like individually:

for the first timestamp associated with the start of the process (there are multiple processes running for each UUID, but the ask is to extract the first / last timestamp)

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID

index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID

I'm trying to learn how to use join to connect these events by UUID. This SPL returns a table that has the earliest and latest of startTime, rather than the earliest(startTime) and latest(endTime). 

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID
| join UUID type=left
[ search index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID
]

Is join the appropriate function to use here? I'm reading in coalesce and append as well, but from my understanding append does not fit. Another piece is that the UUID is not a field extraction, but rather a regex, so I'm unsure how the join would be able to function if the subsearch has no knowledge of UUID until it runs and performs the rex. 

Labels (1)
0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

ft_kd02
Path Finder

Marking this as the solution because it is the correct answer to the question I asked. I misunderstood the problem and what I need to do is group the 4 events together for each process()-start and end. The 4 events, together, constitute a single api call. The issue I'm seeing is that there are no unique identifiers outside of _time() that can be used to group them together. Currently looking into the | transaction command as an alternative. 

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...