Splunk Search

Proper use of join vs. append - correlating events by a common field

ft_kd02
Path Finder

Hi all,

I'm working to correlate a series of events. These events are all part of a logging process of a separate application. What seems to be common is a UUID. The following searches produce what I'd like individually:

for the first timestamp associated with the start of the process (there are multiple processes running for each UUID, but the ask is to extract the first / last timestamp)

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID

index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID

I'm trying to learn how to use join to connect these events by UUID. This SPL returns a table that has the earliest and latest of startTime, rather than the earliest(startTime) and latest(endTime). 

index=###### sourcetype=### "process() - start"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats earliest(startTime) as startingTimeStamp by UUID
| join UUID type=left
[ search index=###### sourcetype=### "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval endTime = strftime(_time, "%Y-%d-%m %H:%M:%S")
| stats latest(endTime) as endingTimeStamp by UUID
]

Is join the appropriate function to use here? I'm reading in coalesce and append as well, but from my understanding append does not fit. Another piece is that the UUID is not a field extraction, but rather a regex, so I'm unsure how the join would be able to function if the subsearch has no knowledge of UUID until it runs and performs the rex. 

Labels (1)
0 Karma
1 Solution

somesoni2
Revered Legend

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

View solution in original post

somesoni2
Revered Legend

You can avoid join altogether for this use case:

index=###### sourcetype=### "process() - start" OR "process() - end"
| rex "^(?:[^ \n]* ){2}(?P<UUID>[^,]+)"
| eval startTime = if(searchmatch("process() - start"),_time,null())
| eval endTime = if(searchmatch("process() - end"),_time,null())
| stats min(startTime) as startingTimeStamp max(endTime) as endingTimestamp by UUID | eval startingTimeStamp=strftime(startingTimeStamp, "%Y-%d-%m %H:%M:%S")|  eval endingTimestamp=strftime(endingTimestamp, "%Y-%d-%m %H:%M:%S")

ft_kd02
Path Finder

Marking this as the solution because it is the correct answer to the question I asked. I misunderstood the problem and what I need to do is group the 4 events together for each process()-start and end. The 4 events, together, constitute a single api call. The issue I'm seeing is that there are no unique identifiers outside of _time() that can be used to group them together. Currently looking into the | transaction command as an alternative. 

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...