Splunk Search

How do I join two searches with common field?

ellstream44
Explorer

I have one search that checks for entries with duration >= 50000 (responses for requests)

source="abc.log" | regex "\"duration\" : ([5-9][0-9]{4}|[0-9]{6,})"

The search returns results with JSON format:

   ...
  "duration" : 60026,
  "correlationId" : "be225a47972b95f5",
   ... 

I want to show the connected request for that response in the same result set. So I would like to find results where the correlation id matches.

Something like:

(source="abc.log" | regex "\"duration\" : ([5-9][0-9]{4}|[0-9]{6,})" | eval correlationId)

OR

(source="abc.log" | correlationId)

The second part of the "OR" statement would return the requests that ended up with 50s+ response time.
Could anyone help me with the syntax to achieve this?

Thanks!

Labels (3)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

This may work for you. It uses rex to extract fields from the events rather regex, which just filters events. The where command does the filtering.

source="abc.log" 
| rex "\"duration\" : (?<duration>\d+)"
| rex "\"correlationId\" : \"(<correlationId>[^\"]+)"
| where duration > 50000
| table correlationId duration
---
If this reply helps you, an upvote would be appreciated.

ellstream44
Explorer

Thanks for the response!
The rex for duration and where statement worked great, however for the correlationId part, i get error:

"Error in 'rex' command: The regex '"correlation" : "([^"]+)' does not extract anything. It should specify at least one named group. Format: (?...)."

I thought maybe ? was missing, so I tried this:
| rex "\"correlationId\" : \"(?[^\"]+)"

And I got hit for both correlation and duration for example:
correlation duration
c046ed74a4fb38a1 360615
b0689d6af87fb231 541538

However, I would like to see all the entries with a certain correlationId, so I can find out which request resulted in a response with duration 50s+, so i removed the "table" part, and clicked "Event"

So with this query:
source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| where duration > 50000

I found entries, but only the "response" parts. When I added c046ed74a4fb38a1 to the query above, I found only one entry with that correlationId, but If i searched like this:

85a54844766753b0
source="abc.log"

I found the two searches, both the request and the response.

My goal is to have a search where I would have a list of entries, where each entry contains both the request AND the response for when the response was 50+seconds 🙂

0 Karma

richgalloway
SplunkTrust
SplunkTrust

My apologies for the typo. I'm glad you figured it out.

I'm not following what you want to do. What does "85a54844766753b0" stand for? What is the difference between a request and a response?

---
If this reply helps you, an upvote would be appreciated.

ellstream44
Explorer

Sorry for being unclear, an example request with response (entries which i can find with my searches):
85a54844766753b0 is a correlationId

Request entry:
2020-06-02 19:06:27,463 INFO com.Logger {
"origin" : "remote",
"type" : "request",
"correlationId" : "85a54844766753b0",
"protocol" : "HTTP/1.1",
"method" : "PUT",
"uri" : "http://myuri.net:4949/orders/123456/status",
"body" : {"status":"AUTO_CANCELED"}
}

Response entry:
2020-06-02 19:15:28,808 INFO com.Logger {
"origin" : "local",
"type" : "response",
"correlationId" : "85a54844766753b0",
"duration" : 541344,
"protocol" : "HTTP/1.1",
"status" : 204
}

So this is an example where a request with body {"status":"AUTO_CANCELED"} resulted in a response with over 50s response time. This is the information I need to identify which requests are taking a long time to finish since I have some load issues

So now with this query:
source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| where duration > 50000

I can find entries in the event which have duration > 50000. For each entry in the event list, I would also like to see the connected request (not only response) so that I can figure out which request was sent in this case 🙂

What I currently do is that I have to copy the correlationId, make a new search for
source="abc.log"
\"correlationId\" : \"85a54844766753b0\"

To be able to see the actual request that resulted in a high duration for the response 🙂

0 Karma

richgalloway
SplunkTrust
SplunkTrust

I think I understand now. We need to match up events by correlationId. There are a few ways to do that, but the best is usually stats.

source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| stats values(*) as * by correlationId
| where duration > 50000
---
If this reply helps you, an upvote would be appreciated.
0 Karma

ellstream44
Explorer

I tried:
source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| stats values(*) as * by correlationId
| where duration > 50000

But then I got a hit on every single event
It was basically the same search as searching for
source="abc.log"

So I tried to put the "where" statement before the "duration" statement, as such:
source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| where duration > 50000
| stats values(*) as * by correlationId

Now the where statement for duration worked, however even I look at the event tab, I still only find matches for the "response" parts 😕 So I do not find the "requests" parts unfortunately. Perhaps this is because the field duration does not exist in the request part?

0 Karma

richgalloway
SplunkTrust
SplunkTrust

I intentionally put where after stats because request events do not have a duration field. The stats command matches up request and response by correlation ID so each resulting event has a duration.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

ellstream44
Explorer

Ok but with this:
source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| stats values(*) as * by correlationId
| where duration > 50000

I got hit on every single "response" in the "abc.log"
even where duration is 16 etc

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Please use the code (101010) button to format your SPL.

What do you get using this? Does the duration field display properly?

source="abc.log"
| rex "\"duration\" : (?\d+)"
| rex "\"correlationId\" : \"(?[^\"]+)"
| stats values() as by correlationId
| table correlationID duration
---
If this reply helps you, an upvote would be appreciated.
0 Karma

ellstream44
Explorer
source="abc.log" 
 | rex "\"duration\" : (?<duration>\d+)"
 | rex "\"correlationId\" : \"(?<correlationId>[^\"]+)"
 | stats values(*) as * by correlationId
 | table correlationId duration

This query found several hits in the Statistics view, many entries had 1 correlationId and 2 durations

Example:

correlationId: 80005e83861c03b7

duration: both "105" and also "protocol"

I do not know what the protocol part comes from

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Looks like a parsing problem. The rex command that extracts the duration field is a little off. Do you have an example event that sets duration to "protocol"?

---
If this reply helps you, an upvote would be appreciated.
0 Karma

ellstream44
Explorer

Seems like it, I get hits for posts that is not containing "duration" at all

Example:

2020-06-04 08:41:53,995 INFO com.Logger {
"origin" : "remote",
"type" : "request",
"correlation" : "84e110da7e8b37e6",
"protocol" : "HTTP/1.1",
"method" : "GET",
"uri" : "http://myuri/orders/30070834",
"body" : ""
}

source="/opt/apps/spring-boot/customer-order-data-bowmore-prod/log/communication.log" 
  | rex "\"duration\" : (?<duration>\d+)"
  | rex "\"correlation\" : \"(?<correlation>[^\"]+)"
  | search duration=protocol
0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!