Splunk Search

Searching data from two subsequent events

runiyal
Path Finder

I have a logfile like this -

 

2024-02-15 09:07:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/site/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=ROMAN , confirmationNumber=ND_50249-02152024, clmNumber=99900468430, name=ROAMN Claim # 99900468430 Invoice.pdf, contentType=Email}
2024-02-15 09:07:47,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location.
---
---
---
2024-02-15 09:41:16,762 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-200] The Upload Service /app1/service/site/upload failed in 0.138000 seconds, {comments=yyy-789, senderCompany=Company2, source=Web, title=Submitted via Site website, submitterType=Public Adjuster, senderName=Tristian, confirmationNumber=ND_52233-02152024, clmNumber=99900470018, name=Tristian  CLAIM #99900470018 PACKAGE.pdf, contentType=Email}
2024-02-15 09:41:16,764 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-200] Exception from executeScript: 0115100953 Document not found - Tristian  CLAIM #99900470018 PACKAGE.pdf 

 

We need to look at index=<myindex> "/app1/service/site/upload failed" and get a table with the following information.

| _time | clmNumber | confirmationNumber | name | Exception |
|---|---|---|---|---|
| 2024-02-15 09:07:47 | 99900468430 | ND_50249-02152024 | ROMAN Claim # 99900468430 Invoice.pdf | 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location |
| 2024-02-15 09:41:16 | 99900470018 | ND_52233-02152024 | Tristian CLAIM #99900470018 PACKAGE.pdf | 0115100953 Document not found - Tristian CLAIM #99900470018 PACKAGE.pdf |

 

The exception is in a separate event, logged immediately after the event that carries the first four fields. Both events share the same SessionID and may also share the document name, but a SessionID can cover multiple transactions, so the document names can differ.

I created the following search for this purpose, but it's returning a different DocName -

 

(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*") OR 
(index="myindex" "Exception from executeScript")
| rex "clmNumber=(?<ClaimNumber>[^,]+)" 
| rex "confirmationNumber=(?<SubmissionNumber>[^},]+)" 
| rex "contentType=(?<ContentType>[^},]+)" 
| rex "name=(?<DocName>[^,]+)" 
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])" 
| eval EventType=if(match(_raw, "Exception from executeScript"), "Exception", "Upload Failure")
| eventstats first(EventType) as first_EventType by SessionID
| where EventType="Upload Failure"
| join type=outer SessionID [
    search index="myindex" "Exception from executeScript"
    | rex "Exception from executeScript: (?<Exception>[^:]+)"
    | rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
    | rex "(?<ExceptionDocName>.+\.pdf)"
    | eval EventType="Exception"
    | eventstats first(EventType) as first_EventType by SessionID
] 
| where EventType="Exception" OR isnull(Exception)
| table _time, ClaimNumber, SubmissionNumber, ContentType, DocName, Exception
| sort _time desc ClaimNumber

 

Here is the result that I got -

| _time | clmNumber | confirmationNumber | name | Exception |
|---|---|---|---|---|
| 2024-02-15 09:07:47 | 99900468430 | ND_50249-02152024 | ROMAN Claim # 99900468430 Invoice.pdf | 0115105149 Duplicate Child Exception - Rakesh lease 4 already exists in the location. |
| 2024-02-15 09:41:16 | 99900470018 | ND_52233-02152024 | Tristian CLAIM #99900470018 PACKAGE.pdf | 0115105128 Duplicate Child Exception - Combined 4 Point signed Ramesh 399 Coral Island. disk 3 already exists in the location. |

 

So although I am able to get the first four fields in the table correctly, I believe the exception is coming from another event in the log with the same SessionID.

How can we fix the script to provide the expected result?

Thanks in Advance.

 


yuanliu
SplunkTrust

First, thank you for clearly illustrating input data and desired output.  Note that join is a performance killer and best avoided; in this case it is overkill.

If I decipher your requirement from the complex SPL correctly, all you want is to correlate INFO and ERROR logs so that each exception is shown with its failed claim, file, etc.  While it is not difficult to extract the claim number from both types of logs given the illustrated format, an easier correlation field is SessionID because it appears in both types in exactly the same form.

Additionally, there should be no need to extract clmNumber and confirmationNumber because they are automatically extracted.  The name field is garbled because of unquoted white space.

This is a simpler search that should satisfy your requirement:

 

index="myindex" ("/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
 OR ("Exception from executeScript")
| rex "\bname=(?<name>[^,]+)"
```| rex "clmNumber=(?<ClaimNumber>[^,]+)" 
| rex "confirmationNumber=(?<SubmissionNumber>[^},]+)"
| rex "contentType=(?<ContentType>[^},]+)" ```
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields clmNumber confirmationNumber name Exception SessionID
| stats min(_time) as _time values(*) as * by SessionID

 

Your sample logs should give

| SessionID | _time | Exception | clmNumber | confirmationNumber | name |
|---|---|---|---|---|---|
| [http-nio-8080-exec-200] | 2024-02-15 09:41:16.762 | 0115100953 Document not found - Tristian CLAIM #99900470018 PACKAGE.pdf | 99900470018 | ND_52233-02152024 | Tristian CLAIM #99900470018 PACKAGE.pdf |
| [http-nio-8080-exec-202] | 2024-02-15 09:07:47.769 | 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location. | 99900468430 | ND_50249-02152024 | ROAMN Claim # 99900468430 Invoice.pdf |

Of course you can remove SessionID from display and rearrange field order.
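
For example, one way to do that (a minimal sketch, not tested against your index) is to append a table command to the search above. The table command keeps only the listed fields, so SessionID disappears from the display, and it also fixes the column order. The final sort is only an assumption about how you want the rows ordered.

````spl
index="myindex" ("/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
 OR ("Exception from executeScript")
| rex "\bname=(?<name>[^,]+)"
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields clmNumber confirmationNumber name Exception SessionID
| stats min(_time) as _time values(*) as * by SessionID
``` keep only the display columns, in the order of the desired output ```
| table _time clmNumber confirmationNumber name Exception
``` assumption: newest first; drop this line if the order does not matter ```
| sort 0 - _time
````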

You can play with the following emulation and compare with real data

 

| makeresults
| eval data = split("2024-02-15 09:07:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/site/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=ROMAN , confirmationNumber=ND_50249-02152024, clmNumber=99900468430, name=ROAMN Claim # 99900468430 Invoice.pdf, contentType=Email}
2024-02-15 09:07:47,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location.
---
---
---
2024-02-15 09:41:16,762 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-200] The Upload Service /app1/service/site/upload failed in 0.138000 seconds, {comments=yyy-789, senderCompany=Company2, source=Web, title=Submitted via Site website, submitterType=Public Adjuster, senderName=Tristian, confirmationNumber=ND_52233-02152024, clmNumber=99900470018, name=Tristian  CLAIM #99900470018 PACKAGE.pdf, contentType=Email}
2024-02-15 09:41:16,764 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-200] Exception from executeScript: 0115100953 Document not found - Tristian  CLAIM #99900470018 PACKAGE.pdf", "
")
| mvexpand data
| rename data AS _raw
| rex "^(?<_time>\S+ \S+)"
| eval _time = strptime(_time, "%F %T,%3N")
| extract
``` the above emulates
(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*") OR 
(index="myindex" "Exception from executeScript")
```
| rex "\bname=(?<name>[^,]+)"
```| rex "clmNumber=(?<ClaimNumber>[^,]+)" 
| rex "confirmationNumber=(?<SubmissionNumber>[^},]+)"
| rex "contentType=(?<ContentType>[^},]+)" ```
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields clmNumber confirmationNumber name Exception SessionID
| stats min(_time) as _time values(*) as * by SessionID

 

 


runiyal
Path Finder

Result is coming like this for the first query.....

| SessionID | _time | Exception | clmNumber | confirmationNumber | name |
|---|---|---|---|---|---|
| [http-nio-8080-exec-101] | 2024-02-15 00:06:38.457 | 0115100018 Could not match parameter list [names, keep] to an operation. | | | |
| | | org.springframework.extensions.webscripts.WebScriptException | | | |
| | | 0115100062 Could not find document 20231009_00064.TIF in suspense. | | | |
| | | org.springframework.extensions.webscripts.WebScriptException | | | |
| | | 0115100104 Could not find document 20240103_00065.TIF in suspense. | | | |
| | | org.springframework.extensions.webscripts.WebScriptException | | | |
| | | 0115100168 Duplicate Child Exception - 02142024_17C0_Email.pdf already exists in the location. | | | |
| | | org.springframework.extensions.webscripts.WebScriptException | | | |
| | | 0115100375 Duplicate Child Exception - NB Doc Form 313652.8.24 already exists in the location. | | | |
| | | org.springframework.extensions.webscripts.WebScriptException | | | |

---

(Many More)

   

 


runiyal
Path Finder

BTW, when the first query runs, for a split second it looks like it is going to return data the way query 2 (| makeresults) presents it, and then it gets mixed up and returns jumbled data with nothing in the last three columns. Not sure if this information helps.


runiyal
Path Finder

Thanks a lot for your reply, Yuanliu.

When I run the code below I get a very skewed result. The SessionID and _time columns get populated. For Exception, every exception for that day shows up in the row itself (since I am running a day's worth of report), whether or not it's related to "confirmationNumber=ND_*". The remaining three fields are empty.

 

index="myindex" ("/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
 OR ("Exception from executeScript")
| rex "\bname=(?<name>[^,]+)"
```| rex "clmNumber=(?<ClaimNumber>[^,]+)" 
| rex "confirmationNumber=(?<SubmissionNumber>[^},]+)"
| rex "contentType=(?<ContentType>[^},]+)" ```
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields clmNumber confirmationNumber name Exception SessionID
| stats min(_time) as _time values(*) as * by SessionID

 

 

Secondly, my data can have the same SessionID with different datasets, and I am not able to see _time for the second transaction for the same SessionID. Here is the sample data -

 

| makeresults
| eval data = split("2024-02-15 09:07:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/citizens/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=ROMAN , confirmationNumber=ND_50249-02152024, clmNumber=99900468430, name=ROAMN Claim # 99900468430 Invoice.pdf, contentType=Email}
2024-02-15 09:07:47,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location.
2024-02-15 09:10:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/citizens/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=Bob , confirmationNumber=ND_55555-02152024, clmNumber=99900468999, name=Bob Claim # 99900468999 Invoice.pdf, contentType=Email}
2024-02-15 09:10:48,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115101000 Document not found - Bob Claim # 99900468999 Invoice.pdf already exists in the location.
2024-02-15 09:41:16,762 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-200] The Upload Service /app1/service/citizens/upload failed in 0.138000 seconds, {comments=yyy-789, senderCompany=Company2, source=Web, title=Submitted via Site website, submitterType=Public Adjuster, senderName=Tristian, confirmationNumber=ND_52233-02152024, clmNumber=99900470018, name=Tristian  CLAIM #99900470018 PACKAGE.pdf, contentType=Email}
2024-02-15 09:41:16,764 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-200] Exception from executeScript: 0115100953 Document not found - Tristian  CLAIM #99900470018 PACKAGE.pdf", "
")

 

and here is the result -

| SessionID | _time | Exception | clmNumber | confirmationNumber | name |
|---|---|---|---|---|---|
| [http-nio-8080-exec-200] | 2024-02-15 09:41:16.762 | 0115100953 Document not found - Tristian CLAIM #99900470018 PACKAGE.pdf | 99900470018 | ND_52233-02152024 | Tristian CLAIM #99900470018 PACKAGE.pdf |
| [http-nio-8080-exec-202] | 2024-02-15 09:07:47.769 | 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location. | 99900468430 | ND_50249-02152024 | Bob Claim # 99900468999 Invoice.pdf |
| | | 0115101000 Document not found - Bob Claim # 99900468999 Invoice.pdf already exists in the location. | 99900468999 | ND_55555-02152024 | ROAMN Claim # 99900468430 Invoice.pdf |

How can we fix the first query so that it provides data for all columns correctly?

Thanks in advance for your time!


yuanliu
SplunkTrust

Thank you for providing the emulation!  It is really important to illustrate data characteristics when dealing with data analytics.  I made the assumption that each session would only handle one claim.  If that is not the case, we'll have to extract claim number for correlation.  There are many ways to do this. Because claim number is always embedded in the file name, I will show the simplest that applies to both INFO and ERROR. (An alternative is to simply use file name for correlation.)  So

 

(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*") OR 
(index="myindex" "Exception from executeScript")
| rex "\bname=(?<name>[^,]+)"
| rex "(?i) claim # *(?<claimNumber>\S+)"
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields claimNumber confirmationNumber name Exception
| stats min(_time) as _time values(*) as * by claimNumber

 

Here is the full emulation and result

 

| makeresults
| eval data = split("2024-02-15 09:07:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/citizens/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=ROMAN , confirmationNumber=ND_50249-02152024, clmNumber=99900468430, name=ROAMN Claim # 99900468430 Invoice.pdf, contentType=Email}
2024-02-15 09:07:47,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location.
2024-02-15 09:10:47,770 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-202] The Upload Service /app1/service/citizens/upload failed in 0.124000 seconds, {comments=xxx-123, senderCompany=Company1, source=Web, title=Submitted via Site website, submitterType=Others, senderName=Bob , confirmationNumber=ND_55555-02152024, clmNumber=99900468999, name=Bob Claim # 99900468999 Invoice.pdf, contentType=Email}
2024-02-15 09:10:48,772 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-202] Exception from executeScript: 0115101000 Document not found - Bob Claim # 99900468999 Invoice.pdf already exists in the location.
2024-02-15 09:41:16,762 INFO  [com.mysite.core.app1.upload.FileUploadWebScript] [http-nio-8080-exec-200] The Upload Service /app1/service/citizens/upload failed in 0.138000 seconds, {comments=yyy-789, senderCompany=Company2, source=Web, title=Submitted via Site website, submitterType=Public Adjuster, senderName=Tristian, confirmationNumber=ND_52233-02152024, clmNumber=99900470018, name=Tristian  CLAIM #99900470018 PACKAGE.pdf, contentType=Email}
2024-02-15 09:41:16,764 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] [http-nio-8080-exec-200] Exception from executeScript: 0115100953 Document not found - Tristian  CLAIM #99900470018 PACKAGE.pdf", "
")
| mvexpand data
| rename data AS _raw
| rex "^(?<_time>\S+ \S+)"
| eval _time = strptime(_time, "%F %T,%3N")
| extract
``` the above emulates
(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*") OR 
(index="myindex" "Exception from executeScript")
```
| rex "\bname=(?<name>[^,]+)"
| rex "(?i) claim # *(?<claimNumber>\S+)"
```| rex "clmNumber=(?<ClaimNumber>[^,]+)" 
| rex "confirmationNumber=(?<SubmissionNumber>[^},]+)"
| rex "contentType=(?<ContentType>[^},]+)" ```
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
| fields claimNumber confirmationNumber name Exception
| stats min(_time) as _time values(*) as * by claimNumber

 

| claimNumber | _time | Exception | confirmationNumber | name |
|---|---|---|---|---|
| 99900468430 | 2024-02-15 09:07:47.769 | 0115100898 Duplicate Child Exception - ROAMN Claim # 99900468430 Invoice.pdf already exists in the location. | ND_50249-02152024 | ROAMN Claim # 99900468430 Invoice.pdf |
| 99900468999 | 2024-02-15 09:10:47.769 | 0115101000 Document not found - Bob Claim # 99900468999 Invoice.pdf already exists in the location. | ND_55555-02152024 | Bob Claim # 99900468999 Invoice.pdf |
| 99900470018 | 2024-02-15 09:41:16.762 | 0115100953 Document not found - Tristian CLAIM #99900470018 PACKAGE.pdf | ND_52233-02152024 | Tristian CLAIM #99900470018 PACKAGE.pdf |
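
As another of the many possible ways, here is a rough sketch that keeps SessionID as the correlation key but numbers the transactions inside each session, so it does not depend on the claim number appearing in the file name. It is untested against real data and rests on assumptions: each exception is logged after its upload-failure line on the same thread, and clmNumber and confirmationNumber are auto-extracted. The names isUpload and txn are made up for illustration.

````spl
(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*") OR 
(index="myindex" "Exception from executeScript")
| rex "\bname=(?<name>[^,]+)"
| rex "(?<SessionID>\[http-nio-8080-exec-\d+\])"
| rex "Exception from executeScript: (?<Exception>[^:]+)"
``` isUpload (hypothetical name) flags the INFO line that starts a transaction ```
| eval isUpload = if(match(_raw, "upload failed"), 1, 0)
``` oldest first within each session so the running sum follows log order ```
| sort 0 SessionID _time
``` txn (hypothetical name) increases by one at every upload-failure line ```
| streamstats sum(isUpload) as txn by SessionID
``` one row per transaction; take the first exception logged after the failure ```
| stats min(_time) as _time values(clmNumber) as clmNumber values(confirmationNumber) as confirmationNumber values(name) as name earliest(Exception) as Exception by SessionID txn
``` txn=0 means an exception seen before any upload failure in that session ```
| where txn > 0
| table _time clmNumber confirmationNumber name Exception
````

One caveat with this approach: if an upload failure has no exception of its own, the next unrelated exception on the same thread would be attributed to it, so a time-gap check may be needed on real data.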

runiyal
Path Finder

Thanks Yuanliu,

This is working, but not completely. I should get 75 records in the result, since I get 75 rows if I just search for

index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*"

But when I update the script to the above provided then I am getting only 23 rows.

Going back to the original requirement -

First the script needs to search all the records that it can get by providing -

index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*"

Fetch _time, clmNumber, confirmationNumber, and name from that event in the table (4 columns).
Then check the second line [for same sessionid] for an exception (Exception from executeScript) and provide whatever is after it as a fifth column in the table.

Maybe I was not clear on the requirements earlier.


yuanliu
SplunkTrust
SplunkTrust

Now we are deep into the weeds of actual data.  The number of rows depends only on how many unique claimNumber values the regex "(?i) claim # *(?<claimNumber>\S+)" extracts from both source filters.  A meaningful test would be

(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
| rex "(?i) claim # *(?<claimNumber>\S+)"
| stats dc(clmNumber) as clmCount dc(claimNumber) as claimCount

Do they give 23?  75?  Does one give 75 and the other 23? (According to your description, claimCount should be 23.)  If the two counts are equal, there is nothing to change.

If you get different counts for clmNumber and claimNumber, you can do another test

(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
| rex "(?i) claim # *(?<claimNumber>\S+)"
| table _time clmNumber claimNumber _raw

Then, you need to refine the regex.  Post sample data for which claimNumber is not extracted if you need help with regex.
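
To narrow the second test down to just the problem events, a small variation (a sketch only, assuming clmNumber is auto-extracted as it is in the emulation) keeps the rows where the regex extracted nothing or extracted something different from clmNumber:

````spl
(index="myindex" "/app1/service/site/upload failed" AND "source=Web" AND "confirmationNumber=ND_*")
| rex "(?i) claim # *(?<claimNumber>\S+)"
``` keep only events where the file-name regex missed or disagrees with clmNumber ```
| where isnull(claimNumber) OR tostring(claimNumber) != tostring(clmNumber)
| table _time clmNumber claimNumber _raw
````

The file names in the rows that remain show which naming patterns the regex still has to cover.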
