Splunk Search

Need event extraction

kranthimutyala2
Engager

I want to extract all the key value pairs from this event  dynamically
Can someone help with the query
 

INFO 2024-04-29 16:30:08,456 [[MuleRuntime].uber.05: [ct-fin-abc-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333] com.sfdc.it.ei.mule4.eilog.EILog: {"worker":"0","region":"us-ne-2","applicationName":"ct-fin-abc-apps-papi-v1-uw2-ut","applicationVersion":"1.0.7","applicationType":"PAPI","environment":"ct-app-UAT","domain":"CR C4E","x-transaction-id":"xxxx-e691-xx-91bf-xxx","tx.flow":"read-input-files-sub-flow","tx.fileName":"implementation.xml","txlineNumber":"71","stage":"MILESTONE","status":"SUCCESS","endpointSystem":"","jsonRecord":"{\n \"Task Name\": \"Cash Apps PAPI\",\n \"Action Name\": \"Read Input Files GDrive Start\",\n \"Run Reference\": \"xx-0645-11ef-xx-xx\",\n \"Record Id Type\": \"Invoice\"\n}","detailText":"Start Reading Input Files from G drive","businessRecordId":"","businessRecordType":"","batchSize":"0","totalRecords":"0","remainingRetries":"0","timestamp":"2024-04-29 16:30:08.455","threadName":"[MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333"}

Tags (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

A more fundamental problem is that by insisting  on using regex for this log, you are treating structured JSON log eilog.EILOG as text string.  It is NOT.   It is much more robust to use Splunk's built-in, QA tested capabilities to handle structured data.  Have you tried my suggestion

| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord

and not getting all data fields in this JSON?   As I illustrated previously, this should give you

Task NameCash Apps PAPI

along with dozens of other key-value pairs.

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Do you mean to obtain something like this?

 
field namefield value
Action NameRead Input Files GDrive Start
Record Id TypeInvoice
Run Referencexx-0645-11ef-xx-xx
Task NameCash Apps PAPI
applicationNamect-fin-abc-apps-papi-v1-uw2-ut
applicationTypePAPI
applicationVersion1.0.7
batchSize0
businessRecordId 
businessRecordType 
detailTextStart Reading Input Files from G drive
domainCR C4E
endpointSystem 
environmentct-app-UAT
regionus-ne-2
remainingRetries0
stageMILESTONE
statusSUCCESS
threadName[MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333
timestamp2024-04-29 16:30:08.455
totalRecords0
tx.fileNameimplementation.xml
tx.flowread-input-files-sub-flow
txlineNumber71
worker0
x-transaction-idxxxx-e691-xx-91bf-xxx

As @gcusello suggested, you should use built-in JSON capability to do the job, not regex.  Use rex only to extract the JSON part.  Like this

 

| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord

 

This is an emulation of your sample data.  Play with it and compare with real data.

 

| makeresults
| eval _raw = "INFO 2024-04-29 16:30:08,456 [[MuleRuntime].uber.05: [ct-fin-abc-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333] com.sfdc.it.ei.mule4.eilog.EILog: {\"worker\":\"0\",\"region\":\"us-ne-2\",\"applicationName\":\"ct-fin-abc-apps-papi-v1-uw2-ut\",\"applicationVersion\":\"1.0.7\",\"applicationType\":\"PAPI\",\"environment\":\"ct-app-UAT\",\"domain\":\"CR C4E\",\"x-transaction-id\":\"xxxx-e691-xx-91bf-xxx\",\"tx.flow\":\"read-input-files-sub-flow\",\"tx.fileName\":\"implementation.xml\",\"txlineNumber\":\"71\",\"stage\":\"MILESTONE\",\"status\":\"SUCCESS\",\"endpointSystem\":\"\",\"jsonRecord\":\"{\\n \\\"Task Name\\\": \\\"Cash Apps PAPI\\\",\\n \\\"Action Name\\\": \\\"Read Input Files GDrive Start\\\",\\n \\\"Run Reference\\\": \\\"xx-0645-11ef-xx-xx\\\",\\n \\\"Record Id Type\\\": \\\"Invoice\\\"\\n}\",\"detailText\":\"Start Reading Input Files from G drive\",\"businessRecordId\":\"\",\"businessRecordType\":\"\",\"batchSize\":\"0\",\"totalRecords\":\"0\",\"remainingRetries\":\"0\",\"timestamp\":\"2024-04-29 16:30:08.455\",\"threadName\":\"[MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333\"}"
``` data emulation above ```

 

Tags (1)
0 Karma

kranthimutyala2
Engager

@yuanliu this im able to extract but I need field values for Task Name, Action Name, DetailText etc

 

0 Karma

kranthimutyala2
Engager

@yuanliu @gcusello Im using rex field=event "{\\n \\"Task Name\\": \\"(?<taskName>[^\"]+)\\""
its working in Regex101 but not working in Splunk

0 Karma

yuanliu
SplunkTrust
SplunkTrust

A more fundamental problem is that by insisting  on using regex for this log, you are treating structured JSON log eilog.EILOG as text string.  It is NOT.   It is much more robust to use Splunk's built-in, QA tested capabilities to handle structured data.  Have you tried my suggestion

| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord

and not getting all data fields in this JSON?   As I illustrated previously, this should give you

Task NameCash Apps PAPI

along with dozens of other key-value pairs.

0 Karma

kranthimutyala2
Engager

@yuanliu @gcusello it worked thanks

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @kranthimutyala2 ,

good for you, see next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @kranthimutyala2 ,

as also @yuanliu hinted, in Splunk you must use three backslashes instead 2 as in regex101:

| rex field=event "{\\\n \\\"Task Name\\\": \\"(?<taskName>[^\"]+)"

It's a difference: I opened a casefor a different behavios than the documentation, so the documentation was modified! I don't knw why, Splunk Project doesn't want to solve it!

Ciao.

Giuseppe

0 Karma

kranthimutyala2
Engager

@gcusello Thanks I tried but getting this error 
Error in 'SearchParser': Missing a search command before '^'. Error at position '461' of search query 'search index=abc source="http:clhub-preprod" "bt-f...{snipped} {errorcontext = taskName>[^\"]+)"}'.

0 Karma

kranthimutyala2
Engager

@gcusello tried but it didnt work

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @kranthimutyala2,

which sourcetype are you using?

did you tried json or _json?

In this case the INDEXED_EXTRACTIONS=json is enabled

Ciao.

Giuseppe

0 Karma

kranthimutyala2
Engager

@gcusello 

index=abc source="http:clhub-preprod" sourcetype=_json "ct-fin-abc-apps-papi-v1-uw2-ut" "Action Name" | rex field=event "^(?<event_type>\w+)" | where event_type="INFO" | spath input_field=event



event field contains the above log data

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @kranthimutyala2,

what does it happen using only spath:

index=abc source="http:clhub-preprod" sourcetype=_json "ct-fin-abc-apps-papi-v1-uw2-ut" "Action Name" 
| spath
| rex field=event "^(?<event_type>\w+)" 
| where event_type="INFO" 

Ciao.

giuseppe

0 Karma

kranthimutyala2
Engager

@gcusello I dont see any difference, its not extracting anything

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @kranthimutyala2,

did you tried to use the spath command (https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchReference/Spath)?

It automatically recognize all the fields-value pairs.

Ciao.

Giuseppe

0 Karma
Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...