I want to extract all the key value pairs from this event dynamically
Can someone help with the query
INFO 2024-04-29 16:30:08,456 [[MuleRuntime].uber.05: [ct-fin-abc-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333] com.sfdc.it.ei.mule4.eilog.EILog: {"worker":"0","region":"us-ne-2","applicationName":"ct-fin-abc-apps-papi-v1-uw2-ut","applicationVersion":"1.0.7","applicationType":"PAPI","environment":"ct-app-UAT","domain":"CR C4E","x-transaction-id":"xxxx-e691-xx-91bf-xxx","tx.flow":"read-input-files-sub-flow","tx.fileName":"implementation.xml","txlineNumber":"71","stage":"MILESTONE","status":"SUCCESS","endpointSystem":"","jsonRecord":"{\n \"Task Name\": \"Cash Apps PAPI\",\n \"Action Name\": \"Read Input Files GDrive Start\",\n \"Run Reference\": \"xx-0645-11ef-xx-xx\",\n \"Record Id Type\": \"Invoice\"\n}","detailText":"Start Reading Input Files from G drive","businessRecordId":"","businessRecordType":"","batchSize":"0","totalRecords":"0","remainingRetries":"0","timestamp":"2024-04-29 16:30:08.455","threadName":"[MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333"}
A more fundamental problem is that by insisting on using regex for this log, you are treating structured JSON log eilog.EILOG as text string. It is NOT. It is much more robust to use Splunk's built-in, QA tested capabilities to handle structured data. Have you tried my suggestion
| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord
and not getting all data fields in this JSON? As I illustrated previously, this should give you
Task Name | Cash Apps PAPI |
along with dozens of other key-value pairs.
Do you mean to obtain something like this?
field name | field value |
Action Name | Read Input Files GDrive Start |
Record Id Type | Invoice |
Run Reference | xx-0645-11ef-xx-xx |
Task Name | Cash Apps PAPI |
applicationName | ct-fin-abc-apps-papi-v1-uw2-ut |
applicationType | PAPI |
applicationVersion | 1.0.7 |
batchSize | 0 |
businessRecordId | |
businessRecordType | |
detailText | Start Reading Input Files from G drive |
domain | CR C4E |
endpointSystem | |
environment | ct-app-UAT |
region | us-ne-2 |
remainingRetries | 0 |
stage | MILESTONE |
status | SUCCESS |
threadName | [MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333 |
timestamp | 2024-04-29 16:30:08.455 |
totalRecords | 0 |
tx.fileName | implementation.xml |
tx.flow | read-input-files-sub-flow |
txlineNumber | 71 |
worker | 0 |
x-transaction-id | xxxx-e691-xx-91bf-xxx |
As @gcusello suggested, you should use built-in JSON capability to do the job, not regex. Use rex only to extract the JSON part. Like this
| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord
This is an emulation of your sample data. Play with it and compare with real data.
| makeresults
| eval _raw = "INFO 2024-04-29 16:30:08,456 [[MuleRuntime].uber.05: [ct-fin-abc-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333] com.sfdc.it.ei.mule4.eilog.EILog: {\"worker\":\"0\",\"region\":\"us-ne-2\",\"applicationName\":\"ct-fin-abc-apps-papi-v1-uw2-ut\",\"applicationVersion\":\"1.0.7\",\"applicationType\":\"PAPI\",\"environment\":\"ct-app-UAT\",\"domain\":\"CR C4E\",\"x-transaction-id\":\"xxxx-e691-xx-91bf-xxx\",\"tx.flow\":\"read-input-files-sub-flow\",\"tx.fileName\":\"implementation.xml\",\"txlineNumber\":\"71\",\"stage\":\"MILESTONE\",\"status\":\"SUCCESS\",\"endpointSystem\":\"\",\"jsonRecord\":\"{\\n \\\"Task Name\\\": \\\"Cash Apps PAPI\\\",\\n \\\"Action Name\\\": \\\"Read Input Files GDrive Start\\\",\\n \\\"Run Reference\\\": \\\"xx-0645-11ef-xx-xx\\\",\\n \\\"Record Id Type\\\": \\\"Invoice\\\"\\n}\",\"detailText\":\"Start Reading Input Files from G drive\",\"businessRecordId\":\"\",\"businessRecordType\":\"\",\"batchSize\":\"0\",\"totalRecords\":\"0\",\"remainingRetries\":\"0\",\"timestamp\":\"2024-04-29 16:30:08.455\",\"threadName\":\"[MuleRuntime].uber.05: [ct-fin-aps-apps-papi-v1-uw2-ut].abc-apps-schedular-main-flow.BLOCKING @68f82333\"}"
``` data emulation above ```
@yuanliu this im able to extract but I need field values for Task Name, Action Name, DetailText etc
A more fundamental problem is that by insisting on using regex for this log, you are treating structured JSON log eilog.EILOG as text string. It is NOT. It is much more robust to use Splunk's built-in, QA tested capabilities to handle structured data. Have you tried my suggestion
| rex "eilog.EILog:\s*(?<eilog>{.+})"
| spath input=eilog
| spath input=jsonRecord
and not getting all data fields in this JSON? As I illustrated previously, this should give you
Task Name | Cash Apps PAPI |
along with dozens of other key-value pairs.
Hi @kranthimutyala2 ,
good for you, see next time!
Ciao and happy splunking
Giuseppe
P.S.: Karma Points are appreciated by all the contributors 😉
Hi @kranthimutyala2 ,
as also @yuanliu hinted, in Splunk you must use three backslashes instead 2 as in regex101:
| rex field=event "{\\\n \\\"Task Name\\\": \\"(?<taskName>[^\"]+)"
It's a difference: I opened a casefor a different behavios than the documentation, so the documentation was modified! I don't knw why, Splunk Project doesn't want to solve it!
Ciao.
Giuseppe
@gcusello Thanks I tried but getting this error
Error in 'SearchParser': Missing a search command before '^'. Error at position '461' of search query 'search index=abc source="http:clhub-preprod" "bt-f...{snipped} {errorcontext = taskName>[^\"]+)"}'.
@gcusello tried but it didnt work
Hi @kranthimutyala2,
which sourcetype are you using?
did you tried json or _json?
In this case the INDEXED_EXTRACTIONS=json is enabled
Ciao.
Giuseppe
index=abc source="http:clhub-preprod" sourcetype=_json "ct-fin-abc-apps-papi-v1-uw2-ut" "Action Name" | rex field=event "^(?<event_type>\w+)" | where event_type="INFO" | spath input_field=event
event field contains the above log data
Hi @kranthimutyala2,
what does it happen using only spath:
index=abc source="http:clhub-preprod" sourcetype=_json "ct-fin-abc-apps-papi-v1-uw2-ut" "Action Name"
| spath
| rex field=event "^(?<event_type>\w+)"
| where event_type="INFO"
Ciao.
giuseppe
@gcusello I dont see any difference, its not extracting anything
Hi @kranthimutyala2,
did you tried to use the spath command (https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchReference/Spath)?
It automatically recognize all the fields-value pairs.
Ciao.
Giuseppe