Getting Data In

How to achieve auto field extraction of nested JSON with non-JSON in front?

johnansett
Communicator

Hello!  

We have some logs coming across which are in JSON and thus 'just work'. The problem is, inside the log field are the events we need to extract.  There are about 200 app's that will be logging this way and each app will have different fields and values so doing a standard field extract won't work or would be 1000's of potential kvps. The format however is always the same.

 

 

{
  "log": "[2022-08-25 18:54:40.031] INFO  JsonLogger [[MuleRuntime].uber.143312: [prd-ops-mulesoft].encryptFlow.BLOCKING @4ac358d5] [event: 25670349-e6b5-4996-9cb6-c4c9657cd9ba]: {\n  \"correlationId\" : \"25670349-e6b5-4996-9cb6-c4c9657cd9ba\",\n  \"message\" : \"MESSAGE_HERE\",\n  \"tracePoint\" : \"END\",\n  \"priority\" : \"INFO\",\n  \"elapsed\" : 0,\n  \"locationInfo\" : {\n    \"lineInFile\" : \"95\",\n    \"component\" : \"json-logger:logger\",\n    \"fileName\" : \"buildLoggingAndResponse.xml\",\n    \"rootContainer\" : \"responseStatus_success\"\n  },\n  \"timestamp\" : \"2022-08-25T18:54:40.030Z\",\n  \"content\" : {\n    \"ResponseStatus\" : {\n      \"type\" : \"SUCCESS\",\n      \"title\" : \"Encryption successful.\",\n      \"status\" : \"200\",\n      \"detail\" : { },\n      \"correlationId\" : \"25670349-e6b5-4996-9cb6-c4c9657cd9ba\",\n      \"apiMethodName\" : null,\n      \"apiURL\" : \"https://app.com/prd-ops-mulesoft/encrypt/v1.0\",\n      \"apiVersion\" : \"v1\",\n      \"x-ConsumerRequestSentTimeStamp\" : \"\",\n      \"apiRequestReceivedTimeStamp\" : \"2022-08-25T18:54:39.856Z\",\n      \"apiResponseSentTimeStamp\" : \"2022-08-25T18:54:40.031Z\",\n      \"userId\" : \"GID01350\",\n      \"orchestrations\" : [ ]\n    }\n  },\n  \"applicationName\" : \"ops-mulesoft\",\n  \"applicationVersion\" : \"v1\",\n  \"environment\" : \"PRD\",\n  \"threadName\" : \"[MuleRuntime].uber.143312: [prd-ops-mulesoft].encryptFlow.BLOCKING @4ac358d5\"\n}\n",
  "stream": "stdout",
  "time": "2022-08-25T18:54:40.086450071Z",
  "kubernetes": {
    "pod_name": "prd-ops-mulesoft-94c49bdff-pcb5n",
    "namespace_name": "4cfa0f08-92b0-467b-9ca4-9e49083fd922",
    "pod_id": "e9046b5e-0d70-11ed-9db5-0050569b19f6",
    "labels": {
      "am-org-id": "01a4664d-9e16-454b-a14c-59548ef896b5",
      "app": "prd-ops-mulesoft",
      "environment": "4cfa0f08-92b0-467b-9ca4-9e49083fd922",
      "master-org-id": "01a4664d-9e16-454b-a14c-59548ef896b5",
      "organization": "01a4664d-9e16-454b-a14c-59548ef896b5",
      "pod-template-hash": "94c49bdff",
      "rtf.mulesoft.com/generation": "aab6b8074cf73151b1515de0e468478e",
      "rtf.mulesoft.com/id": "18d3e5d6-ce59-4837-9f3b-8aad3ccffcef",
      "type": "MuleApplication"
    },
    "host": "1.1.1.1",
    "container_name": "app",
    "docker_id": "cf07f321aec551b200fb3f31f6f1c67b2678ff6f6a335d4ca41ec2565770513c",
    "container_hash": "rtf-runtime-registry.kprod.msap.io/mulesoft/poseidon-runtime-4.3.0@sha256:6cfeb965e0ff7671778bc53a54a05d8180d4522f0b1ef7bb25e674686b8c3b75",
    "container_image": "rtf-runtime-registry.kprod.msap.io/mulesoft/poseidon-runtime-4.3.0:20211222-2"
  }
}

 

 

 

The JSON works fine, but the events we ALSO want extracted are in here 

 

 

{\n  \"correlationId\" : \"25670349-e6b5-4996-9cb6-c4c9657cd9ba\",\n  \"message\" : \"MESSAGE_HERE\",\n  \"tracePoint\" : \"END\",\n  \"priority\" : \"INFO\",\n  \"elapsed\" : 0,\n  \"locationInfo\" : {\n    \"lineInFile\" : \"95\",\n    \"component\" : \"json-logger:logger\",\n    \"fileName\" : \"buildLoggingAndResponse.xml\",\n    \"rootContainer\" : \"responseStatus_success\"\n  },\n  \"timestamp\" : \"2022-08-25T18:54:40.030Z\",\n  \"content\" : {\n    \"ResponseStatus\" : {\n      \"type\" : \"SUCCESS\",\n      \"title\" : \"Encryption successful.\",\n      \"status\" : \"200\",\n      \"detail\" : { },\n      \"correlationId\" : \"25670349-e6b5-4996-9cb6-c4c9657cd9ba\",\n      \"apiMethodName\" : null,\n      \"apiURL\" : \"https://app.com/prd-ops-mulesoft/encrypt/v1.0\",\n      \"apiVersion\" : \"v1\",\n      \"x-ConsumerRequestSentTimeStamp\" : \"\",\n      \"apiRequestReceivedTimeStamp\" : \"2022-08-25T18:54:39.856Z\",\n      \"apiResponseSentTimeStamp\" : \"2022-08-25T18:54:40.031Z\",\n      \"userId\" : \"GID01350\",\n      \"orchestrations\" : [ ]\n    }\n  },\n  \"applicationName\" : \"ops-mulesoft\",\n  \"applicationVersion\" : \"v1\",\n  \"environment\" : \"PRD\",\n  \"threadName\" : \"[MuleRuntime].uber.143312: [prd-ops-mulesoft].encryptFlow.BLOCKING @4ac358d5\"\n}

 

 


SPATH would work, but the JSON is fronted by this

 

 

"[2022-08-25 18:54:40.031] INFO  JsonLogger [[MuleRuntime].uber.143312: [prd-ops-mulesoft].encryptFlow.BLOCKING @4ac358d5] [event: 25670349-e6b5-4996-9cb6-c4c9657cd9ba]:

 

 

So it doesn't treat it as JSON.

This is one example, others events don't have the correlationID, etc. 

We need a method that will take the raw data, parse it as JSON AND then dynamically extract the events in the log field in their KV pairs (e.g. \"correlationId\" : \"25670349-e6b5-4996-9cb6-c4c9657cd9ba\" == $1::$2)

Can this be done in transforms using regex?

Is this even possible, or do we ultimately need to create extractions based on every possible field?

 

Appreciate the guidance here!

Thanks!!

Labels (2)
Tags (1)
0 Karma
1 Solution

kamlesh_vaghela
SplunkTrust
SplunkTrust

@johnansett 

Can you please try this?

YOUR_SEARCH
| rex field=log "(?<log_json>\{[^$]*)" 
| spath input=log_json
| table log_json correlationId kubernetes.container_hash content.ResponseStatus.type

 

Screenshot 2022-08-28 at 9.42.54 AM.png

 

I hope this will help you.

Thanks
KV
If any of my replies help you to solve the problem Or gain knowledge, an upvote would be appreciated.

 

View solution in original post

kamlesh_vaghela
SplunkTrust
SplunkTrust

@johnansett 

Can you please try this?

YOUR_SEARCH
| rex field=log "(?<log_json>\{[^$]*)" 
| spath input=log_json
| table log_json correlationId kubernetes.container_hash content.ResponseStatus.type

 

Screenshot 2022-08-28 at 9.42.54 AM.png

 

I hope this will help you.

Thanks
KV
If any of my replies help you to solve the problem Or gain knowledge, an upvote would be appreciated.

 

johnansett
Communicator

Yep, that has worked!!

Thanks!

0 Karma
Get Updates on the Splunk Community!

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...

Observability Highlights | January 2023 Newsletter

 January 2023New Product Releases Splunk Network Explorer for Infrastructure MonitoringSplunk unveils Network ...