Getting Data In

How to parse logs with a mix of JSON and non-JSON

khenson
Engager

Hi.  I have log source that has a mix of various field types and then a larger nested JSON payload.  I can't quite wrap my head around how to parse this out in our SplunkCloud environment.

High level, the log contains this:

  • date field
  • server name field (separated by four dashes most of the time, but some env have three)
  • process name[PID] 
  • source code function variable field ending with a colon char
  • source code function variable's value, which may or may not have special chars like () 
  • JSON 
  • ends with []

Sanitized example:

Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {"response":"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>"} []

 

My grok-fu is not great.  Would appreciate any suggestions.  Thanks!

Labels (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response

View solution in original post

0 Karma

khenson
Engager

Thank you very much for your time and sharing this.  I will start looking at how to incorporate this in my sourcetype in SplunkCloud.

0 Karma

khenson
Engager

I see an error in my sanitized log, the last </Field43> should have been </Auth>.  This seems pretty close, but the JSON didn't appear to show up as separate fields.  I was thinking that the JSON could be parsed into:
response.Auth.Field1

respoonse.Auth.Field2

Does that make sense?  

thanks!

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Tiling

This puzzle (first published here) is based on finding groups of tessellated tiles (inspired by floor tiles I ...

SOK it to Me: Top 3 Benefits of Using Splunk Operator on Kubernetes that’ll Make ...

    Thursday, July 9, 2026  |  11:00AM–12:00PM PDT Duration: 1 hour (includes Q&A) Managing can feel like a ...

Upgrade Prep for 10.4, Network Observability Deep Dives, and More from Splunk Lantern

Splunk Lantern is Splunk’s customer success center that provides practical guidance from Splunk experts on key ...