How to parse logs with a mix of JSON and non-JSON

khenson
Engager

Hi. I have a log source that has a mix of various field types followed by a larger nested JSON payload. I can't quite wrap my head around how to parse this in our Splunk Cloud environment.

High level, the log contains this:

  • date field
  • server name field (separated by four dashes most of the time, but some environments have three)
  • process name[PID] 
  • source code function variable field ending with a colon char
  • source code function variable's value, which may or may not have special chars like () 
  • JSON 
  • ends with []

Sanitized example:

Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {"response":"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>"} []

 

My grok-fu is not great.  Would appreciate any suggestions.  Thanks!


ITWhisperer
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response


khenson
Engager

Thank you very much for your time and for sharing this. I will start looking at how to incorporate this into my sourcetype in Splunk Cloud.


khenson
Engager

I see an error in my sanitized log: the last </Field43> should have been </Auth>. This seems pretty close, but the JSON didn't appear to show up as separate fields. I was thinking that the JSON could be parsed into:
response.Auth.Field1
response.Auth.Field2

Does that make sense?  

thanks!



ITWhisperer
SplunkTrust
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
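
The json capture group above, \{[^\}]+\}, stops at the first closing brace. That is fine for this sample because the payload is a single flat object, but if the payload ever contains nested objects, a greedy capture anchored on the trailing [] may hold up better. A sketch with a made-up nested payload (the code, detail and msg keys are hypothetical and not from the original log):

```spl
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":{\"code\":\"00\",\"detail\":{\"msg\":\"ok\"}}} []"
| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{.*\})\s\[\]$"
| spath input=json
```

This relies on each event being a single line that ends in [], since . does not match newlines by default.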