Getting Data In

How to parse logs with a mix of JSON and non-JSON

khenson
Engager

Hi.  I have a log source that mixes several plain-text fields with a larger nested JSON payload.  I can't quite wrap my head around how to parse this in our Splunk Cloud environment.

High level, the log contains this:

  • date field
  • server name field (contains four dashes most of the time, but some environments have three)
  • process name[PID] 
  • source code function variable field ending with a colon char
  • source code function variable's value, which may or may not have special chars like () 
  • JSON 
  • ends with []

Sanitized example:

Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {"response":"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>"} []

 

My grok-fu is not great.  Would appreciate any suggestions.  Thanks!

1 Solution

ITWhisperer
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response
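
If it helps to check the pattern against sample lines outside Splunk, here is a rough Python sketch of the first two steps (the regex split, then decoding the JSON part). It only illustrates the idea and is not how Splunk evaluates rex or spath. The name split_line and the shortened sample line are made up for this example, and Python needs the (?P<name>...) spelling for named groups.

```python
import json
import re

# Same pattern as the rex above, rewritten with Python's (?P<name>...) groups.
LINE_RE = re.compile(
    r"(?P<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s"
    r"(?P<server>[^\s]+)\s"
    r"(?P<process>[^\[]+)\[(?P<pid>[^\]]+)\]:\s"
    r"(?P<variablefield>[^:]+):\s"
    r"(?P<variablevalue>[^\{]+)\s"
    r"(?P<json>\{[^\}]+\})\s\[\]"
)


def split_line(line):
    """Hypothetical helper: return the named fields, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Roughly what "spath input=json" does: add the top-level JSON keys.
    fields.update(json.loads(fields["json"]))
    return fields


# Shortened version of the sanitised example (assumption: same layout).
SAMPLE = (
    "Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: "
    "Gateway Transaction Response (BP) "
    '{"response":"<Auth><Field1>0123</Field1></Auth>"} []'
)

fields = split_line(SAMPLE)
print(fields["server"], fields["pid"], fields["variablefield"])
print(fields["response"])
```

Two things to watch for in the pattern: the server name is matched as any run of non-space characters, so three or four dashes makes no difference, and the json group stops at the first closing brace, so a payload with nested objects or a } inside a value would probably need a different pattern.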


khenson
Engager

Thank you very much for your time and for sharing this.  I will start looking at how to incorporate this into my sourcetype in Splunk Cloud.


khenson
Engager

I see an error in my sanitized log: the last </Field43> should have been </Auth>.  This seems pretty close, but the JSON didn't show up as separate fields.  I was thinking that the JSON could be parsed into:
response.Auth.Field1

response.Auth.Field2

Does that make sense?  

thanks!
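
To make the naming I have in mind concrete, here is a small Python sketch (not SPL, and not a claim about what Splunk produces) that walks the XML held inside the JSON value and builds dotted names of that shape. The helper flatten_response and the shortened payload are made up for illustration, and it assumes every JSON value is a well-formed XML string.

```python
import json
import xml.etree.ElementTree as ET


def flatten_response(json_text):
    """Hypothetical helper: map dotted names to lists of leaf values.

    Assumes each top-level JSON value is a well-formed XML string.
    Values are kept in lists because a tag can repeat.
    """
    out = {}
    for key, xml_text in json.loads(json_text).items():
        root = ET.fromstring(xml_text)
        stack = [(root, f"{key}.{root.tag}")]
        while stack:
            node, path = stack.pop()
            children = list(node)
            if not children:
                # Leaf element: record its text (empty string if none).
                out.setdefault(path, []).append(node.text or "")
            # Push in reverse so children are visited in document order.
            for child in reversed(children):
                stack.append((child, f"{path}.{child.tag}"))
    return out


# Shortened payload from the sanitized example, with the closing tags fixed.
PAYLOAD = (
    '{"response":"<Auth>'
    "<Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1>"
    "<Field2>0123</Field2>"
    "<responseMessage><Field5>0123</Field5></responseMessage>"
    '</Auth>"}'
)

names = flatten_response(PAYLOAD)
for name, values in names.items():
    print(name, values)
```

Fields nested under responseMessage would come out one level deeper, for example response.Auth.responseMessage.Field5.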



ITWhisperer
SplunkTrust
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"