How to parse logs with a mix of JSON and non-JSON

khenson
Engager

Hi. I have a log source that mixes several plain-text fields with a larger nested JSON payload. I can't quite wrap my head around how to parse this out in our Splunk Cloud environment.

High level, the log contains this:

  • date field
  • server name field (separated by four dashes most of the time, but some environments have three)
  • process name[PID] 
  • source code function variable field ending with a colon char
  • source code function variable's value, which may or may not have special chars like () 
  • JSON 
  • ends with []

Sanitized example:

Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {"response":"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>"} []

 

My grok-fu is not great.  Would appreciate any suggestions.  Thanks!

1 Solution

ITWhisperer
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response
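
If the XML fields should carry a response. prefix (for example response.Auth.Field1), one possible approach is a wildcard rename after the second spath. This is a sketch, not a confirmed part of the solution: spath input=response names the extracted fields after their XML path (Auth.Field1, Auth.responseMessage.Field5, and so on), so the rename only puts response. in front. The event is a shortened version of the sanitized example, cut down here as an assumption to keep the search readable.

```spl
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><responseMessage><Field5>0123</Field5></responseMessage></Auth>\"} []"
| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response
| rename "Auth.*" AS "response.Auth.*"
```

With this shortened event, the results should include response.Auth.Field1, response.Auth.Field2 and response.Auth.responseMessage.Field5 next to the fields extracted by the rex.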


khenson
Engager

Thank you very much for your time and for sharing this. I will start looking at how to incorporate this into my sourcetype in Splunk Cloud.


khenson
Engager

I see an error in my sanitized log: the last </Field43> should have been </Auth>. This seems pretty close, but the JSON didn't appear to show up as separate fields. I was thinking that the JSON could be parsed into:

response.Auth.Field1
response.Auth.Field2

Does that make sense?

Thanks!



ITWhisperer
SplunkTrust

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
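
One caveat with the json capture above, offered as a hedged sketch: \{[^\}]+\} stops at the first closing brace. If the payload really is nested JSON, as the question describes, the rest of the pattern no longer lines up with the trailing [] and the rex extracts nothing. A variant that captures greedily up to the last closing brace before the final [] may be more tolerant. Only the json group differs from the rex above. The meta object in the sample event is hypothetical and is there only to show the nesting; the sketch also assumes single-line events.

```spl
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>0123</Field1></Auth>\",\"meta\":{\"attempt\":1}} []"
| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{.+\})\s\[\]"
| spath input=json
| spath input=response
```

The trade-off is that a greedy capture relies on every event ending in a closing brace followed by [], which matches the layout described in the question.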