Getting Data In

How to parse logs with a mix of JSON and non-JSON

khenson
Engager

Hi.  I have log source that has a mix of various field types and then a larger nested JSON payload.  I can't quite wrap my head around how to parse this out in our SplunkCloud environment.

High level, the log contains this:

  • date field
  • server name field (separated by four dashes most of the time, but some env have three)
  • process name[PID] 
  • source code function variable field ending with a colon char
  • source code function variable's value, which may or may not have special chars like () 
  • JSON 
  • ends with []

Sanitized example:

Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {"response":"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>"} []

 

My grok-fu is not great.  Would appreciate any suggestions.  Thanks!

Labels (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response

View solution in original post

0 Karma

khenson
Engager

Thank you very much for your time and sharing this.  I will start looking at how to incorporate this in my sourcetype in SplunkCloud.

0 Karma

khenson
Engager

I see an error in my sanitized log, the last </Field43> should have been </Auth>.  This seems pretty close, but the JSON didn't appear to show up as separate fields.  I was thinking that the JSON could be parsed into:
response.Auth.Field1

respoonse.Auth.Field2

Does that make sense?  

thanks!

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

I made a few other corrections/assumptions about your sanitised example:

| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field39><Field40>False</Field40><Field40/><Field41/><Field42/><Field43/><Field44/></responseMessage></Auth>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
| spath input=json
| spath input=response
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
| makeresults
| eval _raw="Oct 1 20:04:22 my-web01-aa-env my-web[14597]: app.NOTICE: Gateway Transaction Response (BP) {\"response\":\"<Auth><Field1>ch-abcd1234-ab12-ab12-ab12-abcdef123456</Field1><Field2>0123</Field2><Field3>0123</Field3><Field4>Successful Request</Field4><responseMessage><Field5>0123</Field5><Field6>0123</Field6><Field7>0123</Field7><Field8>0123</Field8><Field9>0123</Field9><Field10>0123</Field10><Field11>0123</Field11><Field12>0123</Field12><Field13>012 - Approved (APPROVAL 001077)</Field13><Field14>Approved: 012345 (approval code)</Field14><Field15>00000</Field15><Field16>Address not available (Address not verified)</Field16><Field17>40</Field17><Field18></Field18><Field19>012345</Field19><Field20>20211001</Field20><Field21>0</Field21><Field22>840</Field22><Field23>USD</Field23><Field24>2021-10-01 16:04:21.493</Field24><Field25>2021-10-01 16:04:21.493</Field25><Field26>0123456</Field26><Field27>0123</Field27><Field28>0123456</Field28><Field29>0006</Field29><Field30>ABCDEF</Field30><Field31>Abc</Field31><Field32>ABC ABC ABC</Field32><Field33>Abcd</Field33><Field34>00</Field34><Field35>ABCDEF 012345 </Field35><Field36>ABCDEF0123</Field36><Field37>4F:A0000000041010;95:0000008000;9F10:0110A040002A0000000000000000000000FF;9B:E800;91:6325A37CFBC5CEDD0012;8A:</Field37><Field38>012345012345</Field38><Field39>abcd</Field38><Field39>False</Field39><Field40 /><Field40 /><Field41 /><Field42 /><Field42 /></Field43></Field43>\"} []"



| rex "(?<datetime>\w+\s+\d+\s+\d\d:\d\d:\d\d)\s(?<server>[^\s]+)\s(?<process>[^\[]+)\[(?<pid>[^\]]+)\]:\s(?<variablefield>[^:]+):\s(?<variablevalue>[^\{]+)\s(?<json>\{[^\}]+\})\s\[\]"
0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...