All Apps and Add-ons

Help parsing JSON embedded in fields from DB Connect input

scottsavareseat
Path Finder

I'm using DB connect 3.2 on Splunk 8.0.1. I have an input that brings in data from a mysql database. It looks like thins:

2020-03-17 10:29:16.000, f1="10", f2="2020-03-17 10:29:16", f3="something", f4="{"really": ["long"], "and": {"deep": "json"}}", f5="casdca", f6="more"

Field f4 contains about 13k of JSON. When viewing the event right now, f1, f2, f3 come out find. Splunk can't see f4 as containing JSON so it isn't parsed. f5 and f6, which you'd think are parsed right are not. They appear as being part of the value of the JSON f4 field.

What I would like is a way to have f5 and f6 properly parsed as fields and f4 to be expanded fully as their own JSON fields so that I can pull out fields as part of the search. Being a huge bit of JSON search time extraction is preferred.

I've tried setting KV_MODE to JSON and I've tried playing with transforms logic:
[forseti_violations_all_fields]
REGEX = , ([a-zA-Z0-9-_]+)="(.*)"(, |$)
FORMAT = $1::$2
But none of it has worked at all.

Any ideas? Thanks!

0 Karma
1 Solution

scottsavareseat
Path Finder

I'm not suggesting this is an answer. However, as a workaround I developed a script that will pull from the mysql database, take the already json fields and convert them to python objects, then upload the event via Splunk HEC. The script itself is about 110 lines including comments and is reusable.

View solution in original post

0 Karma

scottsavareseat
Path Finder

I'm not suggesting this is an answer. However, as a workaround I developed a script that will pull from the mysql database, take the already json fields and convert them to python objects, then upload the event via Splunk HEC. The script itself is about 110 lines including comments and is reusable.

0 Karma

woodcock
Esteemed Legend

Perfect! Click Accept on your answer.

0 Karma

scottsavareseat
Path Finder

I'm wondering if it is easier to tell DB Connect to just write the event as a JSON blob to HEC instead of using field=value format. Is that possible?

0 Karma

woodcock
Esteemed Legend

Your RegEx is wrong; try this:

REGEX = ,\s+([a-zA-Z0-9-_]+)="(.*?)"(?=, |$)

See here:
https://regex101.com/r/bV62Ui/1

0 Karma

scottsavareseat
Path Finder

Thanks for the regex, but it doesn't work in a real example. Looks like the JSON sample above isn't truely indicative of the data. The JSON is much longer and has a lot of quote comma space patterns in it:
2020-03-17 10:29:16.000, f1="10", f2="2020-03-17 10:29:16", f3="something", f4="{"really": ["long"], "and": {"deep": "json", "test": "foo"}}", f5="casdca", f6="more"
Even that is wrong for some reason. In the real data f4 extracts to just '{' and I don't see f5 and f6 getting extracted at all.

I'm wondering if there is a way to tell DB Connect to submit the entire event as JSON instead of field=value.

Thanks, any more ideas?

0 Karma

to4kawa
Ultra Champion
0 Karma
Get Updates on the Splunk Community!

Splunk Forwarders and Forced Time Based Load Balancing

Splunk customers use universal forwarders to collect and send data to Splunk. A universal forwarder can send ...

NEW! Log Views in Splunk Observability Dashboards Gives Context From a Single Page

Today, Splunk Observability releases log views, a new feature for users to add their logs data from Splunk Log ...

Last Chance to Submit Your Paper For BSides Splunk - Deadline is August 12th!

Hello everyone! Don't wait to submit - The deadline is August 12th! We have truly missed the community so ...