Syslog JSON question

willemjongeneel
Communicator

Hello,

I've got a question on getting Splunk to extract key value pairs from syslog json events.

The events look like this:

<14>Mon Aug 12 12:29:29 UTC 2019Info: { //json part}\x00

At first I tried the standard _json sourcetype. This didn't work, so I tried to make a custom sourcetype that would remove the parts before and after the JSON.

I've tried to add
SEDCMD-end=s/\x00//g
SEDCMD-start=s/^[^{]+//g
KV_mode=json

When I test the sourcetype using the Add Data wizard in Splunk Web, I see the parts before and after the JSON disappear. After I changed the sourcetype to my custom sourcetype on the data source, this doesn't work, and I still get events with the parts before and after the JSON.

The full sourcetype conf:

ADD_EXTRA_TIME_FIELDS=True
ANNOTATE_PUNCT=true
AUTO_KV_JSON=true
BREAK_ONLY_BEFORE_DATE=true
CHARSET=UTF-8
DEPTH_LIMIT=1000
KV_mode=json
LEARN_MODEL=true
LEARN_SOURCETYPE=true
LINE_BREAKER=([\r\n]+)
LINE_BREAKER_LOOKBEHIND=100
MATCH_LIMIT=100000
MAX_DAYS_AGO=2000
MAX_DAYS_HENCE=2
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_EVENTS=256
MAX_TIMESTAMP_LOOKAHEAD=128
NO_BINARY_CHECK=true
SEDCMD-end=s/\x00//g
SEDCMD-start=s/^[^{]+//g
SEGMENTATION=indexing
SEGMENTATION-all=full
SEGMENTATION-inner=inner
SEGMENTATION-outer=outer
SEGMENTATION-raw=none
SEGMENTATION-standard=standard
SHOULD_LINEMERGE=true
TRUNCATE=10000
category=Custom
description=Sourcetype for SAM; this strips the extra syslog information and shows only the JSON
detect_trailing_nulls=false
disabled=false
maxDist=100
pulldown_type=true

Extra information:

This gets sent to Splunk Cloud from a forwarder that receives these events over a TCP port. On the forwarder, the port is mapped to the right index and sourcetype.

Can anyone advise me on how to get the key value pairs from these syslog/json events?

Thank you in advance, kind regards,
Willem


woodcock
Esteemed Legend

Here is the basic approach.
Figure out how to modify your events so that they are VALID JSON. Use this tool to check: https://jsonlint.com/
Once you know how to adjust them, fix them on the way in using SEDCMD- or other transforms.
DO NOT USE THE _json SOURCETYPE! Create your own sourcetype and use KV_MODE = json in props.conf.
That's it.
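
As a rough illustration of the first two steps, the Python sketch below applies substitutions like the two from the question to a sample event and checks that what remains parses as JSON. The JSON payload is invented (the thread elides it), the function names are hypothetical, and the trailing marker is assumed to be either four literal characters or a real NUL. Python's re module only approximates how Splunk applies SEDCMD, so treat this as a way to test the idea, not as the actual props.conf behavior.

```python
import json
import re

# Hypothetical sample event: the syslog prefix follows the thread, the JSON payload is invented.
SAMPLE_EVENT = '<14>Mon Aug 12 12:29:29 UTC 2019Info: {"user": "sam", "status": "ok"}\\x00'


def strip_to_json(raw: str) -> str:
    """Mimic the two substitutions: drop everything before the first '{', then the trailing marker."""
    without_prefix = re.sub(r"^[^{]+", "", raw)
    # Assumption: remove both a literal backslash-x00 sequence and a real NUL character.
    return re.sub(r"\\x00|\x00", "", without_prefix)


def is_valid_json(text: str) -> bool:
    """Return True when the text parses as JSON, similar to pasting it into a JSON linter."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


cleaned = strip_to_json(SAMPLE_EVENT)
print(cleaned, is_valid_json(cleaned))
```

Running a few real events through a check like this before touching the sourcetype shows whether the substitutions leave valid JSON behind.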

willemjongeneel
Communicator

Hello,

What do you mean by fix them on the way in? Is it possible to do this using the sourcetype wizard in Splunk Web? Or do I really need to access props.conf directly? Or is it necessary to have an HF in between to do this?

Event format:
Mon Aug 12 12:29:29 UTC 2019Info: { //json part}\x00

Also, with SEDCMD I can remove the first part with "s/<.{1,40}Info:\s//g"
For the last part I tried: "s/\x00//g" This somehow doesn't work. Do you have any idea why this is not working?

Kind regards,
Willem


woodcock
Esteemed Legend

Try adding additional \\ characters one by one until it works.
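
One possible reason the number of backslashes matters, offered as an assumption and not something confirmed in this thread: in a regex, `\x00` matches a single NUL character, while `\\x00` matches the four literal characters backslash, x, 0, 0. The Python sketch below shows the difference on two made-up strings. Splunk's own escaping layers may differ from Python's, and the names used here are hypothetical.

```python
import re

NUL_PATTERN = r"\x00"       # regex escape for the NUL character (code point 0)
LITERAL_PATTERN = r"\\x00"  # escaped backslash followed by the characters x, 0, 0

event_with_nul = '{"a": 1}' + chr(0)       # ends with one real NUL character
event_with_literal = '{"a": 1}' + "\\x00"  # ends with four printable characters


def remove(pattern: str, text: str) -> str:
    """Delete every match of the pattern from the text."""
    return re.sub(pattern, "", text)


# Each pattern only cleans the event variant it was written for.
for name, event in [("nul", event_with_nul), ("literal", event_with_literal)]:
    print(name, repr(remove(NUL_PATTERN, event)), repr(remove(LITERAL_PATTERN, event)))
```

The search posted later in the thread trims exactly four trailing characters, which is consistent with the literal form, though that remains an inference.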


willemjongeneel
Communicator

Hello,

Took a while, but this worked for me.

Thank you for your help!

Kind regards,
Willem Jongeneel

DavidHourani
Super Champion

Hi @willemjongeneel,

Have you tried using the spath command?
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Spath

You won't need any sed to apply it.


willemjongeneel
Communicator

Hello David,

I don't fully understand how to use this spath command. Should I extract the JSON and use that as the input field? Is this only possible at search time?

Can you maybe explain a little more on how to approach this?

Thanks, kind regards,
Willem


DavidHourani
Super Champion

Hi @willemjongeneel,

Yes, you can use this command in the search interface. It will let you troubleshoot why KV_MODE = json isn't giving you any results, and you'll know exactly what you need to keep from your raw data to get the extraction working.

Once you have identified that, you can apply the right sed expression to reshape your data. You can also use INDEXED_EXTRACTIONS = JSON instead of KV_MODE = json for better performance.


willemjongeneel
Communicator

Hello,

Thank you.

I got this working using a substring and spath. The full search is:

index= | eval _raw=substr(_raw, 39, (len(_raw)-42)) | spath input=_raw

This cuts off the part before and after the JSON. Is there a way to get this substring working from props.conf by using Splunk Web? I cannot change it any other way because I'm using Splunk Cloud.
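
The offsets in the search above can be checked outside Splunk. This Python sketch reproduces the substr arithmetic on a made-up event: SPL's substr is 1-based, so position 39 is the opening brace after the 38-character prefix, and len(_raw)-42 leaves off the last 4 characters. The JSON payload, the literal `\x00` suffix, and the helper name spl_substr are assumptions for illustration.

```python
import json


def spl_substr(text: str, start: int, length: int) -> str:
    """Approximate SPL substr(text, start, length) for a positive, 1-based start."""
    return text[start - 1 : start - 1 + length]


PREFIX = "<14>Mon Aug 12 12:29:29 UTC 2019Info: "  # 38 characters, as in the sample event
PAYLOAD = '{"user": "sam", "status": "ok"}'         # hypothetical JSON body
SUFFIX = "\\x00"                                     # assumed: four literal characters

raw = PREFIX + PAYLOAD + SUFFIX
# Same arithmetic as the search: start at character 39, keep len(raw) - 42 characters.
extracted = spl_substr(raw, 39, len(raw) - 42)
fields = json.loads(extracted)
print(fields)
```

This only holds while the prefix keeps a fixed width; a priority value with a different number of digits, for example, would shift the offsets.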

Kind regards,
Willem


DavidHourani
Super Champion

Well, you could use the SEDCMD you already created to remove the unwanted substring on the HF before sending data to Splunk Cloud. Include this as well: INDEXED_EXTRACTIONS = JSON, to replace spath.


willemjongeneel
Communicator

Hello David,

We are using a universal forwarder, not a heavy forwarder. Would this be possible using a universal forwarder?

Kind regards,
Willem


DavidHourani
Super Champion

No, just on an HF. Otherwise you'll have to put the config on the indexers, which means accessing the props.conf file... so maybe get support to do that for you?
