
Splunk Connect for Kubernetes is parsing json string values as individual events

svaj925
Observer

We have applications running on the OpenShift platform (v3.11), and their logs are written to STDOUT. We have set up Splunk Connect for Kubernetes to forward logs from OpenShift to our Splunk instance. The setup is working and we are able to see and search logs in Splunk, but the issue we are facing is that every line of the application logs is displayed as an individual event in Splunk search. For example, if there is a JSON string in the logs, each value in the JSON output is displayed as an individual event, as shown in the screenshot below.

screenshot.png

Here is the raw log:

"sessionInfo" : {
"userId" : "",
"sessionId" : "bK7xzM16bpLXvaUGaWIODThJm9A",
"ipAddress" : "",
"endpoint" : ""
}
}

 

 

 

 

 I suspect some intermediary component between Openshift and splunk is doing this wrapping which might be throwing off the parser, that's my take but I'm not entirely sure about how Splunk Connect for Kubernetes is handling this. Any help/suggestions in fixing is greatly appreciated.

Thanks in advance!

 

Note: Similar application logs forwarded to the Splunk instance using a universal forwarder on a Linux machine are displayed in the correct format in Splunk search.


schose
Builder

Hi,


This sounds like a typical multiline issue. Check the CRI-O logs on the worker node where your application pods are running. If the JSON there is written line by line - which is what I would guess - then you will need to set up multiline support in fluentd: https://github.com/splunk/splunk-connect-for-kubernetes/blob/8ba455e7f988d129c9bfa13fdcf2025c6c4ae72...

... meaning you need to concatenate the event in fluentd before sending it to HEC.

Another working technical solution is to fix the events on the Splunk side, at HEC ingestion. If the sourcetype SCK sets for the damaged application is brokensourcetype, you can rename it, write it to aggQueue, and then fix it with Splunk's props.conf capabilities.

props.conf:

[brokensourcetype]
# rename the sourcetype, then route the event to the aggregation queue
TRANSFORMS-streams = renamest, write-agg

[jsonfixed]
... fix your json here

transforms.conf:

[write-agg]
# match every event and route it to aggQueue
REGEX = .*
DEST_KEY = queue
FORMAT = aggQueue

[renamest]
# match every event and rewrite its sourcetype to jsonfixed
REGEX = .*
FORMAT = sourcetype::jsonfixed
DEST_KEY = MetaData:Sourcetype
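
What belongs in the [jsonfixed] stanza depends on how the damaged events actually look, so the following is only a rough, untested sketch: assuming each broken event is a single line of a pretty-printed JSON document that should be merged back into one event, line-merging settings along these lines could be a starting point:

[jsonfixed]
# sketch only: merge consecutive lines, starting a new event at an opening brace
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\{
MAX_EVENTS = 512
# extract fields from the reassembled JSON at search time
KV_MODE = json

Adjust the break regex and MAX_EVENTS to whatever your application actually emits.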

Regards,

Andreas


svaj925
Observer

I tried adding multiline as below (since our JSON output starts with "{"):

multiline:
        firstline: /^{/
        flushInterval: 1

Now the JSON output is parsed correctly as a single event in some occurrences, but in other cases additional lines are included in the same event (see the example below).

screenshot1.png

Is there an option to add something like a last/end line parameter to the multiline options, or is there any other way to fix this?

I'd appreciate your help or suggestions regarding this issue.


mattymo
Splunk Employee

The fluentd concat filter has many settings that should help you here...

https://github.com/fluent-plugins-nursery/fluent-plugin-concat#parameter
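
For example, since these events start with "{" and end with "}" on lines of their own, the concat filter's multiline_end_regexp lets you close each event explicitly, which is effectively the "last/end line" option asked about above. A rough sketch only - the <filter> match tag is a placeholder to adapt to whatever tag your SCK fluentd pipeline uses, and, as far as I understand, SCK's multiline option is rendered into a concat filter much like this:

<filter tail.containers.**>
  @type concat
  # concatenate the "log" field of each record
  key log
  # a new event begins at an opening brace on its own line...
  multiline_start_regexp /^\{/
  # ...and ends at the matching top-level closing brace
  multiline_end_regexp /^\}/
  # flush an incomplete buffer after 5 seconds; see timeout_label for routing timed-out chunks
  flush_interval 5
</filter>

With an explicit end regexp, stray lines after the closing brace should no longer be glued onto the same event.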

 

Any luck with your logs yet?

- MattyMo

svaj925
Observer

In the container logs on the worker node, the JSON is line by line, and a "\n" is also included for each line:

{"log":"{\n","stream":"stdout","time":"2021-01-07T23:37:54.053738972Z"}
{"log":"  \"payload\" : \"\",\n","stream":"stdout","time":"2021-01-07T23:37:54.053746532Z"}
{"log":"  \"sessionInfo\" : {\n","stream":"stdout","time":"2021-01-07T23:37:54.053751058Z"}
{"log":"    \"userId\" : \"\",\n","stream":"stdout","time":"2021-01-07T23:37:54.053755004Z"}
{"log":"    \"sessionId\" : \"WT-3bF-VK35ZlXvqvUzT3kdkRsc\",\n","stream":"stdout","time":"2021-01-07T23:37:54.053759095Z"}
{"log":"    \"ipAddress\" : \"\",\n","stream":"stdout","time":"2021-01-07T23:37:54.053763148Z"}
{"log":"    \"endpoint\" : \"\"\n","stream":"stdout","time":"2021-01-07T23:37:54.053767198Z"}
{"log":"  }\n","stream":"stdout","time":"2021-01-07T23:37:54.053771398Z"}
{"log":"}\n","stream":"stdout","time":"2021-01-07T23:37:54.053780998Z"}