Getting Data In

How to parse Docker logs with multiple events from stacktrace?

lassemammen
Explorer

When using the Docker Splunk logging driver to send events into the HTTP Event Collector, Splunk logs individual events like this:

{"line":"the message","source":"stdout","tag":"container tag"}

Unfortunately, for stack traces from Tomcat/log4j, it separates the stack trace into multiple log events, one per line, bottom line first, like this:

{"line":"\tat java.lang.Thread.run(Thread.java:745)\r","source":"stdout","tag":"33f25cc98f0c"}
{"line":"\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\r","source":"stdout","tag":"33f25cc98f0c"}
{"line":"\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\r","source":"stdout","tag":"33f25cc98f0c"}

This makes it nearly impossible to use. Does anybody know a way to either combine them in Splunk or get Tomcat to emit them as a single line?

Hope someone out there is able to help.


onimishra
Explorer

I've finally found a way around the problem, at least for Splunk.

Note that with this solution, you'll not be able to use any of the added labels that the Splunk log driver gives you. My team got around this by including all the information within the log statement itself (through either the MDC or the logging configuration of the service in the container).

Instead of using the Splunk log driver (which gives little control over what happens to the data), we used the syslog driver, configured to send its data via UDP to a normal Splunk installation.

Remember to set up Splunk to listen for the data. This can be done in Settings -> Data Inputs -> UDP (under Local inputs); then hit New. Set the port to 1514. If you are running Splunk in a Docker container, add a port binding that maps 514 to 1514 (as 514 is the standard syslog port). Create a new source type in the next screen and give it a name you can remember, as we need to configure it afterwards.

Now, when a log event occurs within your service container, it is logged to syslog, which sends it to Splunk. It does, however, arrive with the syslog data (date, tags, etc.) prepended to your log statement, and each line is interpreted as its own event.
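The container side of this setup looks roughly like the following (host, port, and image name are placeholders):

```shell
docker run --log-driver=syslog \
    --log-opt syslog-address=udp://your-splunk-host:514 \
    your-image
```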

When the UDP input has been created, it is time to configure the source type. Go to Settings -> Source Types and search for the source type you've just created.
First, let's remove the syslog data from the log statement. In the Advanced section, click "New setting" at the bottom and name it SEDCMD-01. This runs a SED command, which lets us modify the input prior to indexing (based on the rules described here). On my Unix system the syslog prefix looks something like this:

[{date} something=docker horse=horse]: Start my log statement

To remove this, I set the value of SEDCMD-01 to "s/.+]: //g" (without quotes). This matches everything up to and including "]: " and removes it, and since that is how the syslog prefix ends, we are left with just the pure log statement. This also means that any tags configured with the syslog driver are removed prior to indexing, so keep all the extra information you want within the log statement itself.
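To see what that SED expression does, here is a small standalone Java sketch (the sample prefix is made up to mirror the format above):

```java
// Standalone demo of what SEDCMD-01 = s/.+]: //g does to one line.
public class SedDemo {
    public static void main(String[] args) {
        String raw = "[Jun 30 08:15:30 something=docker horse=horse]: Start my log statement";
        // Greedily match everything up to and including "]: " and drop it
        String cleaned = raw.replaceAll(".+]: ", "");
        System.out.println(cleaned);  // Start my log statement
    }
}
```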

All of my own log statements start with a date and time like "2016-06-30 08:15:30,423". I use this to create an event break point. In the Event Breaks section of the source type, I put "\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d" (without quotes) in the pattern box. Doing so creates a BREAK_ONLY_BEFORE entry in the Advanced section (so, by default, the break will be before my date). Note that this is not a line break but an event break: Splunk keeps gathering lines into the same event until it reaches a new date.
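Put together, the resulting props.conf stanza looks roughly like this (the stanza name is whatever you called the source type when creating the UDP input):

```
[my_docker_syslog]
SEDCMD-01 = s/.+]: //g
BREAK_ONLY_BEFORE = \d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d
```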

In my experience (running Splunk in a Docker container), I had to restart the container for the changes to take effect. Otherwise, you should now be done!
I've attached below some images of my configuration in the source type, and of what a log statement looks like in Search and Reporting.

I hope this helps. Please ask more questions if it doesn't, so that we can get more people to use docker and splunk together 🙂

Source Type

Log Statement

mahendra5
New Member

@onimishra that is well explained and works great, as long as the rsyslog timestamp does not use milliseconds.
We introduced milliseconds in rsyslog, and now the stack traces are no longer aggregated as they were without milliseconds.

Using Docker syslog logging driver:

Case 1: stack traces are captured correctly in Splunk when the rsyslog.conf/docker logs timestamp format is: Jul 13 17:03:59

Case 2: stack traces are broken up again when the rsyslog.conf/docker logs timestamp format is: 2017-07-13 19:02:49.549
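A possible explanation (an assumption, since the full configuration isn't shown): with the ISO format, the rsyslog timestamp at the start of every forwarded line now also matches the \d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d break pattern from the earlier answer, so every line is broken into its own event again. Tightening the pattern to require log4j's comma-separated milliseconds, which the dot-separated rsyslog timestamp won't match, might help:

```
BREAK_ONLY_BEFORE = \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}
```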


nickperry
Explorer

I downvoted this post because it is just the wrong way to approach the problem, in my opinion, at least in the case that you control the application code. I guess there may be some merit to this syslog approach in a siloed environment.


onimishra
Explorer

What application code are you referring to here: my service within my Docker container, Docker, or Splunk? I have made no changes to my own application code to get this to work, and it is only the syslog from inside the Docker container that is forwarded to the Splunk server. If you have concerns about other applications or services sending data to port 514 on the Splunk server, then bind it to something else and have the syslog driver send to the new port.
Within your application you have access to the container ID, which you could put into the MDC during startup in order to tell your instances apart in the log.
If you want to centralise the Splunk server, have a forwarder run on each machine to collect the data. See this link for more on that approach; it is what my solution is based on.
http://blogs.splunk.com/2015/08/24/collecting-docker-logs-and-stats-with-splunk/


lassemammen
Explorer

@onimishra There is definitely something to be said for not changing application code for specialised logging. My proposed change would render the output virtually unusable for developers, though that could be solved with multiple loggers. Your approach is neat because it avoids containers going down or failing to start when Splunk is down or being upgraded (see https://github.com/docker/docker/issues/21966).

However, if you want to ensure log output actually reaches Splunk, UDP/syslog wouldn't fulfil this requirement (no ack). So, as with anything, it depends on your priorities and requirements. We have Splunk forwarders running on each Docker host and send Docker logs directly into them, so I would prefer modifying the code and using the HTTP Event Collector.


lassemammen
Explorer

Cheers @nickperry, that's exactly what I've done now. For anybody interested, it seems the easiest way with log4j is to implement your own ThrowableRenderer; this is what we're using now:

package com.mypackage;

import org.apache.log4j.EnhancedThrowableRenderer;
import org.apache.log4j.spi.ThrowableRenderer;

public class CustomThrowableRenderer implements ThrowableRenderer {

    private final EnhancedThrowableRenderer renderer = new EnhancedThrowableRenderer();

    @Override
    public String[] doRender(Throwable t) {
        // Join all stack-trace lines into a single string, marking each
        // line break with the literal two-character sequence "\n"
        // (note: appending to a null-initialised String element would
        // prepend "null", hence the StringBuilder)
        StringBuilder joined = new StringBuilder();

        for (String line : renderer.doRender(t)) {
            joined.append(line.replace("\n", "")).append("\\n");
        }

        // log4j expects an array of lines; return one combined line
        return new String[] { joined.toString() };
    }
}

and register it in your log4j configuration as log4j.throwableRenderer=com.mypackage.CustomThrowableRenderer. It should then be possible to get Splunk to interpret '\n' as a newline.
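One possible search-time workaround for turning the literal '\n' back into a real newline (a sketch, not verified against a live instance; the exact backslash escaping may need adjustment): use eval's replace(), with urldecode producing the newline character:

```
... | eval _raw=replace(_raw, "\\\\n", urldecode("%0A"))
```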

kjoyner
Engager

Can you expand on "Should be possible then to get splunk to interpret '\n' as a newline"? I was able to get the stack traces into Splunk with the '\n' in the output. However, Splunk just shows me the logs with '\n' as literal characters in the line, not treating it as a newline, which makes reading the stack traces difficult. Is there a trick to get Splunk to interpret the '\n' as a newline?

onimishra
Explorer

Well, if the interest is in the "line" field, I personally find it annoying that it is wrapped within a JSON object, instead of just the line being output when I make my searches.


lassemammen
Explorer

Just use | spath path=line output=_raw in your searches
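In full, with a placeholder sourcetype name, such a search would look something like:

```
sourcetype=docker_logs
| spath path=line output=_raw
```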


onimishra
Explorer

But the indexer is unable to do that and parse out only the "line" segment, right?


lassemammen
Explorer

Not sure what you're asking, but if you mean field extractions, that works fine for me by adding | extract to the query after the previously mentioned spath.


nickperry
Explorer

Cool. Thanks for the code. We have some applications that should benefit from this. Will try it shortly.


onimishra
Explorer

How do you feel about the Search and Report statements being wrapped in JSON? (read this in a scientifically interested tone, not a judging one 😉 )


nickperry
Explorer

Can you clarify what you mean?


nickperry
Explorer

Whichever Docker log driver you use, you'll face the same issue. Docker treats each line flushed to stdout/stderr as a separate event. This is perfectly reasonable behaviour, as it cannot know what constitutes the start and end of a multi-line log event.

I came to the conclusion that the easiest way to address this was at the source: alter your log4j layout to escape the newlines in stack traces.

tgotzsche
Engager

Splunk guys, you really need to answer this... we have the same issue and are just about to purchase a Splunk Enterprise license, but this is a show stopper for sure.
