(Trying to pull a few similar discussions together and record them here for posterity)
The current Docker Logging Driver for Splunk sends HTTP events to Splunk as single-line JSON events because Docker treats each line flushed to stdout/stderr as a separate event (thanks @halr9000). For example, if your data would normally be access_combined, you're in a pickle trying to sourcetype the line payload. How can this be addressed so I can enjoy the power of my existing sourcetypes with this HTTP Event Collector payload from Docker?
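For illustration, the payload the driver posts to HEC looks roughly like this (the values are made up): the raw access-log line ends up buried inside the "line" key of a JSON object instead of being the event itself, so the access_combined extractions never see it.

{
  "event": {
    "line": "127.0.0.1 - - [14/Nov/2016:15:22:03 -0500] \"GET / HTTP/1.1\" 200 2326",
    "source": "stdout",
    "tag": "abc02be1be4e"
  }
}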
The strongest solution is in the works! That is for the Docker Logging Driver for Splunk to transmit to HTTP Event Collector in raw mode (rather than JSON), so the events won't get surrounded by JSON and our normal field extraction will work.
Our yogi @Michael Wilde has been tracking a PR with Docker specifically for this. If and when that's implemented, I hope to update this accordingly.
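If and when that lands, enabling it should look something like the sketch below (the splunk-format option is what eventually shipped in Docker 1.13, per the update further down this thread; the URL, token, sourcetype, and image name are placeholders):

docker run \
  --log-driver=splunk \
  --log-opt splunk-url=https://splunk.example.com:8088 \
  --log-opt splunk-token=<your-HEC-token> \
  --log-opt splunk-format=raw \
  --log-opt splunk-sourcetype=access_combined \
  your-image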
How can the colon-separated fields like "line", "tag", and "source" be extracted automatically on source=http:docker for Docker logs when using HTTP Event Collector? Also, if the Docker logs contain key-value pairs, how can those appear as fields in Splunk?
For example the log has the following :
{
  line: 2016-11-14 15:22:03,779; [LOG=debug, NAME=bhav, TIME=1,MSG=Goodday, CLIENT=127.0.0.1]
  source: stdout
  tag: abc02be1be4e
}
I need to see line, source, and tag as fields, and the key-value pairs should also show up as fields like LOG, NAME, MSG, and CLIENT. Can this be done, and if so, how? We would want a permanent solution so that it can be applied enterprise-wide.
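For reference, one way to get there at search time today is a props.conf stanza on the search head, roughly like the sketch below. The sourcetype name (httpevent) and the regex are assumptions based on the sample event above: KV_MODE = json surfaces line, source, and tag, and the EXTRACT pulls the key-value pairs out of the line field.

# props.conf on the search head (sketch; adjust the sourcetype to whatever your HEC token assigns)
[httpevent]
KV_MODE = json
EXTRACT-docker_line_kv = LOG=(?<LOG>[^,\]]+),\s*NAME=(?<NAME>[^,\]]+),\s*TIME=(?<TIME>[^,\]]+),\s*MSG=(?<MSG>[^,\]]+),\s*CLIENT=(?<CLIENT>[^,\]]+) in line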
Keep in mind that raw events are only supported in Splunk 6.4 and onwards.
Yes! Thanks for adding that info!
Just to clarify: this solution solves challenge 1, not 2. Multi-line events like stack traces are still not handled properly because stderr/stdout streams from different containers are interleaved as they are aggregated by the Docker logging driver.
When logs are being forwarded from the filesystem, the indexer is able to join lines like stack traces with the appropriate sourcetype. What is the indexer using to determine that the lines can be joined? Is it the "source"? If so, is it possible to have the log driver stream the logs to the indexer with some unique identifier for the container source? Or am I misunderstanding the mechanics of the line joining?
You can certainly try but I believe this comes down to the way the payload is "cooked" or "parsed" by Splunk.
Ultimately, I believe you can do as you describe, but if there is too much delay or if you define the sourcetype wrong, then it will not work. Today, Docker sends each line of the stack trace as an individual event.
Rumor is that Docker is exploring switching to a plugin model rather than a driver model for this, so maybe this will all change anyway.
If you try this out, let us know how it works.
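For context on the file-monitoring comparison above: when Splunk reads a log file itself, the indexer joins stack-trace lines using the sourcetype's parse-time event-breaking rules, something like the sketch below (the sourcetype name and timestamp pattern are illustrative). Those rules only apply to data Splunk parses, which is why they don't help with the per-line JSON events the driver sends today.

# props.conf on the indexer (sketch)
[my_java_app]
SHOULD_LINEMERGE = true
# keep appending lines to an event until the next line that starts with a timestamp
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
MAX_EVENTS = 512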
@rarsan - Are you sure? I thought the logging driver sends data from the container itself and so different containers send different streams.
Are there any updates on multi-line events? I searched around, but this is the closest post that discusses this.
@Michael Wilde - Is this because Docker still spits out each line individually or has this been adjusted on the docker side so as to send a multiline output as one event?
Correct, @SloshBurch: log messages are lines. Docker won't solve this by nature. Multiline event aggregation isn't something many log tools other than Splunk do well. Our log driver would almost require a "little Splunk" inside it to properly aggregate events. There isn't a reason why a customer can't implement the HEC within their own app (running inside the container).
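As a concrete illustration of that last point, an app inside the container can post a whole stack trace to HEC as a single event itself, bypassing the logging driver. A minimal sketch with curl (URL, token, and sourcetype are placeholders):

curl -k https://splunk.example.com:8088/services/collector/event \
  -H "Authorization: Splunk <your-HEC-token>" \
  -d '{"sourcetype": "java_stacktrace", "event": "Exception in thread \"main\" java.lang.NullPointerException\n\tat com.example.App.main(App.java:42)"}'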
@Michael Wilde can you clarify whether Splunk is working on a fix for issue #2 (i.e., multi-line stack traces)? The big problem with your statement "There isn't a reason why a customer can't implement the HEC within their own app (running inside the container)" is that nowadays, with so many Docker containers being published directly on Docker Hub etc., if any of these applications produce multi-line output they don't work with your Docker Splunk logging driver. It's not practical to get all these pre-built Docker images to change and add support for the Splunk HEC appenders. I think the only place that is capable of fixing this issue is directly in the Splunk Docker logging driver. It would somehow need to aggregate the events there first before sending to Splunk, or perhaps have some additional capabilities on the server side to merge them together, using the container ID to ensure logs from different containers aren't merged together.
In my case I have support for Docker under a Red Hat agreement as well as support for Splunk Enterprise. Where is the underlying issue being tracked? Is there a bug opened already for the Splunk Docker driver?
Do we need another topic started to track this second issue, since it clearly isn't solved? 😞
Note, I don't see Docker itself ever being able to fix this, since stack traces will always span multiple lines. The only other thing I could think of would be for the logging drivers to be updated to emit a special character in place of the newline, but even then you would run into the issue that Docker cannot send long lines (I think it's a 16k limitation right now). We need a workable solution for this issue. Can Splunk help?
Let me see if I can get the folks who are now working on Docker to elaborate on this.
Hmm, @Michael Wilde, how would that be different from the HEC collecting the data outside the container (today's common method)?
It sounds like you are saying that Docker doesn't even provide a mechanism to change how it chunks the output it writes to its standard out.
From @dgladkikh: Raw format has been merged to master https://github.com/docker/docker/pull/25786
So it should be available in 1.13, and it is possible to try it with experimental Docker or a custom build from master.
Slight update: it sounds like 1.13 is more easily accessible these days, and you can start using the new driver: http://blogs.splunk.com/2016/12/01/docker-1-13-with-improved-splunk-logging-driver
Looks like 1.13 just recently came out of beta!