My clients are sending logs to splunk via syslog-ng using the following destination:
destination d_splunk-test-tcp {
tcp('splunk-host' port(1234) template('$ISODATE $HOST $PROGRAM[$PID]: $MSG'));
};
My logs look something like this, but larger at around 1000-1500 characters:
2013-07-26T13:57:03+00:00 client-host some_process[12345]: [INFO] This is a message
So far, the only way I've been able to get it to work perfectly is to add some unique characters at the end of the message, then use splunk's LINE_BREAKER to separate based on that unique text. I suppose this could work, but it seems like a poor solution.
My research tells me that date/time recognition must not be working, but I can't find any combination of TIME_FORMAT directives that seem to work. The closest I've been able to get is that all of my events will start with the timestamp, but some of the log messages get merged together in one event. If I have really short messages, they usually end up in an event by themselves. I thought this was an issue with MAX_TIMESTAMP_LOOKAHEAD , but that didn't solve the problem either.
Here's what my props.conf looks like:
[testing]
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%:z
MAX_TIMESTAMP_LOOKAHEAD = -1
Performance isn't a big deal at the moment. I've got a bunch of servers I can throw at this if needed. What else can I try to keep my messages from getting merged together?
EDIT, Here's some more info:
I wrote this simple script to send 10 messages rather quickly to syslog-ng, which then forwarded the messages to splunk:
# for NUM in {1..10}; do echo "message ${NUM}" | logger -i -t testing; done
I confirmed with tcpdump that each message was sent in a separate packet. Here's a snippet of a few of the packets (tcpdump -n -nn - ieth0 -s0 -A port 9876):
10:11:13.487966 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 63:126(63) ack 1 win 14600
E..g..@.>..3
...
. IH.&.....CLc.P.9.....2013-07-26T17:11:13+00:00 las1-app077 testing[17191]: message 1
10:11:13.487977 IP 10.144.9.73.9876 > 10.144.128.24.18446: . ack 126 win 5840
E..(..@.@..|
. I
...&.H.CLc.....P.......
10:11:13.489953 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 126:189(63) ack 1 win 14600
E..g..@.>..2
...
. IH.&.....CLc.P.9..y..2013-07-26T17:11:13+00:00 las1-app077 testing[17193]: message 2
10:11:13.489960 IP 10.144.9.73.9876 > 10.144.128.24.18446: . ack 189 win 5840
E..(..@.@..{
. I
...&.H.CLc....DP.......
10:11:13.491520 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 189:252(63) ack 1 win 14600
E..g..@.>..1
...
. IH.&....DCLc.P.9..:..2013-07-26T17:11:13+00:00 las1-app077 testing[17195]: message 3
And here's what it looks like in splunk. Please note that this is all a single event in the splunk UI:
2013-07-26T17:11:13+00:00 las1-app077 testing[17191]: message 12013-07-26T17:11:13+00:00 las1-app077 testing[17193]: message 22013-07-26T17:11:13+00:00 las1-app077 testing[17195]: message 32013-07-26T17:11:13+00:00 las1-app077 testing[17197]: message 42013-07-26T17:11:13+00:00 las1-app077 testing[17199]: message 52013-07-26T17:11:13+00:00 las1-app077 testing[17201]: message 62013-07-26T17:11:13+00:00 las1-app077 testing[17203]: message 72013-07-26T17:11:13+00:00 las1-app077 testing[17205]: message 82013-07-26T17:11:13+00:00 las1-app077 testing[17207]: message 92013-07-26T17:11:13+00:00 las1-app077 testing[17209]: message 10
... View more