Getting Data In

Timeout on multiline parsing?

PickleRick
SplunkTrust
SplunkTrust

Hello.

I'm trying to understand something.

I have a monitor input reading a file from a tk10x logger (a part of OpenGTS). It logs some data beginning with a line containing "Recv:" string and following that with some data parsed by the receiver from a network-received frame.

I have therefore a sourcetype defined with:

[opengts-events]
BREAK_ONLY_BEFORE = Recv:
BREAK_ONLY_BEFORE_DATE = true
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
category = Custom
disabled = false
pulldown_type = 1

(then come some extracts but they are not important here).

To be honest, I'm not sure why I have BREAK_ONLY_BEFORE_DATE set (I configured the sourcetype from the webui). But for now the sourcetype looks like that.

With this setup I'd expect the input to split only on lines containing "Recv:" string. And for most of the time it does so. But also there are some breaks in other places.

For example, this excerpt from log:

(event data from few hours earlier)
[INFO_|03/03 00:19:33|ServerSocketThread$ServerSessionThread.handleClientSession:2895] (ClientSession_000) Remote client port: /1.2.3.4:15352 [to /1.2.3.4:31272]
[INFO_|03/03 00:19:33|AbstractClientPacketHandler.printSessionStart:273] Begin TCP session (ClientSession_000): 1.2.3.4
[INFO_|03/03 00:19:33|TrackClientPacketHandler.getHandlePacket:451] Recv: ##,imei:12345678,A
[INFO_|03/03 00:19:33|TrackClientPacketHandler.getHandlePacket:461] TK103-2 Header: ##,imei:12345678,A
[INFO_|03/03 00:19:33|TrackClientPacketHandler.getHandlePacket:478] Logon ACK: LOAD
[INFO_|03/03 00:19:33|ServerSocketThread$ServerSessionThread.handleClientSession:3004] (ClientSession_000) TCP Resp Asc: LOAD
(next event starting with line containing Recv:)

This log gets split properly at the line containing "Recv: ##,imei:12345678,A", but:

1. First two lines are getting extracted as an event separate from the earlier data even though it doesn't start with a line including "Recv:" (as I wrote, the last data written before that was from few hours earlier). Why? Does the input monitor just have some timeout that it hits and therefore flushes the buffer?

2. If I sort the events by _time in search, I get the "later" event (the last four lines from the excerpt shown above) shown earlier than the one consiting of the first two lines. Strange.

 

Labels (2)
Tags (1)
0 Karma
Get Updates on the Splunk Community!

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...