I have an event like:
{"app":"EventHub Service","caller":"kafka.go:110","fn":"gi.build.com/predix-data-services/event-hub-service/brokers.(*SaramaLogger).Println","lvl":"eror","msg":"Error closing broker -1 : write tcp 10.72.139.124:53006-\u003e10.7.18.82:9092: i/o timeout\n","t":"2017-09-13T15:26:56.762571201Z"}
I am seeing WARN messages in splunkd.log like:
09-13-2017 15:11:50.289 +0000 WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event (Wed Sep 13 15:10:59 2017). Context: source::/var/log/event_hub.log|host::pr-dataservices-eh-event-hub-16|eventhub:service|217289
The date is correct for each of the events, so my question is this: should I set DATE_FORMAT for this sourcetype to clean this up? Clearly Splunk is grumpy about it. Should I force Splunk to take the field t as the timestamp? This sourcetype's attributes are the system defaults; I am not setting any parameters "locally".
Any other thoughts are much appreciated!
I had a lot of those warnings, mostly because of falsely broken events.
It may be that you have to adapt the LINE_BREAKER.
Another possibility is to change the TIME_PREFIX to this:
TIME_PREFIX = \"t\":\"
By the way, this configuration goes on the universal forwarder that holds the log files:
INDEXED_EXTRACTIONS = JSON
So I have this as my props.conf on the HWF:
[eventhub:service]
TIME_PREFIX = \"t\":\"
MAX_TIMESTAMP_LOOKAHEAD = 75
NO_BINARY_CHECK = true
INDEXED_EXTRACTIONS = JSON
SHOULD_LINEMERGE = false
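As a sanity check outside Splunk, a short Python sketch (using the sample event from the question, with \u003e decoded to ->) can confirm that with this TIME_PREFIX the timestamp value falls well inside the 75-character MAX_TIMESTAMP_LOOKAHEAD window:

```python
import re

# Sample event from the question (the \n inside msg is the literal two
# characters backslash-n in the raw log line, hence the \\n below).
event = ('{"app":"EventHub Service","caller":"kafka.go:110",'
         '"fn":"gi.build.com/predix-data-services/event-hub-service/brokers.(*SaramaLogger).Println",'
         '"lvl":"eror","msg":"Error closing broker -1 : write tcp '
         '10.72.139.124:53006->10.7.18.82:9092: i/o timeout\\n",'
         '"t":"2017-09-13T15:26:56.762571201Z"}')

# TIME_PREFIX = \"t\":\"  -- Splunk treats this as the regex "t":"
m = re.search(r'"t":"', event)

# MAX_TIMESTAMP_LOOKAHEAD = 75: Splunk only scans this many characters
# after the TIME_PREFIX match for the timestamp.
lookahead = event[m.end():m.end() + 75]
print(lookahead)  # the full 30-character timestamp fits in the window
```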
Does this make sense?
I would add the following too:
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%9Q%Z
OR simply
TIME_FORMAT = %FT%T.%9Q%Z
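%9Q is Splunk's strptime extension for nine subsecond digits (nanoseconds); standard-library strptime implementations don't know it. As a hedged sanity check that the rest of the format matches the event's t field, this Python sketch trims the nanoseconds to microseconds (Python's %f accepts at most six fractional digits) and parses what remains:

```python
from datetime import datetime, timezone

raw = "2017-09-13T15:26:56.762571201Z"  # the t value from the sample event

# Python's %f only accepts up to 6 fractional digits, so drop the last 3
# (nanosecond) digits before parsing; Splunk's %9Q would consume all 9.
trimmed = raw[:26] + "Z"  # "2017-09-13T15:26:56.762571Z"

parsed = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%fZ")
parsed = parsed.replace(tzinfo=timezone.utc)  # the trailing Z means UTC
print(parsed.isoformat())
```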
Whenever possible you should specify timestamp-parsing configurations (and event-breaking configurations); if you let Splunk work them out automatically, the process is more cumbersome and resource-intensive.
And in this case, your date string sits too far into the event for Splunk to find it (by default it must appear within the first 128 characters).
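You can see this with the sample event from the question: a quick Python check shows the timestamp value starts beyond the 128-character default scan window, which is exactly why the DateParserVerbose warnings appear without an explicit TIME_PREFIX:

```python
# Sample event from the question, with \u003e decoded to -> and the
# literal backslash-n in msg written as \\n.
event = ('{"app":"EventHub Service","caller":"kafka.go:110",'
         '"fn":"gi.build.com/predix-data-services/event-hub-service/brokers.(*SaramaLogger).Println",'
         '"lvl":"eror","msg":"Error closing broker -1 : write tcp '
         '10.72.139.124:53006->10.7.18.82:9092: i/o timeout\\n",'
         '"t":"2017-09-13T15:26:56.762571201Z"}')

# Offset at which the timestamp value itself begins.
pos = event.index('"t":"') + len('"t":"')
print(pos)  # well past the default 128-character scan window
```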
This is why I was thinking of rewriting _time with data from the t field. Is this a bad idea?
No, that will be great. You would configure that on your parsing instance (heavy forwarder or indexer, whichever comes first in the data flow), and it will fix all new data that comes in after you configure it (it will not update already-ingested data).