I'm a bit lost. Every piece of information I find on the web (as well as the material from Splunk's own training) says that the UF does only very limited input preparation (line breaking, metadata adjustment, character encoding) but no real parsing work.
So I'm confused to find entries like this in the logs:
12-03-2021 15:50:44.906 +0100 WARN DateParserVerbose - Failed to parse timestamp in first MAX_TIMESTAMP_LOOKAHEAD (128) characters of event. Defaulting to timestamp of previous event (Fri Dec 3 15:50:36 2021). Context:[...]
That would mean that some timestamp parsing does take place.
But do I still need to put my timestamp extraction config on the HF? (I use UF -> HF -> idx)
How does this relate to the settings for breaking events on a timestamp? If I want to break on the timestamp (which should happen on the UF, right?), do I need to provide the timestamp format on both the UF (for breaking) and the HF (for parsing)?
It depends. If you have a sourcetype that uses INDEXED_EXTRACTIONS, its data goes through the StructuredParsing pipeline, which means timestamp parsing happens on the UF. See https://wiki.splunk.com/Community:HowIndexingWorks (4. Detail Diagram - UF/LWF to Indexer)
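To make that concrete, here is a minimal props.conf sketch for the UF side. The sourcetype name, the field name, and the time format are hypothetical placeholders, not taken from any real add-on. The point is only that with INDEXED_EXTRACTIONS set, the timestamp settings for that sourcetype belong on the UF, because that is where the structured data gets parsed.

```
# props.conf on the UF (sketch; stanza and field names are made up)
[my_structured_csv]
INDEXED_EXTRACTIONS = csv
# Hypothetical header field that holds the event time
TIMESTAMP_FIELDS = date_time
# Assumed format; adjust to the actual data
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```

As far as I know, data parsed this way on the UF is not parsed again further down the chain, so props for such a sourcetype placed only on the HF would have no effect.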
This UF is ingesting Exchange logs. If I remember correctly, they are structured but they don't use indexed extractions (but I might be wrong; I don't have that environment at hand to check it quickly).
But in general, if I want to add another sourcetype with no indexed extractions, I should normally put the timestamp-related props on the HF, right?
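For example, something like this is what I have in mind for the HF. The sourcetype name is made up, and the format string is only my guess, modelled on the splunkd-style timestamp in the warning quoted earlier, so treat it as a sketch rather than a tested config:

```
# props.conf on the HF (sketch; hypothetical sourcetype)
[my_custom_sourcetype]
SHOULD_LINEMERGE = false
# Break before any line that starts with a date such as 12-03-2021
LINE_BREAKER = ([\r\n]+)(?=\d{2}-\d{2}-\d{4}\s)
# Timestamp is assumed to sit at the very start of each event
TIME_PREFIX = ^
TIME_FORMAT = %m-%d-%Y %H:%M:%S.%3N %z
# "12-03-2021 15:50:44.906 +0100" is 29 characters long
MAX_TIMESTAMP_LOOKAHEAD = 30
```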
Thanks for the link!