Hi all,
I have an XML feed that returns data like this:
<feed lastUpdate="1438698061185" version="2.0">
<doc>
<num>1</num>
<num2>13</num2>
</doc>
<doc>
<num>1</num>
<num2>13</num2>
</doc>
</feed>
As you can see, the timestamp (lastUpdate=) is at the top of the document, which contains 2 events in this example. The actual doc is much larger (1000+ events), so I'm trying to avoid splitting events at search time (using spath).
Is there any way I can assign this time to each event at index time? Or do I need to pre-process the XML doc?
Thanks!
You have 2 options:
1: If the data is dumped in near real time, you can use DATETIME_CONFIG = CURRENT,
which will cause Splunk to timestamp your events with the indexer's current system time (so that _time = _indextime).
2: Configure line breaking and timestamping in the normal way (see the props.conf sketch below) and, believe it or not, it will actually work as you'd like. The downside is that you will get a huge number of internal log messages like this:
2014 22:22:16.138 +0000 WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event (Wed Oct 22 22:22:14 2014). Context: source::XXX|host::YYY|ZZZ|3549
Because it defaults to the timestamp of the previous event: WIN!
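For reference, here is a minimal props.conf sketch covering both options. The sourcetype name xml_feed is just a placeholder, and the line-breaking regex, TIME_FORMAT and lookahead for option 2 are untested assumptions based on the sample feed above (use one stanza or the other, not both):

# Option 1: stamp every event with the indexer's current system time
[xml_feed]
DATETIME_CONFIG = CURRENT

# Option 2: normal line breaking and timestamping; only the first event
# contains lastUpdate, and later events inherit its timestamp
[xml_feed]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)<doc>
TIME_PREFIX = lastUpdate="
TIME_FORMAT = %s%3N
MAX_TIMESTAMP_LOOKAHEAD = 16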
Oh! This is really interesting, I did not know of that behaviour.
I could use index time as the timestamp for this use case; I was just checking whether I had missed something. This is super useful to know. Thanks!
Agreed, you would have to pre-process the XML file before dropping it into the monitored folder, or you could set up a scripted input that does this processing and sends the events to Splunk (a rough sketch of such a script is below).
How much delay do you see between the value in the header's lastUpdate and the time when the file is dropped into the monitored folder?
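If you go the pre-processing / scripted-input route, a minimal sketch could look like the following (names and paths are just examples, not a tested implementation). It copies the feed-level lastUpdate onto every <doc> and prints one event per line, so each event carries its own timestamp that props.conf can then extract:

#!/usr/bin/env python
# Sketch: copy the feed-level lastUpdate onto every <doc> and emit
# one event per line on stdout (suitable for a scripted input).
import sys
import xml.etree.ElementTree as ET

def main(path):
    feed = ET.parse(path).getroot()
    last_update = feed.get("lastUpdate")      # epoch-millisecond string from the header
    for doc in feed.findall("doc"):
        doc.set("lastUpdate", last_update)    # stamp each event with the feed time
        # collapse whitespace-only text so each event serializes on a single line
        for elem in doc.iter():
            if elem.text is not None and not elem.text.strip():
                elem.text = None
            elem.tail = None
        sys.stdout.write(ET.tostring(doc, encoding="unicode") + "\n")

if __name__ == "__main__":
    main(sys.argv[1])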
Thought so. I could use index time as the timestamp for this use case; I was just checking whether I had missed something Splunk could have done.
I don't know of any way to do this outside of pre-processing it.