Getting Data In

Is it possible to preserve original order of events?

Splunk Employee

Would someone kindly confirm if Splunk is expected to preserve the order of events as they are presented in the original log file during indexing? If it is not, is there a setting to force it to preserve order?

We are observing events being indexed out of original order when there is no sub-second resolution on the timestamp and hundreds/thousands of events are generated per second.

1 Solution

Splunk Employee

Events are persisted to disk in their original arrival order. Events are retrieved in inverse time order, with inverse arrival order used to break ties.

The "code" of the event, stored in the field _cd, is the pair (bucket id, arrival address). You can sort by ascending arrival address to see events in arrival order.
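To make the ordering rule concrete, here is a small Python model of it. This is only a sketch and not Splunk code: the `Event` type, the helper names, and the assumption that a _cd value is written as "bucket_id:arrival_address" are all illustrative, and the example keeps every event in a single bucket.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    time: int  # timestamp in whole seconds (no sub-second resolution)
    cd: str    # assumed format: "bucket_id:arrival_address"
    raw: str   # event text


def arrival_key(event: Event) -> Tuple[int, int]:
    """Split the assumed _cd string into (bucket id, arrival address)."""
    bucket, address = event.cd.split(":")
    return int(bucket), int(address)


def retrieval_order(events: List[Event]) -> List[Event]:
    """Inverse time order, with inverse arrival order breaking ties."""
    return sorted(events, key=lambda e: (e.time, arrival_key(e)), reverse=True)


def arrival_order(events: List[Event]) -> List[Event]:
    """Ascending arrival address, i.e. the order the events arrived in."""
    return sorted(events, key=arrival_key)


# Three events share the same second; the fourth is one second later.
events = [
    Event(100, "7:0", "first line"),
    Event(100, "7:1", "second line"),
    Event(100, "7:2", "third line"),
    Event(101, "7:3", "fourth line"),
]

retrieved = [e.raw for e in retrieval_order(events)]
arrived = [e.raw for e in arrival_order(events)]
```

In this model the retrieved list comes back newest first, with the three same-second events reversed relative to how they arrived, while sorting on the arrival address restores the original sequence.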

Splunk Employee

Arrival order is in-file order, for any given source.


Splunk Employee

Thank you, Stephen. If the log file is being collected by a Splunk forwarder, what is the relation between arrival order and original file order? Can we at least expect these to be the same per source? A customer is reporting the original order is not preserved, but I am not yet able to reproduce on a standalone Splunk instance without a forwarder.


Super Champion

This may not directly answer your question, but it's related:

Splunk Employee

Thank you, guys. In our case, I do not believe there are more than several thousand events per second, which is far fewer than the 100k limit.


Super Champion

It seemed to me like these topics were related, but perhaps they are not. Splunk generally does seem to preserve ordering, so I figured that Splunk assigned sequential _cd values or something similar, and that things start breaking after so many hundreds of thousands of events in a single second. I could be way off, though. Perhaps these are not related at all.


Splunk Employee

I think the issue is that there are several events (say just a couple of thousand) with the same timestamp (no subseconds) in the same file, and they want to know if Splunk will return results in the order in which they were encountered in the file. My guess is that there is no such guarantee.
