Why is _time order not maintained within my summary index?

Builder

Hello, I will try to describe this problem the best I can.

I have a summary index that is written to on a regular basis. The saved search that prepares data for the summary index extracts and prepares all necessary fields, including the one that should be associated with _time, and is configured through the UI to write to the summary index (there is no collect command at the end).

When I run searches on the summary index, I've noticed that the data is not extracted in _time order, even though the timestamp field has been correctly associated with the collect. To give an example, here is a search and screenshot for an object extracted from the non-summary index:

index=non_summary id="OBJECT123"
| eval indextime = strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table id _time timestamp indextime

As you can see, events have been ordered by _time as expected. Now if I run that same search on the summary index once the data has been collected, for some reason the chronological order has not been maintained:

index=summary id="OBJECT123"
| eval indextime = strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table id _time timestamp indextime

Is there any reason why this is not working properly? Is there maybe a parameter or configuration that I am missing?

Any help would be greatly appreciated!

Thank you and best regards,

Andrew

Contributor

Hi Andrew,
a summary always covers a time range of events and summarises some value from it. For a chronological sequence, usually the first event's timestamp, the last event's timestamp, or the time the summary was taken needs to be added manually to the search using the addinfo command, depending on your reporting requirements. In a way, the summary creates a new event from the raw events, and you need to define a timestamp for it.
Oliver
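
A minimal sketch of defining the timestamp in the search that populates the summary, assuming a simple count per id over a bounded time range (the stats clause and the event_count name are placeholders for your own summarisation). addinfo adds info_min_time, info_max_time and info_search_time to each result, and one of them can then be assigned to _time before the results are written:

```
index=non_summary
| stats count AS event_count BY id
| addinfo
| eval _time = info_min_time
```

Using info_max_time or info_search_time instead would stamp each summary event with the end of the search window or the time the search ran.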

Builder

@ololdach Hey Oliver, thanks for the info. In the collect search I have ensured that the final table has the _time attribute so that the collect will use that. In fact, the _time attribute for each event is associated correctly.

Is there no way to maintain the correct chronological order within the summary index despite _time being associated correctly?

Thanks!

Andrew

Contributor

Hi Andrew, rethinking your question, your solution might be as simple as adding a | sort _time to your search that retrieves the summary data.
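
Applied to the summary search from the question, that could look like the sketch below. One assumption worth stating: sort keeps only the first 10,000 results by default, so the 0 argument is used here to lift that limit.

```
index=summary id="OBJECT123"
| sort 0 _time
| eval indextime = strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table id _time timestamp indextime
```

Use sort 0 -_time instead to get newest-first order.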

Builder

@ololdach Hey Oliver, yes, this would work. I am currently using a | tstats technique by _time span=1s, since the summary index is accelerated within a data model. This just seems like an unnecessary step, since I expect the data to already be in chronological order within the summary index. The | sort _time approach seems like it would be worse, since it's very heavy when many events are involved... So if we're saying that there's no guarantee that summary-indexed events are stored in chronological order, then I must find the lesser of two (or more) evils.
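
For reference, the tstats approach might look roughly like the sketch below. My_Summary_Model is a hypothetical data model name, since the actual model is not given in this thread, and count stands in for whatever aggregation is needed:

```
| tstats count FROM datamodel=My_Summary_Model BY _time span=1s
```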

Contributor

Hi Andrew, it all depends. "Normal" indexing processes raw events through the indexing pipeline on the indexers. Summary indexing forwards cooked/processed events from a search head and dumps them into the indexer without going through the indexing pipeline again. This in itself makes a huge difference. On a "normal" index you do expect the events to come in chronologically sorted, but this is not guaranteed. I have multiple indices where the "natural" order of the events is haywire because of the way the data is received: delays in processing and transfer result in different offsets between _time and _indextime. So you are in a comfortable place when all your "normal" indices are ordered.
Happy splunking
Oliver
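
One way to see how far _time and _indextime drift apart in an index is to compute the offset per event. This is only a sketch: lag_seconds is an illustrative field name, and the split by sourcetype is an assumption about what a useful grouping would be.

```
index=non_summary
| eval lag_seconds = _indextime - _time
| stats min(lag_seconds) avg(lag_seconds) max(lag_seconds) BY sourcetype
```

A wide spread between the minimum and maximum suggests that events are not arriving in _time order.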

Builder

@ololdach Hey Oliver, thanks for your input, much appreciated and very valuable!

Contributor

Hi Andrew,
this is just me guessing, but I would expect Splunk to treat a summary index differently from an event index because of its different nature and behaviour. Since the whole point of a summary is to reduce the number of events, I expect the sort on the summary to be the lesser evil. Again, there is no "natural" chronology on a summary: the "chronology" may seem obvious in your case, but in fact it is not well defined and differs from one use case to another.
Oliver

Builder

@ololdach Hey Oliver, I would expect the summary index to work the same as a normal index. I mean, if I don't use the stash sourcetype I incur license usage, so I would expect the same indexing approach... unless maybe the summary indexing approach differs between stash and non-stash. Just to clarify, I'm not talking about a metric index when I say summary index. I mean a normal event index that is written to through a scheduled search with a collect action.
