Hello,
When I write data to a summary index, the timestamp (_time) always follows the earliest time.
For example, if my daily scheduled search runs at 1 a.m. today (9/15/2024) and writes the last 24 hours of data to a summary index, the timestamp (_time) will be 9/14/2024. When I search the summary index over the last 24 hours, the result is empty because the timestamp is always 24 hours behind, so I have to widen the search to the last 2 days to see the data.
Is it a best practice to keep the timestamp as the earliest time, or do you modify the timestamp to the search time?
In my example, if I modify the timestamp to the search time, the timestamp would be 9/15/2024 1 a.m.
Please suggest. Thank you so much for your help.
It depends on what your users expect. I have use cases for summary indexes that contain statistics on X days of previous data, but my users assume that a summarized event of weekly statistics dated the 15th of September contains statistics for the 8th to the 15th of September. In this case it makes sense to re-eval the _time value to the search time.
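As a rough sketch of what that re-eval could look like: the index names (web, summary_weekly) and the host field below are made up for illustration, and the sketch assumes that collect keeps an event's _time when one is present and only falls back to the earliest time of the search range when it is missing.

```
index=web earliest=-7d latest=now
| stats count AS event_count BY host
| eval _time=now()
| collect index=summary_weekly
```

Here now() evaluates to the time the search started, so each summarized row would be stamped with the run time instead of the start of the window. Leaving out the eval line should give the behaviour described in the question, where _time follows the earliest time.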
Can you share your experience (use case) where you change your timestamp to current search time?
Thank you!!
It doesn't have to be the current search time. It might be a time taken from the summarized values. A relatively good example would be tracing emails from some email systems. They tend to emit multiple events during a single message pass, and you have to combine all those events to get a full picture of the message: sender, recipients, action taken, scan results and so on. With an ad-hoc search you'd probably have to use the transaction command, which doesn't play nicely with bigger data sets. But you can run a summarizing search every 10 or 30 minutes that correlates all emails processed during a given time window and writes that summarized info into an index. In such a case you'd probably want one of the message's times (most probably the initial submission time) as the summary event's _time.
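A minimal sketch of such a summarizing search, assuming the mail events share a message_id field and carry sender, recipient and action fields (all of these names, and the mail and summary_mail indexes, are hypothetical and will differ per email system):

```
index=mail earliest=-30m@m latest=@m
| stats min(_time) AS _time values(sender) AS sender values(recipient) AS recipient values(action) AS action BY message_id
| collect index=summary_mail
```

Taking min(_time) per message_id uses the first event seen for each message as the summary event's timestamp, which approximates the initial submission time. Using stats here instead of transaction is a design choice for larger data sets, not the only way to correlate the events.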
It depends on what you are using the summary index for and what you want the timestamp to represent. There is no right or wrong way; it is a choice you make based on your use cases for the data in the summary index.