When you are creating a report from millions of events, I believe summary indexing is a good solution.
However, if the report has to be available on demand, would this still be a solution? In my case, I need to create a report that is a mixture of averages, sums, most frequent values, etc., which makes it complicated.
I would appreciate it if someone could give me some advice.
I have several summary indexes that do this. The most important step, which should be fairly easy, is to choose a time span. This is a saved search template I use to populate summary indexes with the kind of data you described above:
[savedsearchname]
enableSched = 1
cron_schedule = */5 * * * *
dispatch.earliest_time = -8m@m
dispatch.latest_time = -3m@m
action.summary_index = 1
action.summary_index._name = sum_index
action.summary_index.stat_tag = statistics
search = index=your_index sourcetype=your_sourcetype | bucket _time span=1m | sistats avg(your_field) AS your_field_avg, median(your_field) AS your_field_median, mode(your_field) AS your_field_mode, count(your_field) AS your_field_count, dc(your_field) AS your_field_dc, max(your_field) AS your_field_max, min(your_field) AS your_field_min, stdev(your_field) AS your_field_stdev, var(your_field) AS your_field_var by _time
You can use macros.conf to make the search look cleaner, as I do. I wrote it all out here to show how to set up the values you need using sistats. As you can see above, this data is on (up to) an 8 minute delay from real time. You can adjust the delay by changing the
Also, when getting data back out after using sistats, you will need to rerun the stats command to "reheat" the data for use.
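As a rough sketch of what that "reheating" can look like, the search below reads the summary index back and repeats a few of the aggregations that were given to sistats. It assumes the summary events can be selected with the stat_tag=statistics field set in the stanza above; the one-day span is only an illustration, not something from the original setup.

```
index=sum_index stat_tag=statistics
| bucket _time span=1d
| stats avg(your_field) AS your_field_avg, mode(your_field) AS your_field_mode, count(your_field) AS your_field_count by _time
```

Because the user picks the time range when running this search, the report stays on demand while the heavy lifting is done ahead of time by the scheduled search.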
Thank you for the reply.
I did not mention it, but we have 3 indexes to summarize, and each has approximately:
index A : 50,000,000 events (150,000 events indexed per day)
index B : 10,000,000 events (50,000 events indexed per day)
index C : 50,000 events (few events indexed per day)
and there are 31 summary items to calculate.
Some summary items need to be calculated along different dimensions, so we need to create separate searches for them.
I believe I would need to create temporary summary indexes and then concatenate them into a single daily summary index, where users can apply time modifiers.
Easy enough: just use sub-searches in your search string. There is no real reason to create a temp index; you would just be adding another failure point.
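A minimal sketch of that idea, assuming two of the indexes are named index_a and index_b and that field_a and field_b are the fields being summarized (all of these names are placeholders, not taken from the thread). Each dimension is calculated in its own search, and the subsearch results are appended and merged by day:

```
index=index_a
| bucket _time span=1d
| stats avg(field_a) AS field_a_avg by _time
| append
    [ search index=index_b
      | bucket _time span=1d
      | stats sum(field_b) AS field_b_sum by _time ]
| stats values(*) AS * by _time
```

Keep in mind that subsearches are subject to result-count and runtime limits, so it is worth checking that each subsearch finishes within them at your volumes.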