Hi!
When you are creating a report from millions of events, I believe summary indexing is a good solution.
However, if the report has to be available on demand, would this still be a solution? In my case, I need to create a report that is a mixture of average, sum, most frequent value, etc., which makes it complicated.
I would appreciate it if someone could give me some advice.
I have several summary indexes that do this. The most important thing, which should be fairly easy, is to figure out a time span. This is a saved search template I use to populate summary indexes with the kind of data you described above:
[savedsearchname]
enableSched = 1
cron_schedule = */5 * * * *
dispatch.earliest_time = -8m@m
dispatch.latest_time = -3m@m
action.summary_index = 1
action.summary_index._name = sum_index
action.summary_index.stat_tag = statistics
search = index=your_index sourcetype=your_sourcetype | bucket _time span=1m | sistats avg(your_field) AS your_field_avg, median(your_field) AS your_field_median, mode(your_field) AS your_field_mode, count(your_field) AS your_field_count, dc(your_field) AS your_field_dc, max(your_field) AS your_field_max, min(your_field) AS your_field_min, stdev(your_field) AS your_field_stdev, var(your_field) AS your_field_var by _time
You can use macros.conf to make the search look cleaner, as I do. I just wrote it all out to show how to set up the values you need using sistats. As you can see above, this data is on (up to) an 8 minute delay from real-time. You can adjust the delay by changing the dispatch.earliest_time and dispatch.latest_time parameters.
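As a rough illustration of adjusting the delay, here is the same stanza fragment shifted so the data lags real-time by at most about 7 minutes instead of 8. The specific offsets are only an example. The assumption is that you want to keep the search window (5 minutes) equal to the cron interval, so that consecutive runs neither overlap nor leave gaps:

```
[savedsearchname]
# runs every 5 minutes, same as before
cron_schedule = */5 * * * *
# window is still 5 minutes wide (-7m to -2m), just moved 1 minute closer to now
dispatch.earliest_time = -7m@m
dispatch.latest_time = -2m@m
```

How small you can make the offset depends on how quickly your events actually get indexed. If the window is too close to now, late-arriving events will be missed by the summary.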
Also, when getting data back out after using sistats, you will need to rerun the stats command to "reheat" the data for use.
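For example, a report search over the summary could look roughly like the sketch below. It assumes the summary index name (sum_index), the stat_tag value and the placeholder field name from the template above, and it re-applies stats with the same functions and the same by clause that the sistats search used:

```
index=sum_index stat_tag=statistics
| stats avg(your_field) AS your_field_avg, mode(your_field) AS your_field_mode, count(your_field) AS your_field_count, max(your_field) AS your_field_max, min(your_field) AS your_field_min by _time
```

Because the report runs against the much smaller summary index, the user can pick any time range in the time picker and the search should still come back quickly.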
Easy enough, just use sub-searches in your search string. There is no real reason to create a temp index; you would just be adding another point of failure.
Hello ShaneNewman.
Thank you for the reply.
I did not mention it, but we have 3 indexes to summarize, and each has approximately:
index A : 50,000,000 events (150,000 events indexed per day)
index B : 10,000,000 events (50,000 events indexed per day)
index C : 50,000 events (few events indexed per day)
and 31 summary items to calculate.
Some summary items need to be calculated along different dimensions, so we need to create separate searches for them.
I believe I would need to create temporary summary indexes and then combine them into a single daily summary index on which users can use time modifiers.