Reporting

Running a report from a large amount of data

yuwtennis
Communicator

Hi!

When you are creating a report from millions of events, I believe using summary indexing is a good solution.

However, if the report has to be generated on demand, would this still be a good fit? In my case, I need to create a report that mixes average, sum, most frequent value, etc., which makes things complicated.

I would appreciate it if someone could give me advice.


ShaneNewman
Motivator

I have several summary indexes that do this. The most important thing, and it should be fairly easy, is to figure out a time span. This is a saved-search template I use to populate summary indexes capturing the kind of data you described:

[savedsearchname]
    enableSched = 1
    cron_schedule = */5 * * * *
    dispatch.earliest_time = -8m@m
    dispatch.latest_time = -3m@m
    action.summary_index = 1
    action.summary_index._name = sum_index
    action.summary_index.stat_tag = statistics
    search = index=your_index sourcetype=your_sourcetype | bucket _time span=1m | sistats avg(your_field) AS your_field_avg, median(your_field) AS your_field_median, mode(your_field) AS your_field_mode, count(your_field) AS your_field_count, dc(your_field) AS your_field_dc, max(your_field) AS your_field_max, min(your_field) AS your_field_min, stdev(your_field) AS your_field_stdev, var(your_field) AS your_field_var by _time

You can use macros.conf to make the search look cleaner, as I do; I wrote it all out here to show how to set up the values you need with sistats. As you can see above, this data is on (up to) an 8-minute delay from real time. You can adjust the delay by changing the dispatch.earliest_time and dispatch.latest_time parameters.
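For example, a sketch of how the macro approach could look (the macro name and field names here are illustrative, not from the original search):

    # macros.conf -- hypothetical macro wrapping the statistics portion
    [summary_stats_your_field]
    definition = sistats avg(your_field) AS your_field_avg, max(your_field) AS your_field_max, min(your_field) AS your_field_min by _time

    # savedsearches.conf -- the search line then shortens to:
    # search = index=your_index sourcetype=your_sourcetype | bucket _time span=1m | `summary_stats_your_field`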

Also, when getting data back out after using sistats, you will need to rerun the stats command to "reheat" the data for use.
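A sketch of pulling the data back out of the summary index above (note that you rerun stats with the same functions over the original field names; the sistats results in the summary index carry the partial statistics needed to recompute them):

    index=sum_index | stats avg(your_field) AS your_field_avg, max(your_field) AS your_field_max, min(your_field) AS your_field_min by _time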


ShaneNewman
Motivator

Easy enough, just use sub-searches in your search string. There is no real reason to create a temp index; you are just adding another failure point.
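For instance, one way to combine results from several summary searches into a single report (the index and field names here are placeholders) is to append a sub-search:

    index=sum_index_a | stats avg(field_a) AS field_a_avg by _time
    | append [ search index=sum_index_b | stats sum(field_b) AS field_b_sum by _time ]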


yuwtennis
Communicator

Hello ShaneNewman.

Thank you for the reply.
I did not mention this, but we have 3 indexes to summarize, each with approximately:

index A : 50,000,000 events (150,000 events indexed per day)
index B : 10,000,000 events (50,000 events indexed per day)
index C : 50,000 events (a few events indexed per day)

and 31 summary items to calculate.
Some summary items need to be calculated along different dimensions, so we need to create separate searches.

I believe I would need to create a temporary summary index and then concatenate it into a single daily summary index where users can apply time modifiers.
