Will summarizing already-summarized data with sistats yield accurate results?

Communicator

Suppose I have a summary index that stores per-minute summaries populated by sistats. Each minute contains about 1 million records, which are summarized into 10,000 records representing that minute. Now I want a report showing average values for one week. I am running a stats search on the summary index, but it is taking forever. However, if I build multiple layers of sistats, I get the results much faster. For example: run sistats for 1 minute and store the results in the summary index with a custom field reportName=minute. Then run sistats on the "minute" report for the last 60 minutes and store those results in the summary index with reportName=hourly. Then do the same at the daily and, finally, weekly levels. Each step takes less than 1 minute.

What do I lose by layering sistats in this way? Would the final weekly report be accurate and still get the benefits of sistats, or would it behave the same as layering regular stats results on top of each other? This method performs better, but I am unsure whether it is as accurate as running sistats once and then running stats on the summary index over the whole week (even if that does take ages).


SplunkTrust

Whether the results are accurate depends on the statistics you're computing. For example, simple counts or sums would be accurate regardless of how many layers you go through.
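To illustrate why the statistic matters, here is a minimal Python sketch of the general arithmetic. It is not a model of what sistats stores internally, and the data and variable names are made up for the example. It shows that counts and sums roll up exactly through layers, while an average of per-minute averages drifts from the true average when the minutes hold different numbers of records, unless the sum and count are carried along and divided at the end.

```python
from statistics import mean

# Hypothetical raw data: three "minutes" with different record counts.
minutes = [
    [10.0, 20.0],              # 2 records
    [30.0],                    # 1 record
    [40.0, 50.0, 60.0, 70.0],  # 4 records
]

# Ground truth computed directly over all raw records.
all_values = [v for minute in minutes for v in minute]
true_count = len(all_values)
true_sum = sum(all_values)
true_mean = true_sum / true_count

# Layer 1: one summary row per minute.
per_minute_count = [len(m) for m in minutes]
per_minute_sum = [sum(m) for m in minutes]
per_minute_mean = [mean(m) for m in minutes]

# Layer 2: roll the per-minute rows up into a single value.
layered_count = sum(per_minute_count)      # counts add up exactly
layered_sum = sum(per_minute_sum)          # sums add up exactly
mean_of_means = mean(per_minute_mean)      # ignores how many records each minute had
mean_from_parts = layered_sum / layered_count  # carries sum and count, divides last

print("true mean:            ", true_mean)
print("mean of means:        ", mean_of_means)
print("mean from sum / count:", mean_from_parts)
```

The same reasoning applies at each further layer (hourly, daily, weekly): a statistic survives layering only if each layer keeps enough information to recompute it, which is worth checking for every function in the report.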

However, I'd question that approach for simplicity reasons. Keeping these layers working together may turn into a headache down the road. Have you considered using report/datamodel acceleration on the summarized data, or even on the original data?
