Summary Index best practice

Branden — Tue, 26 Oct 2010 20:49:55 GMT

I have a dashboard with a pull-down menu listing our hosts. Selecting a host gives a snapshot of that host's status: paging space use, web server hits, vmstat data, SAN disk stats, etc. (Note: because we're on AIX, we cannot run the *nix app, so we get our paging/vmstat data by running commands and capturing their stdout into Splunk.)

Unfortunately, it takes a while to display all the results on the dashboard. When you have 20 hosts to check, it adds up fast.

I was thinking it would be a good idea to use summary indexing to speed things up a bit. I'm just not sure of the best way to structure it. Should I create a summary index per data type ("si-paging", "si-san", "si-webhits", etc.) that captures that information across all hosts every few minutes, then search it when I select a host from the menu? Or should I do it from the other angle and create a summary index for each host, containing its paging/SAN/web server/etc. data?

I'm thinking the latter would be better; from what I read, you can't add a search parameter when you run a search against a summary index. From the docs: "The search against the summary index cannot create or modify fields before the | stats command." That means I wouldn't be able to add a filter like host="xyz". Do I have that right?

Is my logic sound here? Is it best to create a summary index for each host and generate my results in the dashboard that way?

Thanks!

Re: Summary Index best practice

araitz — Wed, 27 Oct 2010 03:18:11 GMT

Just use the default 'summary' index, and create 'markers' for each type of populating search. For example, a summary-index populating search could be:

sourcetype=your_sourcetype | stats min(cpu) as min_cpu avg(cpu) as avg_cpu max(cpu) as max_cpu by host

When you are going through the saved search workflow, set earliest to -5m and latest to +0s, schedule the search to run every 5 minutes, and check the "enable summary indexing" checkbox. Then add the key "marker" with the value "cpu_by_host_5m" or something similar.

Then, to build your dashboard, use the following search or something similar (assuming $host$ is a replacement intention from your drop-down):

index=summary marker=cpu_by_host_5m $host$ | timechart max(max_cpu) min(min_cpu) avg(avg_cpu)
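
As a rough illustration of what the dashboard ends up running: if the drop-down selection were a host named xyz (a made-up name), and assuming the replacement simply substitutes the bare host name, the search would come out something like this. The host term just narrows which summary events are retrieved before timechart runs.

```
index=summary marker=cpu_by_host_5m xyz | timechart max(max_cpu) min(min_cpu) avg(avg_cpu)
```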

Re: Summary Index best practice

Branden — Wed, 27 Oct 2010 03:21:15 GMT

Thank you for the response.
Quick question: should "| stats min(cpu)..." be "| sistats min(cpu)..."? If not, at what point do I need to prefix commands with "si"? The manual mentioned that somewhere.
Thanks!

Re: Summary Index best practice

araitz — Tue, 04 Jan 2011 01:58:30 GMT

If you feel confident that you can handle all the statistical operations and field names yourself in your summary-index populating search, then you don't need sistats. Per the docs, sistats is a better choice if you are new to Splunk and new to summary indexing.
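
For comparison, a sistats version of the same setup might look like the sketch below. It assumes the same saved-search settings and the same example marker value (cpu_by_host_5m) as earlier in the thread. The first line is the populating search, the second is the search run against the summary.

```
sourcetype=your_sourcetype | sistats min(cpu) avg(cpu) max(cpu) by host

index=summary marker=cpu_by_host_5m $host$ | stats min(cpu) avg(cpu) max(cpu) by host
```

With the si- commands, the search against the summary repeats the same functions on the original field name using the plain command (stats here), and Splunk works out the result from the extra fields that sistats wrote. With plain stats in the populating search, you name the summary fields yourself and decide how to re-aggregate them, e.g. max(max_cpu). Note that avg(avg_cpu) is then an average of the 5-minute averages rather than of the raw events.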