I need to create some summary indexes and am wondering what the best approach would be. So far there are various searches filling a dashboard (mostly counts of hosts, license usage, etc.). I have configured the summary index and summary search below. How can I make this as efficient as possible using known best practices?
search index=_internal source=*license_usage.log type=Usage pool=* idx=eop* earliest=-3d@d latest=@d | timechart span=1d sum(b) as Bytes | eval Bytes=(Bytes/1024/1024/1024) | eval time=_time | fields _time Bytes | collect index=main sourcetype=summary source=eop_daily_volume_test marker="summary_type=metrics_volume, summary_span=86400, summary_method=timechart, report=\"eop daily volume test\""
We will also need to define a standard for summarization that can be applied to all our summaries in the future.
I'm trying to understand the relationship between your question and the query you listed. Is that an example of the type of query you are talking about creating, or something that is in production now? There are several potential issues with the query itself.
| metasearch | eval host=lower(host) | fields host index sourcetype | rex field=sourcetype "(?<sourcetype>.+)(?:-\d+|-too_small)" | stats count by host index sourcetype
Metasearch is pretty fast; it is likely tstats would be faster /shrug. I have that search run every hour.
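For comparison, here is a rough sketch of what a tstats version of the metasearch query above might look like. I have not benchmarked it against metasearch; the index=* filter is an assumption and should be narrowed to the indexes you actually care about. Because tstats returns pre-aggregated counts, the final stats has to sum them again after the host and sourcetype values are normalized.

```
| tstats count where index=* by host index sourcetype
| eval host=lower(host)
| rex field=sourcetype "(?<sourcetype>.+)(?:-\d+|-too_small)"
| stats sum(count) as count by host index sourcetype
```

The re-aggregation matters because lowercasing host and stripping the sourcetype suffix can merge rows that tstats returned separately.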
index=_internal source="/opt/splunk/var/log/splunk/license_usage.log" TERM(type=Usage) | stats sum(b) as bytes by idx st | rename idx as index st as sourcetype
I've mentioned 'essentially' a few times. In my environment we have data from individual units going into their own indexes, and the indexes are roughly split into 2 overarching groups. To facilitate the sharing of metadata, I have 2 separate queries for items 6 & 7, as they populate separate summary indexes. I've created a CSV that maps my index list to several categories, string names for units, etc. One of those fields is a binary t/f flag for whether the index belongs to group1. The query actually looks more like this for item 7, where I'm using a subsearch to bring back all of the indexes related to group 1:
index=_internal source="/opt/splunk/var/log/splunk/license_usage.log" TERM(type=Usage) [ inputlookup index_list | search group1=t reportable=t | rename index as idx | fields idx] | stats sum(b) as bytes by idx st | rename idx as index st as sourcetype
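If it helps to see what that subsearch hands back to the outer search, one way to check (a sketch, reusing the index_list lookup and the group1/reportable fields from the query above) is to run the subsearch on its own and pipe it through format:

```
| inputlookup index_list
| search group1=t reportable=t
| rename index as idx
| fields idx
| format
```

This should return a single field named search holding an OR'd expression of the form ( ( idx="..." ) OR ( idx="..." ) ), which is the filter that gets applied to the license_usage.log events. The actual values depend on what is in your lookup.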