There are many features using objects named "summary". This is confusing; please clarify.
What are the differences between all those paths?
$SPLUNK_HOME/var/lib/splunk/summary/db
$SPLUNK_HOME/var/lib/splunk/defaultdb/summary
$SPLUNK_HOME/var/lib/splunk/defaultdb/datamodel_summary
In savedsearches, what do auto_summarize
and alert.action=summary mean?
To clarify, there are 3 features named "summary" in Splunk:
A - Summary indexing: the classic method, available since Splunk 4.*
B - Report acceleration: introduced in Splunk 5.*
C - Data model acceleration: introduced in Splunk 6.*
Remark: none of those features counts against your license usage, but they can add some extra search load to generate the summarized data.
@yannK - Is there any way you could explain these in a more conceptual way rather than a mechanical one?
For example, maybe explain if one is more like an additional index vs. one being a cache? Maybe some good cases of when to use one vs. when to use another.
Methods A and B are not meant to complement each other; they are just 2 ways to achieve the same thing.
A - Summary indexing is like generating events in a new index.
It is well suited to generating a new set of pre-calculated data and keeping it for a longer retention period.
Example: you have millions of web access logs in an index with a short retention. Every day you summarize them as a number of hits per day and store the result in a dedicated index with a long retention. In the end, you keep only this summarized information.
The only difficulty is that if a scheduled search is skipped, you may have a gap to backfill.
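As a rough illustration of the web access example above, a scheduled search that writes a daily summary could look like the sketch below. The stanza name, the index name "web", and the sourcetype are hypothetical; the target index is assumed to be the default "summary" index.

```ini
# savedsearches.conf (sketch; stanza name, index and sourcetype are hypothetical)
[Summary - daily web hits]
search = index=web sourcetype=access_combined | sistats count by host
enableSched = 1
# Run shortly after midnight, over the whole previous day
cron_schedule = 5 0 * * *
dispatch.earliest_time = -1d@d
dispatch.latest_time = @d
# Write the results to a summary index instead of only displaying them
action.summary_index = 1
# Target index; it must already exist (assumed here: the default "summary" index)
action.summary_index._name = summary
```

Reports can then search the summary index instead of the raw data, which is what allows the raw index to keep a short retention.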
B - Report acceleration is for searches only; it precalculates their results for you.
Example: you have a long statistical search over a long period that populates a dashboard. Accelerate it so that the summary is built in the background and the dashboard loads faster.
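For reference, this is roughly what an accelerated report looks like in savedsearches.conf, and it is where the auto_summarize setting appears. This is only a sketch: the stanza name, index, sourcetype, and the 3-month range are assumptions.

```ini
# savedsearches.conf (sketch; stanza name, index and sourcetype are hypothetical)
[Web hits over time]
search = index=web sourcetype=access_combined | timechart span=1d count
# Turn on report acceleration for this saved search
auto_summarize = 1
# How far back the acceleration summary should cover (assumed: 3 months)
auto_summarize.dispatch.earliest_time = -3mon@d
```

Unlike summary indexing, nothing changes in the way you run the report; Splunk uses the summary automatically when it covers the requested time range.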
C - Data model acceleration is only useful if you already have a data model. Data model searches are usually heavier to run, so accelerating them will help.
Example: the Common Information Model (CIM) comes with many data models; once the volume is large, the searches are slower. When acceleration is turned on, searches will be faster for the recent days (depending on the backfill range).
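A minimal sketch of the corresponding settings in datamodels.conf, assuming a data model named "Web" (as in the CIM) and a 7-day backfill range chosen only for illustration:

```ini
# datamodels.conf (sketch; assumes a data model named "Web", as in the CIM)
[Web]
# Turn on acceleration for this data model
acceleration = true
# Backfill range: only events newer than this are summarized (assumed: 7 days)
acceleration.earliest_time = -7d
```

Searches over a time range older than the backfill range fall back to the raw events, which is why only the recent days are faster.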
yannK mentioned the following
an index named "summary" is shipped with splunk by default ($SPLUNK_HOME/var/lib/splunk/summary/db)
I believe it's $SPLUNK_HOME/var/lib/splunk/summarydb, not $SPLUNK_HOME/var/lib/splunk/summary/db. Notice that there is no slash between "summary" and "db".