I'll answer some of these in parts...
1. There are a lot of duplicate events in the summary index, and I don't know why. I have made sure that the original events are not duplicated (for example, there are no duplicate events in indexes such as nginx and Apache).
SI (summary indexes) are populated by scheduled searches or by searches that use the collect command. If the scheduled searches that create the summary indexes are not set up properly, you're going to get duplicated events. E.g., if you have a scheduled search that runs every hour with an earliest time of -1d, then each run re-summarizes most of the events the previous runs already covered, and you're going to have huge overlap. I'd check that first.
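To make the overlap concrete, here is a rough sketch in plain Python (not Splunk code) that counts how many scheduled runs would pick up a single event. The function name, the dates, and the assumption that each run searches the window from "run time minus lookback" up to the run time are all illustrative, not taken from your configuration.

```python
from datetime import datetime, timedelta

def times_summarized(event_time, first_run, run_count, interval, lookback):
    """Count how many scheduled runs include event_time in their search window.

    Assumption: a run at time t searches the half-open window [t - lookback, t).
    """
    hits = 0
    for i in range(run_count):
        run_time = first_run + i * interval
        if run_time - lookback <= event_time < run_time:
            hits += 1
    return hits

start = datetime(2024, 1, 1)
event = datetime(2024, 1, 1, 0, 30)  # a hypothetical event at 00:30

# Hourly schedule with a one-day lookback (like earliest=-1d): heavy overlap.
overlapping = times_summarized(event, start, 48, timedelta(hours=1), timedelta(days=1))

# Hourly schedule with a one-hour lookback: the event lands in exactly one window.
aligned = times_summarized(event, start, 48, timedelta(hours=1), timedelta(hours=1))

print(overlapping, aligned)  # 24 1
```

With the one-day lookback the same event is summarized 24 times; once the lookback matches the schedule interval it is summarized once.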
2. Sometimes, after the alert is triggered, the results are not written to the summary index, even though I'm sure the result is not empty. That is to say, sometimes it doesn't write the results to the summary index.
This could be the way you have the alert set up; if there are no results, it won't write any events... Hard to answer this.
3. The third question is a silly question. I'd like to ask whether the summary index is for reports or for alerts. In the Web UI, under Settings > Searches, reports, and alerts, I chose an alert and then enabled summary indexing. Then I clicked on "Alerts" on the navigation bar:
that alert had disappeared, but it now appears under "Reports". Why does it move from alerts to reports when I enable summary indexing?
That enable option means write to summary index... I'm not sure I fully understand the context of this... Alerts will age out. That being said, an SI can be used for reporting or alerts. Some customers aggregate data into an SI, then run alerts against it. Additionally, I've seen customers create reports from an SI, because it's much, much faster.
For your use case, you can most definitely use alerts to write events out to a summary index, and then report and alert off of that. You may want to look at the SI-related commands such as collect (http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Collect). You can use this in your base searches to populate your summary indexes.
There are some downsides to using summary indexing. The main ones are that the sourcetype changes (to stash) and that SI relies on your saved searches running in order to populate. This means that if you miss a scheduled search, the SI won't populate and you need to backfill that missing time window. The backfill itself could lead to duplicate events if it overlaps time ranges that were already summarized.
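As a rough illustration of backfilling without creating duplicates, the plain-Python sketch below lists only the windows that no run has covered yet. It assumes you keep (or can derive) a record of which windows were already summarized; the function name, the dates, and the list of completed windows are hypothetical.

```python
from datetime import datetime, timedelta

def missing_windows(start, end, interval, completed_starts):
    """Return (window_start, window_end) pairs in [start, end) with no completed run."""
    done = set(completed_starts)
    gaps = []
    current = start
    while current < end:
        if current not in done:
            gaps.append((current, current + interval))
        current += interval
    return gaps

hour = timedelta(hours=1)
day_start = datetime(2024, 1, 1)

# Hypothetical record of hourly windows already summarized;
# the 02:00 and 03:00 runs were missed.
completed = [day_start + i * hour for i in (0, 1, 4, 5)]

gaps = missing_windows(day_start, day_start + 6 * hour, hour, completed)
for window_start, window_end in gaps:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Re-running the populating search only for the listed windows fills the gap without summarizing the other hours a second time.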
This is where data models come in. They backfill automatically, and you can aggregate data as necessary.
Hope that helps.