I'm very curious to hear how other admins are handling summary indexing with multiple indexers and search heads.
It seems like every option above is imperfect, making for many compromises. Please share your SI architecture and why you chose it.
Thanks,
jon
EDIT - I found this previous answer. It still leaves some questions though. If I want to search from the SH and collect into an index on the indexer, do I need to create a "dummy" index on the search head? Without the custom index on the SH, it won't let me schedule it. Seems a little hacky.
Forwarding the events and summaries to your indexers and turning off indexing on the Search Head is a best practice for several reasons:
Hope this helps,
Kyle
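For anyone wanting to see what that forwarding setup can look like, here is a minimal sketch of an outputs.conf for the search head. The group name, host names, and port are placeholders I made up for illustration (9997 is only the conventional receiving port), so adjust them to your environment and check the outputs.conf spec for your Splunk version.

```ini
# outputs.conf on the search head (sketch; names and addresses are hypothetical)

# Do not keep a local indexed copy of anything on the search head
[indexAndForward]
index = false

[tcpout]
# "my_indexers" is a placeholder group name
defaultGroup = my_indexers
# Forward every index, including summary indexes, not just the default set
forwardedindex.filter.disable = true
indexAndForward = false

[tcpout:my_indexers]
# Placeholder indexer addresses; each must have a receiving port enabled
server = indexer1.example.com:9997,indexer2.example.com:9997
```

With something like this in place, events written by a summary-indexing search on the search head are sent to the indexers instead of being stored locally.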
If you don't create the "dummy" index on the search head you will get this error:
Encountered the following error while trying to update: In handler 'savedsearch': Index name=your_index_here does not exist. The summary index must exist in order for a scheduled search to populate it.
The search head uses indexes.conf to build a list of indexes it can operate on. So without it listed on the search head, you'll get this error.
Putting it on the SH also fixes autocomplete so when you type index= in the search bar that index shows up.
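As a rough illustration, the "dummy" index definition on the search head can be a minimal indexes.conf stanza like the one below. The index name my_summary is hypothetical; use the same name that is defined on your indexers.

```ini
# indexes.conf on the search head (sketch; "my_summary" is a hypothetical name)
# The stanza exists so the search head knows the index name. If the search
# head forwards its data, the real buckets live on the indexers.
[my_summary]
homePath   = $SPLUNK_DB/my_summary/db
coldPath   = $SPLUNK_DB/my_summary/colddb
thawedPath = $SPLUNK_DB/my_summary/thaweddb
```

The same stanza (with whatever retention settings you need) would also have to exist on each indexer that receives the summary data.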
I have the same issue, and I was looking for you to solve my problem for me. I tried to set up a search and store the summary index on one of the search heads. I set up an index on the one SH and I use a pool for my SHs. The problem is that the other SH wants to run the search and seems to be doing so, but it is just not saving the data.
I have 1 search head and 2 indexers (all are individual physical machines). I don't have any real indexes on my search head - everything gets forwarded to the 2 indexers. This includes Summary Indexes. So I create a summary index on my search head and both indexers, just to ensure everything works okay.
It does seem a little hacky, but it's probably the best way to handle it.
Brian
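To make the setup above a bit more concrete, here is a sketch of a scheduled summary-indexing search in savedsearches.conf on the search head. The stanza name, the search string, the schedule, and the index name my_summary are all made up for illustration; the index has to be defined on the search head and on both indexers, as described above.

```ini
# savedsearches.conf on the search head (sketch; all names are hypothetical)
[Summary - internal events by sourcetype]
# Example search only; replace with your own reporting search
search = index=_internal | sistats count by sourcetype
enableSched = 1
# Run every 15 minutes over the previous 15 minutes
cron_schedule = */15 * * * *
dispatch.earliest_time = -15m@m
dispatch.latest_time = @m
# Write the results into the summary index
action.summary_index = 1
action.summary_index._name = my_summary
```

Whether the resulting events end up on the search head or on the indexers depends on the search head's forwarding configuration, not on this stanza.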
You set up the search head with search peers - Splunk handles the rest in the background.
Take a look at: http://docs.splunk.com/Documentation/Splunk/4.3/Deploy/Configuredistributedsearch
Distributed searching is completely different from distributed indexing. fk319 asked about the latter and Brian Osburn replied about the former. Distributed indexing is about multiple indexers simultaneously indexing information. Distributed searching is about searching multiple indexer nodes (any Splunk instance with indexed data) simultaneously, pulling the indexed information back, and merging the results. Search peers are indexer nodes specified for searching.
Setting up search peers merely enables you to search indexer nodes. It does not move summary data anywhere: events written by "collect" are indexed locally on the instance that runs the search.
The option of using search head pooling to share knowledge object (KO) bundles can really kill performance, because it copies all the KOs, including the summary indexing searches, into local copies on each search head.
Brian, how do you tell your search where your search index actually resides? (i.e., how do you forward your search results back to the indexers?)