Hi, I am creating a new environment with around 300 Linux machines and around 50 Windows servers. I will be installing universal forwarders to forward the data to central indexers.
How can I calculate the number of indexers to set up?
Is there a rule to choose the number of indexers?
Do look at the search head pooling documentation. As for the deployment server, you will have to use one; there is no other practical option when you have more than 1000 forwarders 😉
So this is my final proposed configuration:
400 Linux machines, 100 Windows servers and 1500 user VMs. I will be installing forwarders (more than 1,000) to forward the data to 10 central indexers, with 5 search heads. Is this configuration optimal?
Should I use a deployment server to distribute the configurations to all forwarders?
How should I set up the search heads to manage the search peers (10 indexers)? Should I add all of them to each of the 5 search heads, or divide them, say 2 search peers per search head?
Any suggestions would be helpful :)
It is worth separating out the search head once you go to 2 indexers, and definitely if you are using clustering. From then on you can keep a rough 4 to 1 ratio of indexers to search heads. So for your setup, I'd recommend 2 or 3 search heads, though you may cope with fewer if you don't have many users or scheduled searches. See the table at:
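As a rough illustration of the 4 to 1 ratio, here is a minimal Python sketch. The function name `search_head_range` is made up for this example, and the ratio is only the rule of thumb from this post, not an official formula.

```python
import math

def search_head_range(indexers: int, indexers_per_search_head: int = 4) -> tuple[int, int]:
    """Return a (low, high) search head count from a rough indexer:search-head ratio.

    Hypothetical helper: 4:1 is the rule of thumb quoted in this thread.
    """
    if indexers < 1:
        raise ValueError("need at least one indexer")
    low = max(1, indexers // indexers_per_search_head)           # round down
    high = max(1, math.ceil(indexers / indexers_per_search_head))  # round up
    return low, high

low, high = search_head_range(10)
print(f"10 indexers -> {low} to {high} search heads")
```

For 10 indexers this gives 2 to 3, which matches the recommendation above.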
http://docs.splunk.com/Documentation/Splunk/6.0/Deploy/Summaryofperformancerecommendations
If you are using premium apps such as ES, they introduce a heavy search load and will require extra resources, so plan for at least 1 extra dedicated search head.
Thank you very much. I am wondering what the preferred number of search heads would be. Is 1 dedicated search head with RF=3, SF=2 good?
The short answer is that 10 indexers could cope and would allow 80 to 120 concurrent searches, assuming you are using Splunk's recommended hardware: http://docs.splunk.com/Documentation/Splunk/latest/Installation/Referencehardware
There are two basic metrics for calculating how many indexers you need: how much data you will be indexing per day, and how many concurrent searches will be run.
You can't estimate data volumes just by knowing the OS. A Windows AD server will generate vastly different volumes from a terminal server. Similarly, an intranet web server won't get as many hits as a busy e-commerce site. So I'm afraid you will have to sample and calculate volumes. Traditionally Splunk have said an indexer can handle 100GB per day, though I have seen a blog stating this could be increased.
Estimating searches can also be a little difficult. As a rule of thumb, we estimate that heavy users of Splunk can average up to 4 concurrent searches. For example, opening a dashboard can kick off multiple searches. Then add the number of concurrent searches that run on a schedule or in real time. Include scheduled reports, summarising, alerting, etc.
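To make that estimate concrete, here is a small Python sketch. The user and scheduled-search counts are made-up inputs, and `estimate_concurrent_searches` is a hypothetical helper name; only the "up to 4 per heavy user" figure comes from the post.

```python
def estimate_concurrent_searches(heavy_users: int,
                                 scheduled_concurrent: int,
                                 searches_per_heavy_user: int = 4) -> int:
    """Peak concurrent searches: ad hoc load from heavy users plus scheduled/real-time load."""
    return heavy_users * searches_per_heavy_user + scheduled_concurrent

# Assumed example: 10 heavy users and 15 scheduled or real-time searches
# running at the same time.
print(estimate_concurrent_searches(heavy_users=10, scheduled_concurrent=15))
```

This is a worst-case style figure, since it assumes every heavy user hits the average of 4 at the same moment.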
Each server can cope with 100GB/day and one search per CPU core (normally 8-12). So divide your measured/estimated figures by these and take whichever is bigger. Remember to round your figures up and allow a little for unexpected increases in data and usage.
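The "divide and take whichever is bigger" step might look like this in Python. This is only a sketch under the figures quoted here (100GB/day and one search per core); `indexers_needed` and the `headroom_pct` parameter are names invented for the example.

```python
import math

def indexers_needed(daily_gb: float,
                    concurrent_searches: int,
                    gb_per_indexer: float = 100,
                    cores_per_indexer: int = 8,
                    headroom_pct: int = 0) -> int:
    """Indexer count as the larger of the volume-driven and search-driven figures.

    headroom_pct is an assumed safety margin for unexpected growth.
    """
    scale = (100 + headroom_pct) / 100
    by_volume = math.ceil(daily_gb * scale / gb_per_indexer)
    by_search = math.ceil(concurrent_searches * scale / cores_per_indexer)
    return max(1, by_volume, by_search)

# Assumed inputs: 1000 GB/day and 55 concurrent searches on 8-core indexers.
print(indexers_needed(1000, 55))
print(indexers_needed(1000, 55, headroom_pct=20))
```

With these inputs the data volume is the limiting factor; a search-heavy site with little data would be sized by the second figure instead.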
The docs recommend 100GB/day and up to 8 concurrent users per search head. http://docs.splunk.com/Documentation/Splunk/6.0/Deploy/Summaryofperformancerecommendations
This is assuming high-volume users. If they are only searching occasionally, 30 or more is possible. But as yannk says, setting up search head pooling does require extra effort, and some customers get around this by having one search head for scheduled jobs and one for each department. This assumes they do not want to share knowledge objects.
Rule of thumb is:
- one indexer per 50GB/day of data indexed.
- for a large number of users, one search head per 30 concurrent users, plus a search head for scheduled searches/alerts. Otherwise, for a low number of concurrent searches, a single one can be enough. The problem with multiple search heads is that you will need to set up search head pooling and invest in professional-grade NFS storage.
My data volume is around 1 TB per day...
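For what it's worth, here is how the two per-indexer figures quoted in this thread (50GB/day and 100GB/day) work out for that volume. A quick Python sketch, assuming 1 TB means 1000 GB and looking at data volume only; `indexers_by_rule` is a made-up name.

```python
import math

def indexers_by_rule(daily_gb: float, rules_gb_per_indexer=(50, 100)) -> dict:
    """Map each GB/day-per-indexer rule of thumb to the indexer count it implies."""
    return {rule: math.ceil(daily_gb / rule) for rule in rules_gb_per_indexer}

# Assumption: 1 TB/day is treated as 1000 GB/day.
for rule, count in indexers_by_rule(1000).items():
    print(f"{rule} GB/day per indexer -> {count} indexers")
```

So the volume-based estimate lands somewhere between 10 and 20 indexers, before search load is taken into account.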
It's mostly based on data volume. Do you have an estimate on how much log data your Splunk setup will be handling per day?