Sizing Splunk for ES involves many factors; it is not one size fits all. Because everything in Splunk is a search, you need sufficient resources to cater for your users and the various parts of the Splunk environment alongside the ES functions.

As a rule of thumb for ES sizing we use 100GB of daily ingest per indexer (I have seen this go higher in some cases), so try to understand how much data comes into your Splunk environment per day. For large environments we typically put the ES search head on its own box. The reason is that as data comes into Splunk, the indexers are also writing that data to disk, serving it back to searches, running the datamodel acceleration searches behind the correlation rules, and feeding dashboards. On top of this you may have other users in ES or running ad-hoc searches, so there are many aspects to consider (CPU/RAM/IO/network); otherwise things get slow, and you need results in a timely fashion.

As a guide, 16 CPU / 32GB RAM is the minimum for indexers and search heads. With your 32 CPU / 32GB RAM you should be OK as a starting point, but that depends on the workload. Also check that the disk is SSD and delivers more than 800 IOPS, and make sure you are not sending so much data per day that your all-in-one (AIO) instance can no longer handle all of these functions - so keep a check on ingest per day.

How to check resource consumption of correlation searches? I would start with the Monitoring Console (MC) for the usage stats. It is very comprehensive: it shows the load and which searches are consuming memory, and this will help you with some aspects of resourcing. The MC ships with Splunk, so it should already be on your AIO - see my links below for reference, and the example searches at the end of this post.

Some tips:
- Only onboard the important data sources, and make sure they are CIM compliant via the TAs.
- Enable a few data models at a time, based on your use cases (the correlation rules you want to use), and keep monitoring the load over time via the MC; this will help you keep on top of the resources.

Here are some further links on the topics I have mentioned that you should read.

ES Performance Reference
https://docs.splunk.com/Documentation/ES/7.3.1/Install/DeploymentPlanning
MC Reference
https://docs.splunk.com/Documentation/Splunk/9.2.1/DMC/DMCoverview
Hardware Ref
https://docs.splunk.com/Documentation/Splunk/latest/Capacity/Referencehardware
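To keep a check on ingest per day, a quick search against the license usage log works well. This is just a sketch assuming the default license_usage.log on whichever instance acts as the license manager (on an AIO that is the same box):

    index=_internal source=*license_usage.log* type=Usage
    | eval GB=b/1024/1024/1024
    | timechart span=1d sum(GB) AS daily_ingest_GB

You can compare the result against the ~100GB per indexer rule of thumb mentioned above.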
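If you want to look at correlation search resource consumption outside the MC dashboards, you can query the same introspection data the MC is built on. Again only a sketch: the field names below come from the standard splunk_resource_usage sourcetype, and the "* - Rule" filter assumes your correlation searches keep the default ES naming.

    index=_introspection sourcetype=splunk_resource_usage component=PerProcess data.search_props.sid=* data.search_props.label="* - Rule"
    | stats max(data.mem_used) AS peak_mem_MB avg(data.pct_cpu) AS avg_cpu_pct BY data.search_props.label
    | sort - peak_mem_MB

That shows which correlation searches are heaviest on memory and CPU, which is the same information the MC surfaces in its search activity views.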