Hi guys,
one question.
We have a midsize Splunk environment, and the volume of data delivered for ingestion is increasing.
We need an architecture that can handle our high-performance data on one side and the normal data on the other.
High-performance data: a large volume of data that needs to be ingested very fast AND is under heavy search load from a specific, known user group.
Are there any suggestions?
One idea is to separate data ingestion into different streams, like this:
        +--------------------------------------------------------------+
        |                    Loadbalancer (Ingress)                     |
        +--------------------------------------------------------------+
            |                          |                          |
+-----------------------+  +-----------------------+  +-----------------------+
| Forwarder Grp 1 / HEC |  | Forwarder Grp 2 / HEC |  | Forwarder Grp 3 / HEC |
+-----------------------+  +-----------------------+  +-----------------------+
            |                          |                          |
+-----------------------+  +-----------------------+  +-----------------------+
| Indexer Cluster 1     |  | Indexer Cluster 2     |  | Indexer Cluster 3     |
| (High-Performance IDX)|  | (Normal IDX)          |  | (High-Performance IDX)|
+-----------------------+  +-----------------------+  +-----------------------+
            |                          |                          |
+-----------------------+  +-----------------------+  +-----------------------+
| Search Head Cluster   |  | Search Head Cluster   |  | Search Head Cluster   |
| for Power Users       |  | for OpenShift         |  | for Normal Users      |
+-----------------------+  +-----------------------+  +-----------------------+
            |                          |                          |
+-----------------------+  +-----------------------+  +-----------------------+
| Loadbalancer (SH1)    |  | Loadbalancer (SH2)    |  | Loadbalancer (SH3)    |
+-----------------------+  +-----------------------+  +-----------------------+
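On the forwarder side I imagine routing by input, roughly like this (only a sketch; the output group names, hostnames and the highperf index are placeholders, not our real setup):

# outputs.conf on the forwarders -- one tcpout group per indexer cluster
[tcpout]
defaultGroup = idx_normal

[tcpout:idx_normal]
server = idx-normal-1.example.com:9997,idx-normal-2.example.com:9997

[tcpout:idx_hiperf]
server = idx-hiperf-1.example.com:9997,idx-hiperf-2.example.com:9997

# inputs.conf -- send the high-performance sources to their own output group
[monitor:///var/log/highperf]
index = highperf
_TCP_ROUTING = idx_hiperf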
Is this realisable? Are there reference architectures with detailed descriptions of the other components and configuration items?
Best regards from Switzerland
Sascha
Hi @saschakoerner,
you can find the Splunk Validated Architectures at https://www.splunk.com/pdfs/technical-briefs/splunk-validated-architectures.pdf, but I suppose that's not what you're asking for.
For your requirements you need a customized design of your infrastructure, and this can only be done by a Certified Splunk Architect or Splunk PS after a deep analysis of your infrastructure and your data: it isn't a job for an answer in the Community!
Anyway, in general I don't like your idea of separating the ingestion data flows by performance.
At the data level, in my opinion you should ingest all the data as fast as possible, giving the architecture all the resources it needs (CPUs, RAM, fast disks for hot and warm data, possibly Heavy Forwarders to handle the parsing phase), and adding more resources or nodes if necessary.
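For the storage part, a minimal sketch of what I mean (index name and paths are only placeholders) is an indexes.conf that keeps hot/warm buckets on fast disks and cold buckets on cheaper storage:

# indexes.conf on the indexers
[highperf]
# hot/warm buckets on fast (e.g. SSD) storage
homePath = /fast_ssd/splunk/highperf/db
# cold and thawed buckets on cheaper bulk storage
coldPath = /bulk_storage/splunk/highperf/colddb
thawedPath = /bulk_storage/splunk/highperf/thaweddb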
Once the data is ingested, you can separate the front end by putting the high-performance apps in a dedicated Search Head Cluster (with adequate or redundant resources) and the other apps on one or more other Search Heads; this way you're sure that, at the search head level, the needed resources are dedicated to the high-performance searches.
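As a minimal sketch (the hostname and key are placeholders, and on older Splunk versions the setting is called master_uri instead of manager_uri), every search head of the dedicated cluster is simply attached to the same indexer cluster in server.conf:

# server.conf on each search head of the "power users" cluster
[clustering]
mode = searchhead
manager_uri = https://cluster-manager.example.com:8089
pass4SymmKey = <your_indexer_cluster_key>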
The indexers, obviously, stay the same and are shared by everyone.
At the same time, you have to put some limits on the less important searches, e.g. to prevent them from running real-time or very long time-range searches; you could also analyze your searches to identify the heaviest ones and optimize them if needed (e.g. using data models or accelerations).
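For example (the role name, the values and the data model name are only indicative), you could limit a role in authorize.conf and accelerate a heavy data model in datamodels.conf:

# authorize.conf -- limits for the less important roles
[role_normal_user]
importRoles = user
# maximum allowed time range of a search, in seconds (here 1 day)
srchTimeWin = 86400
# maximum concurrent historical searches per user
srchJobsQuota = 3
# no concurrent real-time searches
rtSrchJobsQuota = 0
# maximum disk usage for search artifacts, in MB
srchDiskQuota = 500

# datamodels.conf -- accelerate a heavy data model over the last 7 days
[my_heavy_datamodel]
acceleration = true
acceleration.earliest_time = -7d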
Ciao.
Giuseppe