Deployment Architecture

Sufficiency of Hardware Specs at indexer layer

hectorvp
Communicator

Hello Splunkers,

We will have around 400 UFs, each forwarding about 1 GB of events per day, for a total estimated daily ingestion of 400 GB.

Our main aim is to forward these events to the customer's indexers (the customer also has an indexer cluster). However, the customer strongly requires us to provide validation of the logs, so we need to store them on our indexers as well (we are OK with the extra license consumption).

We have decided to use an indexer cluster with 2 indexers on our side as well.

So our indexer cluster will perform a dual role (storing events + forwarding them, with names anonymized in the events) using the indexAndForward configuration.
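For reference, the dual role described above is usually enabled with a stanza like the following in outputs.conf on the indexers. This is only a sketch: the group name and server addresses are hypothetical, and the SEDCMD example in props.conf assumes a made-up sourcetype. Note that an index-time SEDCMD runs during parsing, so the anonymized form would land in both the local index and the forwarded copy.

```
# outputs.conf on our indexers -- "customer_indexers" is a hypothetical group name
[indexAndForward]
index = true

[tcpout]
defaultGroup = customer_indexers

[tcpout:customer_indexers]
server = cust-idx1.example.com:9997, cust-idx2.example.com:9997

# props.conf -- hypothetical anonymization of a username field at parse time
[my_sourcetype]
SEDCMD-anon_names = s/user=\w+/user=REDACTED/g
```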

We are not allowed to use an HF in between, because the customer is concerned that different versions of the same event could be received (one edited version for us and a different edit for them). They also won't allow sending the same event directly from the UFs to both their indexers and ours.

Our indexers each have 32 GB RAM, 24 vCPUs & xx TB of disk (RAID 10).

Everything is in a single data centre.

We have mainframe logs requirement in future as well.

Will this suffice for our needs?

Someone told me that if I send events from the UFs directly to the indexers, it will open multiple queues on the indexers and severely hamper performance. Is that true? (I don't believe so.)

Can someone let me know how to estimate how many indexers are required based on daily ingestion capacity, considering RF=2 & SF=2 or RF=1 & SF=1?

I've attached a diagram for better understanding.


gcusello
SplunkTrust

Hi @hectorvp,

At first, I don't understand why you say "We are not allowed to use HF in between": a Splunk server that forwards logs to an indexer (possibly also indexing some or all of them locally) is usually called a Heavy Forwarder, not an indexer!

The second thing I don't understand is what you mean by "validate": if you mean filtering data before indexing, you can also do this on indexers.

Then, "edited version for us & different edit for them": you cannot edit data, you can index a data on your HFs and send to other systems elaborated data, but in this way you haven't the original data.

Maybe the best approach is to index the original data on the customer's indexers and extract the "validated" data into a summary index to use for your searches (this way you avoid double license consumption).
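The summary-index approach above is typically a scheduled search that writes its results with the collect command. A minimal SPL sketch, assuming a hypothetical source index, sourcetype, and summary index name:

```
index=customer_data sourcetype=my_sourcetype
| eval user="REDACTED"
| collect index=validated_summary
```

Events written to a summary index with collect are not counted against the daily indexing license, which is why this avoids the double consumption.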

Anyway, an indexer can index around 150-200 GB/day, but Splunk best practices say it's better to stay at a max of 100 GB/day per indexer, especially when the indexer also has to answer searches.

So, two indexers are too few to manage 400 GB/day.
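To answer the sizing question concretely, the estimate can be sketched as simple arithmetic. This is a rough rule-of-thumb calculation, not an official sizing tool: the 100 GB/day per indexer figure is the guideline mentioned above, and the storage ratios (compressed rawdata at roughly 15% of raw volume, kept RF times; tsidx index files at roughly 35%, kept SF times) are commonly cited approximations that vary with the data.

```python
import math

def indexers_needed(daily_gb, per_indexer_gb=100):
    """One indexer per ~100 GB/day, per the guideline above."""
    return math.ceil(daily_gb / per_indexer_gb)

def daily_storage_gb(daily_gb, rf=2, sf=2, raw_ratio=0.15, index_ratio=0.35):
    """Approximate daily disk need across the whole cluster:
    compressed rawdata (~15% of raw) is kept RF times,
    tsidx index files (~35% of raw) are kept SF times."""
    return daily_gb * (raw_ratio * rf + index_ratio * sf)

print(indexers_needed(400))               # 4 indexers for 400 GB/day
print(daily_storage_gb(400, rf=2, sf=2))  # ~400 GB of disk per day of retention
print(daily_storage_gb(400, rf=1, sf=1))  # ~200 GB per day with RF=1/SF=1
```

Multiply the daily storage figure by your retention period (in days) to size the total disk, then divide by the number of indexers for the per-node requirement.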

About your indexer specs, they are fine, maybe even excessive, but "melius abundare quam deficere"!

Ciao.

Giuseppe
