
Multiple HF for one Event Hub | Splunk Add-on for Microsoft Cloud Services

GaetanVP
Contributor

Hello Splunkers,

I configured my HF to pull data from an Event Hub. All good, I'm receiving logs, but too many of them (around 130 GB/day), and my HF often has trouble parsing and forwarding the logs during the data peaks.

I wanted to use an additional HF to share the work, but I don't know how to proceed. If I configure the Add-on on this new HF the same way I did for the first one, I will just end up with duplicated data...

Would you have any ideas?

Thanks,
GaetanVP

1 Solution

gcusello
SplunkTrust

Hi @GaetanVP,

you have two solutions:

  • the one hinted at by @isoutamo: use two HFs, each one with different Data Sources;
  • give more resources to your HF, so that it's able to manage all the data.

Both are efficient:

  • the first has the advantage that you also get two network interfaces to use;
  • the second is easier to implement if you are using a virtual HF.

I'd try the second first, monitoring whether the network connection is able to accept the traffic: if yes, you have solved the issue; if not, you can add a second HF.
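For the first option, here is a minimal sketch of how the split could look in each HF's inputs.conf for the Splunk Add-on for Microsoft Cloud Services. The stanza type and parameter names are quoted from memory, and the account, namespace, and hub names are hypothetical, so verify everything against your installed version of the add-on:

    # HF #1 - local/inputs.conf: pulls only the first Event Hub
    [mscs_azure_event_hub://eventhub_operational]
    account = my_azure_account
    event_hub_namespace = mycompany-ns.servicebus.windows.net
    event_hub_name = eh-operational
    consumer_group = $Default
    index = azure

    # HF #2 - local/inputs.conf: pulls only the second Event Hub
    [mscs_azure_event_hub://eventhub_security]
    account = my_azure_account
    event_hub_namespace = mycompany-ns.servicebus.windows.net
    event_hub_name = eh-security
    consumer_group = $Default
    index = azure

The point is that the two HFs own disjoint sets of inputs, so nothing is pulled twice; note that this only helps if your 130 GB/day is spread across more than one Event Hub.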

Ciao.

Giuseppe

GaetanVP
Contributor

Hello @gcusello, thanks for your answer, it makes sense.

However, do we agree that both solutions still represent a single point of failure? If the HF (or one of the two HFs) goes down, I will miss all the logs (or at least some of them).

Thanks,
GaetanVP

gcusello
SplunkTrust

Hi @GaetanVP,

it would be a Single Point of Failure if you used the HF to receive data pushed to it, but you are using this HF to pull data from the Cloud, so it isn't mandatory for it to be redundant: events stay in the Event Hub (within its retention period) until a consumer pulls them, so an HF outage delays ingestion rather than losing logs outright.

You could also have a cold copy of it ready to be turned on.
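As a minimal sketch of the cold-copy idea (same hypothetical input names as above, assuming the add-on honours the standard disabled key in inputs.conf), the standby HF carries an identical stanza that stays disabled until you need it:

    # Standby HF - identical stanza to the primary, kept disabled
    [mscs_azure_event_hub://eventhub_operational]
    account = my_azure_account
    event_hub_namespace = mycompany-ns.servicebus.windows.net
    event_hub_name = eh-operational
    consumer_group = $Default
    disabled = 1

    # On failover: set disabled = 0 (or enable the input from the UI) and restart Splunk.

Keep in mind the standby will not have the primary's checkpoints, which ties into the duplicate-events caveat discussed below.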

Ciao.

Giuseppe

isoutamo
SplunkTrust

Currently, the issue with an HA HF setup is replicating checkpoints to avoid duplicate data. As far as I know, there is currently no official way to replicate checkpoints across several HFs. Of course, you could use some synchronization method between the primary and secondary nodes to prepare for a failure situation, but quite probably you will get at least some duplicated events when a failover happens.

isoutamo
SplunkTrust

Hi,

I haven't tried this myself, but could you use several consumer groups and then configure each HF to pull only one or some of them? https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-features
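If the add-on exposes the consumer_group parameter (it does in the versions I have seen), the idea might look like the hypothetical sketch below. One caveat straight from the linked page: each consumer group is an independent view of the entire stream, so test whether this actually splits the load rather than duplicating the feed:

    # HF #1 - local/inputs.conf: dedicated consumer group (all names hypothetical)
    [mscs_azure_event_hub://eh_feed]
    account = my_azure_account
    event_hub_namespace = mycompany-ns.servicebus.windows.net
    event_hub_name = eh-operational
    consumer_group = splunk-hf1

    # HF #2 - local/inputs.conf: same hub, different consumer group
    [mscs_azure_event_hub://eh_feed]
    account = my_azure_account
    event_hub_namespace = mycompany-ns.servicebus.windows.net
    event_hub_name = eh-operational
    consumer_group = splunk-hf2

The consumer groups themselves (splunk-hf1, splunk-hf2) would have to be created on the Event Hub side first.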

r. Ismo
