Getting Data In

Any data forwarding issues when using data cloning with different indexers?

performancemoni
Path Finder

Hello,

We have a question about a specific data forwarding use case, and we would like to know whether it carries any risk.
Let's say we have two Splunk platforms: one has a set of indexers (indexer1) that stores data for a specific use case, and the other has its own indexer (indexer2) for another use case. At some point, these two platforms will have to collect data from the same machines; not necessarily the same data, but it could be.

So there is a forwarder on a remote machine, and let's say we have to collect the same file and forward it to both platforms. We are using the data cloning technique, for example:

In inputs.conf

[monitor://file_path]
index = …
sourcetype = ...
_TCP_ROUTING=indexer1,indexer2

Or with outputs.conf

[tcpout]
defaultGroup=indexer1,indexer2

Or by using props/transforms to alter the routing of the events on an intermediate heavy forwarder (please do tell us if any of these methods makes a significant difference in this situation).
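
For illustration, the props/transforms variant could look roughly like the sketch below. It assumes the same two tcpout groups (indexer1, indexer2) are defined in outputs.conf on the heavy forwarder; the sourcetype name my_sourcetype and the stanza name clone_to_both_platforms are placeholders, not names taken from a real deployment.

```ini
# props.conf on the intermediate heavy forwarder
# "my_sourcetype" is a placeholder sourcetype name
[my_sourcetype]
TRANSFORMS-clone_routing = clone_to_both_platforms

# transforms.conf on the same heavy forwarder
# "clone_to_both_platforms" is a placeholder stanza name
[clone_to_both_platforms]
# Match every event of that sourcetype
REGEX = .
# Overwrite the routing key so each event is sent to both tcpout groups
DEST_KEY = _TCP_ROUTING
FORMAT = indexer1,indexer2
```

This only works on an instance that parses events (a heavy forwarder or indexer), since transforms are not applied by a universal forwarder.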

Now the question is: in general, is there a risk that one of the two platforms will stop receiving events if the other is down, depending on the configuration of the indexers/forwarders?

We have heard of the concept of indexer acknowledgment, but we are not sure whether it has any impact on this situation. For example, if the group indexer1 is configured with acknowledgment enabled, is there any risk that the group indexer2 won't receive data when indexer1 is not acknowledging receipt?

This topic is a little confusing for us. We have heard claims that data forwarding can be blocked when another platform needs to receive the same data and one of the indexers is down, but that doesn't seem right. We just want to clarify whether the setup described above would cause any issue.

Thank you very much for your help


Update:

We learned about the parameters in outputs.conf to consider when configuring the behavior of the queues:
- dropEventsOnQueueFull =
- dropClonedEventsOnQueueFull =
- blockOnCloning =

It is indeed possible to block data collection on a Splunk instance (in a data cloning configuration) if you don't pay attention to these parameters. In the small test we did, the default value of dropClonedEventsOnQueueFull meant that data collection didn't block. However, we also have to watch out for dropEventsOnQueueFull, which (with its default value) can cause data forwarding issues when a Splunk instance is unavailable. It also depends on whether or not you can accept data loss in your deployment.
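
As a rough sketch of where these settings live, assuming the two clone groups from the question: the host names below are placeholders, and the values shown are the defaults as we understand them from outputs.conf.spec, so please verify them against the spec for your Splunk version before relying on them.

```ini
# outputs.conf on the forwarder (host names are placeholders)
[tcpout]
defaultGroup = indexer1,indexer2
# -1: never drop events when an output queue is full, i.e. block instead
dropEventsOnQueueFull = -1
# Seconds to wait on a blocked clone group before dropping events for that group only
dropClonedEventsOnQueueFull = 5
# true: if all clone groups are down, block rather than drop
blockOnCloning = true

[tcpout:indexer1]
server = indexer1.example.com:9997
# Indexer acknowledgment enabled for this group only
useACK = true

[tcpout:indexer2]
server = indexer2.example.com:9997
# useACK left at its default (false)
```

With values like these, a single unavailable clone group should start losing its own copy of the events after the wait, while the other group keeps receiving data; setting dropClonedEventsOnQueueFull to -1 instead trades that for blocking until every group can accept events.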

Very interesting parameters to know about.

richgalloway
SplunkTrust

If a forwarder cannot send data to both indexers then it will not send the data at all. It will be queued until all destinations are available. It doesn't seem right, but it is.

---
If this reply helps you, Karma would be appreciated.

performancemoni
Path Finder

Your comment is interesting. It might not answer my specific question, but maybe it raises another issue that we didn't anticipate.

So wait, is this true even if the group indexer1 has acknowledgment enabled and the group indexer2 has the default setting (acknowledgment disabled)?
And do you mean that if both groups are down, data forwarding will only restart when both groups are up (at least one indexer available in each group)?

I would have thought that if the group indexer2 doesn't care about acknowledgment, data would still be forwarded to it anyway, even if the group indexer1 is unavailable.

richgalloway
SplunkTrust

Giving a forwarder two outputs means the forwarder must send its data to two destinations. If either destination is blocked for any reason (no ACK, indexer is down, etc.) then the other destination is treated as though it is also blocked.

---
If this reply helps you, Karma would be appreciated.

performancemoni
Path Finder

Are you sure about this? Is this behavior documented somewhere?

I just tested the following:

  • Configured a tcpout group that refers to a nonexistent indexer

    [tcpout:dummy_output_group]
    server=dummy_indexer:9997

  • Configured an input that sends the data both to a group of existing indexers (indexer) and to dummy_output_group

    [monitor:///splunk/etc/apps/test_app_data_forwarding/test_file.txt]
    index = main
    sourcetype = test_data_forwarding
    _TCP_ROUTING = indexer, dummy_output_group

I refreshed the configuration of the Splunk server, and it picked up the new dummy output group. The group indexer correctly indexed all the events from the test file (even beyond the max queue size of the dummy output group, which started to drop events).
Doesn't this test represent a case where one output group is unavailable? Yet the data is still collected by the other group.

Am I missing something ?

rafiki31
Engager

Hi, I've experienced both cases.

In a lab, I observed a UF totally stop forwarding data through any output as soon as one of them went down.

On the other hand, in production, I have also seen some UFs continue to work without any problem with a misconfigured output group and no custom settings.

I think the difference comes from the output type. For example, if both groups are tcpout, the events are considered cloned. If one group is tcpout and the other is syslog (the case in my past lab), the events are not considered cloned.

This is just an intuition that should be verified in a lab, but it's the best I have for now.

@splunk: a short definition of a "cloned" event would be very useful here.

richgalloway
SplunkTrust

I have not seen this behavior documented anywhere. It's been passed to me by other Splunk admins. Your test does seem to contradict it, however.

---
If this reply helps you, Karma would be appreciated.