UF stops forwarding when Splunk Cloud is down

randy_moore

If you read the title, you are probably thinking "well, of course it does", but hear me out. (This is a long explanation that will hopefully answer the immediate questions.)

Background:
We have some on-prem UFs that forward "everything" to our on-prem Splunk Enterprise indexers AND specific logs to our Splunk Cloud instance. In case you are wondering, the cloud instance is where our customer can look at their data without needing access to our internal systems.

Problem:
Splunk did some maintenance on our cloud instance, and while it was down, data from the UFs also stopped arriving at our on-prem Splunk. I can't figure out why the cloud instance being down would stop the forwarders from sending to Enterprise.

Checking the documentation here: https://docs.splunk.com/Documentation/Splunk/8.1.0/Forwarding/Setuploadbalancingd#Configure_universa...

It reads like the UFs should switch to the next indexer when one goes down. But they didn't. Instead, we saw this in the internal logs when the cloud instance was taken down for maintenance:

11-25-2020 21:59:48.139 -0600 WARN TcpOutputProc - The TCP output processor has paused the data flow. Forwarding to output group splunkcloud has been blocked for 1200 seconds. This will probably stall the data flow towards indexing and other network outputs. Review the receiving system's health in the Splunk Monitoring Console. It is probably not accepting data.
 
Looking at inputs.conf and outputs.conf, I can't see anything in them that would cause the data from these UFs to be blocked.

Sanitized inputs.conf. The last stanza (metrics.log) is the log that gets sent to both the on-prem instance (pp_indexers) and the cloud instance (splunkcloud):

[monitor://C:\blahblahblah\q2.log]
_TCP_ROUTING = pp_indexers
index = fsd
sourcetype = q2

[monitor://C:\blahblahblah\wrapper.log]
_TCP_ROUTING = pp_indexers
index = fsd_sandbox
sourcetype = wrapper

[monitor://C:\blahblahblah\metrics.log]
_TCP_ROUTING = pp_indexers,splunkcloud
index = fsd_sandbox
sourcetype = metrics


Sanitized outputs.conf: 

defaultGroup = pp_indexers
forceTimebasedAutoLB = true
autoLBFrequency = 15

[tcpout:pp_indexers]
server = indexer1.ip.address.here:9997, indexer2.ip.address.here:9997



[tcpout:splunkcloud]
compressed = false
disabled = false
server = our_domain_name.cloud.splunk.com:9997
sslCommonNameToCheck = our_domain_name.cloud.splunk.com
sslCertPath = $SPLUNK_HOME/etc/apps/sanitized/client.pem
sslPassword = sanitized
sslRootCAPath = $SPLUNK_HOME/etc/apps/sanitized/cacert.pem
sslVerifyServerCert = true
useACK = true
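
A possibly relevant setting, offered as a hedged sketch rather than a confirmed fix: the outputs.conf spec describes dropEventsOnQueueFull, whose default (-1) makes a full output queue block, and notes that when any target group's queue is blocked, no more data reaches the other target groups. That would be consistent with the TcpOutputProc warning above. The values below are illustrative assumptions that have not been tested against this environment, and the spec cautions against a positive value when monitoring files, because events for the blocked group are discarded rather than retried.

```
[tcpout:splunkcloud]
# Assumption: once this group's queue has been full for 30 seconds, drop new
# events bound for splunkcloud instead of blocking the whole pipeline.
# The default (-1) blocks, which can also stall pp_indexers.
dropEventsOnQueueFull = 30
# Assumption: a larger in-memory queue to ride out short outages.
# The size is illustrative only.
maxQueueSize = 100MB
```

If the setting behaves as documented, pp_indexers should keep receiving during a cloud outage, at the cost of losing the splunkcloud copy of metrics.log for that period. Whether that trade-off is acceptable depends on how much the cloud copy matters.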

Oh, and just in case you need it:
UF versions are 7.1.2 and 7.2.3.
Enterprise version is 7.3.4; cloud is 7.3.
