UF stops forwarding when Splunk Cloud is down

randy_moore

If you read the title, you are probably thinking "well, of course it does", but hear me out. (This is a long explanation that will hopefully answer the immediate questions.)

Background:
We have some on-prem UFs that forward "everything" to our on-prem Splunk Enterprise indexers AND specific logs to our Splunk Cloud instance. In case you are wondering, the cloud instance is where our customer can look at their data without needing access to our internal systems.

Problem:
Splunk did some maintenance on our cloud instance, and while it was down, data from these UFs also stopped arriving at our on-prem Splunk. I can't figure out why the cloud instance being down would stop the forwarders from sending to Enterprise.

Checking the documentation here: https://docs.splunk.com/Documentation/Splunk/8.1.0/Forwarding/Setuploadbalancingd#Configure_universa...

It reads like the UFs should switch to the next indexer when one goes down. But they didn't. Instead, we saw this in the internal logs when the cloud instance was taken down for maintenance:

11-25-2020 21:59:48.139 -0600 WARN TcpOutputProc - The TCP output processor has paused the data flow. Forwarding to output group splunkcloud has been blocked for 1200 seconds. This will probably stall the data flow towards indexing and other network outputs. Review the receiving system's health in the Splunk Monitoring Console. It is probably not accepting data.
 
Looking at inputs.conf and outputs.conf, I can see nothing in them that would cause data from these UFs to be blocked.

Sanitized inputs.conf. The last stanza (metrics.log) is the log that gets sent to both the on-prem instance (pp_indexers) and the cloud instance (splunkcloud):

[monitor://C:\blahblahblah\q2.log]
_TCP_ROUTING = pp_indexers
index = fsd
sourcetype = q2

[monitor://C:\blahblahblah\wrapper.log]
_TCP_ROUTING = pp_indexers
index = fsd_sandbox
sourcetype = wrapper

[monitor://C:\blahblahblah\metrics.log]
_TCP_ROUTING = pp_indexers,splunkcloud
index = fsd_sandbox
sourcetype = metrics


Sanitized outputs.conf: 

defaultGroup = pp_indexers
forceTimebasedAutoLB = true
autoLBFrequency = 15

[tcpout:pp_indexers]
server = indexer1.ip.address.here:9997, indexer2.ip.address.here:9997



[tcpout:splunkcloud]
compressed = false
disabled = false
server = our_domain_name.cloud.splunk.com:9997
sslCommonNameToCheck = our_domain_name.cloud.splunk.com
sslCertPath = $SPLUNK_HOME/etc/apps/sanitized/client.pem
sslPassword = sanitized
sslRootCAPath = $SPLUNK_HOME/etc/apps/sanitized/cacert.pem
sslVerifyServerCert = true
useACK = true
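
The only related setting I could find in the outputs.conf spec is dropEventsOnQueueFull. As I read the spec, an output queue blocks rather than drops when it fills up by default, and a blocked group can hold back data for the other groups, which would match the "stall the data flow towards indexing and other network outputs" wording in the warning above. Below is a sketch of what the cloud stanza might look like with that setting added. I have not tested this, the 30-second value is an arbitrary assumption, and the spec warns against a positive value when monitoring files because events for that group are thrown away once the timeout passes.

```
[tcpout:splunkcloud]
# Existing settings (server, ssl*, useACK, ...) stay as they are.
# Assumption: wait up to 30 seconds for queue space, then drop new
# events for this group instead of blocking the other output groups.
dropEventsOnQueueFull = 30
```

With useACK = true on this group, it is also unclear to me how the two settings interact, so treat this as a starting point for discussion rather than a fix.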

Oh, and just in case you need it:
UF versions are 7.1.2 and 7.2.3.
Enterprise version is 7.3.4; cloud is 7.3.
