
Why is the heavy forwarder not sending data to Splunk Cloud after a certificate upgrade?

MuhdMurad
Loves-to-Learn Lots

Hello Team,

A few of our HFs were configured to send logs to syslog-ng, a local server used for log storage. After upgrading the certificates on those forwarders, logs stopped coming into Splunk. Forwarders that are not configured to send data to syslog-ng are working fine.

We tried removing the syslog-ng config from the HF settings, but still no data is coming in.

Any ideas or thoughts on this? Has anyone had a similar issue before? Is a certificate upgrade needed on the syslog-ng server as well?

Thanks in advance.

Muhammad Murad


MuhdMurad
Loves-to-Learn Lots

Thank you, everyone. The issue was resolved after I removed the syslog-ng configuration under the local directory. In other words, all forwarders that have a syslog-ng/additional outputs.conf will have this issue after the certificate upgrade. Once the config was removed, logs flowed normally. What we are going to do now is consider using DDSS instead of the syslog-ng output.
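For reference, the removed stanza was of this general shape - the group name and server address here are taken from the btool dump shared later in this thread, so yours may differ:

```
# Syslog output stanza removed from $SPLUNK_HOME/etc/system/local/outputs.conf
# (group name and address are from the btool dump below; adjust for your setup)
[syslog:kr_syslogng_group]
server = 10.126.137.234:514
type = tcp
```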


PickleRick
Ultra Champion

You typically don't use certs with plain syslog, so it's somewhat confusing where the certs and encryption are used in your setup in the first place.

Check the usual suspects. Do a btool dump of the outputs config. Verify the network configuration. Run tcpdump and see if any packets are being sent. Run tcpdump on the receiving server as well...
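A sketch of those checks as commands - the install path, interface name, and port are assumptions for a default-style setup, so adjust them for your environment:

```shell
# Dump the effective outputs configuration and where each value is defined
/opt/splunk/bin/splunk btool outputs list --debug

# On the forwarder: check whether packets are actually leaving
# (eth0 and 9997 are placeholders for your interface and receiving port)
sudo tcpdump -ni eth0 port 9997

# On the receiving server: check whether packets are arriving
sudo tcpdump -ni eth0 port 9997
```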


MuhdMurad
Loves-to-Learn Lots

Agreed. That's the reason we are a little confused about why only the forwarders that had a syslog-ng config have issues after the cert upgrade. The others, which don't have the syslog-ng setting, are working fine.

We did check the network communication, and everything looks good. The connection is successfully established.

We will check the btool dump of the outputs config and run tcpdump on both servers to see whether any logs are being sent/received.


PickleRick
Ultra Champion

That's why I'd expect this is a coincidence with a completely different problem. Maybe someone made a config change but didn't restart the splunkd process, and the change was finally applied along with your cert change?


MuhdMurad
Loves-to-Learn Lots

Potentially, but only 3 forwarders are having this issue, and only these 3 have the outputs.conf setting to forward data to syslog-ng. All 3 are under our team, so it is unlikely that someone made changes without notifying us.

We also tried removing the config just to be sure, but that did not solve the issue either. Hence we need some insight here.

Thank you.


PickleRick
Ultra Champion

OK. I re-read your initial question and I'm a bit confused. From what I understand, you have some forwarders. They send (or at least are supposed to send) the events to your indexer(s). Three of them also have another output defined - a syslog one sending the events to your syslog-ng server.

And now what happened? You "upgraded the certification". What does that mean? Does it mean that your HF->idx connection used to be unencrypted and you have now configured encryption, or did you simply renew your certificates? And how broad was the change? Across all your forwarders? Did you change anything on the input side (like configuring encryption, or renewing the certificates if the traffic was already encrypted)?

And what works now and what doesn't? Because it's not very obvious - do the inputs work? Which outputs don't work? What do you have on those HFs in splunkd.log?


MuhdMurad
Loves-to-Learn Lots

It happened after I renewed the certificates as suggested by Splunk. All our forwarders needed to download and renew the certificates. The other forwarders have no issue; only the 3 forwarders that had the additional outputs.conf sending to the syslog-ng server have the issue.

In other words, once we renewed the certificates on those forwarders (the ones with the additional outputs.conf sending to syslog-ng), logs stopped being sent to Splunk Cloud.

No changes were made on the input side. I believe the UFs are sending data to the HF, but the HF is not forwarding the data to Splunk. Currently we are rerouting the logs through other forwarders that don't have the syslog-ng config, and we can see logs coming in as normal.

I am also a little confused, since we did check and confirm that the connection between the HF and the syslog-ng server is established. It all happened right after renewing the certificate.

Is any additional config needed for forwarders that have an additional outputs.conf to syslog after renewing the cert?

Internal logs are coming in normally from the HF to Cloud.

 


isoutamo
SplunkTrust

Hi

You probably had some special configuration in your old outputs.conf that handled the two-way routing you set up manually. Have you just installed SC's certificate app, or have you also updated your local outputs.conf to use the new certs?

The best way to check which outputs.conf settings are in use is:

splunk btool outputs list --debug

That will tell you which configurations (like certs) are in use and from which files they are defined.

r. Ismo 


MuhdMurad
Loves-to-Learn Lots

Hello, 

Thanks. The syslog-ng config was configured in outputs.conf under local, as suggested by Splunk previously. We just renewed the certificate; we did nothing in outputs.conf to use the new certs.

How/what do we need to update in outputs.conf to use the new certs? Are there any links explaining this? I can try, since we did not touch anything in the config - we just renewed the certificates and the issue happened.
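If the renewed certificate lives in a new file, the tcpout group's clientCert setting (and its sslPassword, if the private key is encrypted) would need to point at it. A sketch of a hypothetical local override - the file path and password are placeholders; the stanza name matches the output group in the btool dump below:

```
# $SPLUNK_HOME/etc/system/local/outputs.conf (hypothetical override;
# the certificate path and password are placeholders)
[tcpout:splunkcloud_20220309_2a3a6bb51c7c7db014655a134c893643]
clientCert = $SPLUNK_HOME/etc/auth/mycerts/renewed_client.pem
sslPassword = <private key password, if the key is encrypted>
sslVerifyServerCert = true
```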

Below is the btool result:

[splunk@ip-10-125-17-91 bin]$ /opt/splunk/bin/splunk btool outputs list --debug

/opt/splunk/etc/system/local/outputs.conf                       [indexAndForward]

/opt/splunk/etc/system/local/outputs.conf                       index = false

/opt/splunk/etc/system/default/outputs.conf                     [syslog]

/opt/splunk/etc/system/default/outputs.conf                     maxEventSize = 1024

/opt/splunk/etc/system/default/outputs.conf                     priority = <13>

/opt/splunk/etc/system/default/outputs.conf                     type = udp

/opt/splunk/etc/system/local/outputs.conf                       [syslog:kr_syslogng_group]

/opt/splunk/etc/system/local/outputs.conf                       server = 10.126.137.234:514

/opt/splunk/etc/system/local/outputs.conf                       type = tcp

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   [tcpout]

/opt/splunk/etc/system/default/outputs.conf                     ackTimeoutOnShutdown = 30

/opt/splunk/etc/system/default/outputs.conf                     autoLBFrequency = 30

/opt/splunk/etc/system/default/outputs.conf                     autoLBVolume = 0

/opt/splunk/etc/system/default/outputs.conf                     blockOnCloning = true

/opt/splunk/etc/system/default/outputs.conf                     blockWarnThreshold = 100

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   channelReapInterval = 60000

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   channelReapLowater = 10

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   channelTTL = 300000

/opt/splunk/etc/system/default/outputs.conf                     cipherSuite = ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:AES256-GCM-SHA384:AES128-GCM-SHA256:AES128-SHA256:ECDH-ECDSA-AES256-GCM-SHA384:ECDH-ECDSA-AES128-GCM-SHA256:ECDH-ECDSA-AES256-SHA384:ECDH-ECDSA-AES128-SHA256

/opt/splunk/etc/system/default/outputs.conf                     compressed = false

/opt/splunk/etc/system/default/outputs.conf                     connectionTTL = 0

/opt/splunk/etc/system/default/outputs.conf                     connectionTimeout = 20

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf defaultGroup = splunkcloud_20220309_2a3a6bb51c7c7db014655a134c893643

/opt/splunk/etc/system/default/outputs.conf                     disabled = false

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   dnsResolutionInterval = 300

/opt/splunk/etc/system/default/outputs.conf                     dropClonedEventsOnQueueFull = 5

/opt/splunk/etc/system/default/outputs.conf                     dropEventsOnQueueFull = -1

/opt/splunk/etc/system/default/outputs.conf                     ecdhCurves = prime256v1, secp384r1, secp521r1

/opt/splunk/etc/system/default/outputs.conf                     forceTimebasedAutoLB = false

/opt/splunk/etc/system/default/outputs.conf                     forwardedindex.0.whitelist = .*

/opt/splunk/etc/system/default/outputs.conf                     forwardedindex.1.blacklist = _.*

/opt/splunk/etc/system/default/outputs.conf                     forwardedindex.2.whitelist = (_audit|_internal|_introspection|_telemetry)

/opt/splunk/etc/system/default/outputs.conf                     forwardedindex.filter.disable = false

/opt/splunk/etc/system/default/outputs.conf                     heartbeatFrequency = 30

/opt/splunk/etc/system/local/outputs.conf                       indexAndForward = 1

/opt/splunk/etc/system/default/outputs.conf                     maxConnectionsPerIndexer = 2

/opt/splunk/etc/system/default/outputs.conf                     maxFailuresPerInterval = 2

/opt/splunk/etc/system/default/outputs.conf                     maxQueueSize = auto

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   negotiateNewProtocol = true

/opt/splunk/etc/system/default/outputs.conf                     readTimeout = 300

/opt/splunk/etc/system/default/outputs.conf                     secsInFailureInterval = 1

/opt/splunk/etc/system/default/outputs.conf                     sendCookedData = true

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   socksResolveDNS = false

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   sslPassword = $7$5EfPkE9EnHQx12YOSI1Kwga9fflT5fyblj/wzzHLgOdmxoHsfAbg0VQueyWoX11ovoWt1TIaefQfIoT/kZkGLUY3nqhb6doWv9h8xg267wL4egu0QWjXKT7WTt/j7sub

/opt/splunk/etc/system/default/outputs.conf                     sslQuietShutdown = false

/opt/splunk/etc/system/default/outputs.conf                     sslVersions = tls1.2

/opt/splunk/etc/system/default/outputs.conf                     tcpSendBufSz = 0

/opt/splunk/etc/system/default/outputs.conf                     useACK = false

/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf   useClientSSLCompression = true

/opt/splunk/etc/system/default/outputs.conf                     writeTimeout = 300

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf [tcpout:scs]

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf clientCert = $SPLUNK_HOME/etc/apps/100_amway_splunkcloud/default/amway_server.pem

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf compressed = true

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf disabled = 1

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf server = amway.forwarders.scs.splunk.com:9997

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf sslAltNameToCheck = *.forwarders.scs.splunk.com

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf sslVerifyServerCert = true

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf useClientSSLCompression = false

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf [tcpout:splunkcloud_20220309_2a3a6bb51c7c7db014655a134c893643]

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf clientCert = $SPLUNK_HOME/etc/apps/100_amway_splunkcloud/default/amway_server.pem

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf compressed = false

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf server = inputs1.amway.splunkcloud.com:9997, inputs2.amway.splunkcloud.com:9997, inputs3.amway.splunkcloud.com:9997, inputs4.amway.splunkcloud.com:9997, inputs5.amway.splunkcloud.com:9997, inputs6.amway.splunkcloud.com:9997, inputs7.amway.splunkcloud.com:9997, inputs8.amway.splunkcloud.com:9997, inputs9.amway.splunkcloud.com:9997, inputs10.amway.splunkcloud.com:9997, inputs11.amway.splunkcloud.com:9997, inputs12.amway.splunkcloud.com:9997, inputs13.amway.splunkcloud.com:9997, inputs14.amway.splunkcloud.com:9997, inputs15.amway.splunkcloud.com:9997

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf sslCommonNameToCheck = *.amway.splunkcloud.com

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf sslVerifyServerCert = true

/opt/splunk/etc/apps/100_amway_splunkcloud/default/outputs.conf useClientSSLCompression = true

[splunk@ip-10-125-17-91 bin]$


isoutamo
SplunkTrust

It seems that there are two different definitions for sending events to SC (Splunk Cloud):

  1. tcpout:scs
     - which is actually disabled
  2. tcpout:splunkcloud_20220309_2a3a6bb51c7c7db014655a134c893643
     - which is defined as the default output group

Are you using that default group for your inputs/routing, or do you have something else, like tcpout:scs, which is currently disabled?

There are also some differences between those two definitions. Some other settings have local definitions as well, such as:
  • negotiateNewProtocol
  • dnsResolutionInterval
  • socksResolveDNS
  • useClientSSLCompression
  • server

Can you check that those settings which have local values (/opt/splunk/etc/apps/100_amway_splunkcloud/local/outputs.conf) also have the same definitions/values on the forwarders that are working?
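One way to do that comparison - the hostnames in the file names are placeholders:

```shell
# Run on each forwarder to capture its effective outputs config:
/opt/splunk/bin/splunk btool outputs list --debug > /tmp/outputs_$(hostname).txt

# Copy the files to one host, then diff a working forwarder against a broken one
# (file names are placeholders for your hosts):
diff /tmp/outputs_working-hf.txt /tmp/outputs_broken-hf.txt
```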
 
 
 

PickleRick
Ultra Champion

First, I'd check splunkd.log on those HFs, and verify that the UFs are properly connecting to the HFs.

Since internal logs are getting ingested properly, the connection between the HFs and the indexers must be working, so check the "previous" step in the event path. Furthermore, if only the output from the HF to the indexers were broken, your syslog output should still be working. If it isn't, that strongly suggests your TLS change broke something "before" the HFs.
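To look for the symptom in the forwarder's own log, something like the following - the path assumes a default /opt/splunk install, and the grep patterns are just common TLS- and output-related component names:

```shell
# Watch for TLS/certificate errors as the forwarder tries to connect:
tail -f /opt/splunk/var/log/splunk/splunkd.log | grep -iE "ssl|tls|cert"

# Or review recent output- and TLS-related messages:
grep -iE "TcpOutputProc|SSLCommon" /opt/splunk/var/log/splunk/splunkd.log | tail -50
```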


MuhdMurad
Loves-to-Learn Lots

Thank you. I will arrange a time to get those UFs connecting to the problematic HF and collect the splunkd logs. Due to this issue, we are routing the logs through other HFs for now.

"TLS broken" - is there any link explaining this further, including resolution steps? I will also check these points.


PickleRick
Ultra Champion

In order to resolve the problem, you must first know what it is. For now we only know that _internal logs seem to be getting ingested properly from the "problematic" HFs, which means it's most probably not an HF output issue. It might be an input issue. We don't know how your inputs are configured on those HFs or how your UFs are configured.
