Deployment Architecture

Applying Quarantine .... Removing quarantine

abhayneilam
Contributor

Hi,
I have a UF with the Splunk_TA_nix app installed. It was working fine, but it suddenly started logging these errors in splunkd.log, and data is no longer sent to the indexers continuously.

02-03-2015 12:05:39.119 +0100 INFO  TailingProcessor - Could not send data to output queue (parsingQueue), retrying...
02-03-2015 12:05:50.632 +0100 WARN  TcpOutputProc - Cooked connection to ip=XXXXXXXXXXX:9997 timed out
02-03-2015 12:05:56.872 +0100 INFO  ExecProcessor - Ran script: /opt/SP/apps/splunkforwarder/Splunkforwarder-5.0/etc/apps/Splunk_TA_nix/bin/ps.sh, took 74.28 milliseconds to run, 11510 bytes read
02-03-2015 12:05:57.594 +0100 INFO  ExecProcessor - Ran script: /opt/SP/apps/splunkforwarder/Splunkforwarder-5.0/etc/apps/Splunk_TA_nix/bin/cpu.sh, took 1043.8 milliseconds to run, 1003 bytes read
02-03-2015 12:06:00.636 +0100 WARN  TcpOutputFd - Connect to XXXXXX:9997 failed. Connection refused
02-03-2015 12:06:00.636 +0100 ERROR TcpOutputFd - Connection to host=XXXXXXXXXXX:9997 failed
02-03-2015 12:06:00.636 +0100 WARN  TcpOutputProc - Applying quarantine to ip=XXXXXXXX port=9997 _numberOfFailures=2

My outputs.conf is as follows:

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server = AABBCCDD:9997
useACK=true
sendCookedData = true


[tcpout-server://AABBCCDD:9997]

Note: AABBCCDD is the IP of the load balancer server.

Data appears in the dashboard, but not continuously: sometimes the last 3 hours are missing, sometimes the last 30 minutes. Please help!

Cheers,

MuS
SplunkTrust

Hi abhayneilam,

Usually, if something suddenly stops working, there was a change somewhere.
Check the load balancer or the server OS, because "Connection refused" means that the target machine actively rejected the connection.
Check any firewall in between, and also consider routing settings.
Don't forget to check whether splunkd is running on the indexers.
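
One rough way to run the first of these checks from the forwarder host is a plain TCP probe. The sketch below is only an illustration: `probe_tcp` is a hypothetical helper name, and the host and port you pass in are whatever your outputs.conf points at. A "refused" result usually means nothing is listening on that port (or something actively rejects the connection), while a "timeout" result more often points at a firewall or routing problem in between.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt a single TCP connection and classify the outcome.

    Hypothetical helper for illustration; not part of Splunk.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target (or a device in front of it) actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout; packets may be dropped on the way.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. DNS resolution failure or no route to host.
        return f"error: {exc}"
```

For example, `probe_tcp("AABBCCDD", 9997)` would test the load balancer address from the outputs.conf above, and running the same probe against each indexer directly may help tell whether the load balancer or the indexers are the problem.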

May I ask why you are not using the universal forwarder's internal load-balancing method?

cheers, MuS

abhayneilam
Contributor

We have had this configured for the past year with no issues until yesterday. Suddenly, and I don't know why, the quarantine issue appeared:

02-03-2015 09:47:19.246 +0100 INFO TcpOutputProc - Removing quarantine from idx=XXXXX:9997
02-03-2015 09:47:39.247 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Applying quarantine to ip=XXXXX port=9997 _numberOfFailures=2
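
To see how often the forwarder goes in and out of quarantine, the relevant splunkd.log lines can be counted per target. This is a minimal sketch that assumes the two message formats shown in the excerpts in this thread ("Applying quarantine to ip=... port=..." and "Removing quarantine from idx=...:port"); `count_quarantine_events` is a hypothetical helper name, and other Splunk versions may log these messages differently.

```python
import re
from collections import Counter

# Matches the two TcpOutputProc quarantine messages seen in this thread.
# The target is captured without its port so both formats map to the same key.
QUARANTINE_RE = re.compile(
    r"TcpOutputProc - (Applying quarantine to|Removing quarantine from) "
    r"(?:ip|idx)=([^\s:]+)"
)


def count_quarantine_events(lines):
    """Count quarantine 'applied' and 'removed' events per target host."""
    counts = Counter()
    for line in lines:
        match = QUARANTINE_RE.search(line)
        if match:
            action = "applied" if match.group(1).startswith("Applying") else "removed"
            counts[(match.group(2), action)] += 1
    return counts
```

Feeding it the lines of splunkd.log (for example via `open(path)`) gives a quick picture of which target flaps most often.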

MuS
SplunkTrust

So what changed yesterday? You're obviously no longer able to connect to port 9997 on IP XXXXX.

abhayneilam
Contributor

Nothing was changed! That's why it is strange; it happened suddenly.

mmensch
Path Finder

Did you ever resolve this issue? If so, how?
