Getting Data In

Errors on OPSEC LEA Forwarder

sha1020
Explorer

Hi,

I have a heavy forwarder running the OPSEC LEA Add-on (version 3.1) and collecting logs from a Provider-1 with about 100 CMAs.

Load on the forwarder is rather high (~ 10-18), and splunkd.log on the forwarder contains a lot of messages like:

03-10-2016 12:05:42.812 +0100 WARN  HttpListener - Socket error from 127.0.0.1 while accessing /servicesNS/nobody/Splunk_TA_opseclea_linux22/configs/conf-opsec-entity-health/clm_xxxx: Broken pipe
03-10-2016 12:05:43.982 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=103963
[...]
03-10-2016 14:10:38.100 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=139931
03-10-2016 14:10:38.865 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=140866
03-10-2016 14:10:39.624 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=141386
03-10-2016 14:10:40.389 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=137119

These messages repeat every second.

Can someone tell me what these warnings mean and whether they can be turned off?

Thanks a lot.


ryandg
Communicator
 03-10-2016 12:05:42.812 +0100 WARN  HttpListener - Socket error from 127.0.0.1 while accessing /servicesNS/nobody/Splunk_TA_opseclea_linux22/configs/conf-opsec-entity-health/clm_xxxx: Broken pipe

This indicates that splunkd's HTTP server is maxing out its threads. The limits are set in the [httpServer] stanza of server.conf:

 maxThreads = <int>
     * Number of threads that can be used by active HTTP transactions.
       This can be limited to constrain resource usage.
     * If set to 0 (the default) a limit will be automatically picked
       based on estimated server capacity.
     * If set to a negative number, no limit will be enforced.
 maxSockets = <int>
     * Number of simultaneous HTTP connections that we'll accept simultaneously.
       This can be limited to constrain resource usage.
     * If set to 0 (the default) a limit will be automatically picked
       based on estimated server capacity.
     * If set to a negative number, no limit will be enforced.
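
As a rough sketch of what overriding these could look like on the heavy forwarder: the file path assumes a default install layout, and the values below are placeholders for illustration, not tuned recommendations. Per the spec text above, 0 keeps the automatic limit and a negative number removes the limit entirely.

```
# $SPLUNK_HOME/etc/system/local/server.conf (path is an assumption)
[httpServer]
# Placeholder values for illustration only; size them to the host.
# 0 = pick automatically (default), negative = no limit enforced.
maxThreads = -1
maxSockets = -1
```

A change to server.conf normally needs a splunkd restart to take effect. Removing the limits only helps if the host has spare capacity, so check whether the load drops afterwards.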

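The other warning (ConfMetrics, single_action=ACQUIRE_MUTEX) indicates that a bundle being pushed to the server is taking longer than Splunk's preferred threshold. Note that wallclock_ms is in milliseconds, so the values in your log correspond to waits of roughly 104 to 141 seconds.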
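
To see how bad the waits are over time, the wallclock_ms values can be pulled out of splunkd.log with a short script. This is only a sketch: the regular expression is derived from the sample lines in the question, and the function name `summarize_conf_metrics` and the `SAMPLE` list are made up for illustration.

```python
import re
from statistics import mean

# Pattern based on the ConfMetrics lines quoted in the question (an assumption
# about the format; adjust it if your splunkd.log lines look different).
CONF_METRICS = re.compile(
    r"WARN\s+ConfMetrics - single_action=(\w+) took wallclock_ms=(\d+)"
)


def summarize_conf_metrics(lines):
    """Group ConfMetrics warnings by action and report wait times in seconds."""
    waits = {}
    for line in lines:
        match = CONF_METRICS.search(line)
        if match:
            action, millis = match.group(1), int(match.group(2))
            waits.setdefault(action, []).append(millis / 1000.0)
    return {
        action: {
            "count": len(values),
            "min_s": min(values),
            "max_s": max(values),
            "mean_s": mean(values),
        }
        for action, values in waits.items()
    }


# Sample lines copied from the question; the HttpListener line is ignored.
SAMPLE = [
    "03-10-2016 12:05:42.812 +0100 WARN  HttpListener - Socket error from 127.0.0.1 while accessing /servicesNS/nobody/Splunk_TA_opseclea_linux22/configs/conf-opsec-entity-health/clm_xxxx: Broken pipe",
    "03-10-2016 12:05:43.982 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=103963",
    "03-10-2016 14:10:38.100 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=139931",
    "03-10-2016 14:10:38.865 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=140866",
    "03-10-2016 14:10:39.624 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=141386",
    "03-10-2016 14:10:40.389 +0100 WARN  ConfMetrics - single_action=ACQUIRE_MUTEX took wallclock_ms=137119",
]

if __name__ == "__main__":
    for action, stats in summarize_conf_metrics(SAMPLE).items():
        print(action, stats)
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, for example `summarize_conf_metrics(open(path))` with `path` pointing at the forwarder's splunkd.log.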

Honestly, with 100 CMAs you should NOT have it all on one dedicated HF, unless each has barely any activity, in which case why do you even have 100 CMAs? In my current environment we had to load balance 14 CMAs across 3 HFs dedicated purely to OPSEC; otherwise we lose massive amounts of packets and have performance issues.
