running down indexer congestion problems

tgiles
Path Finder

Hi,

I'm running into occasional errors from one of my indexers reporting "skipped indexing of internal audit event will keep dropping events until indexer congestion is remedied. Check disk space and other issues that may cause indexer to block."

I've run the following search to look for high queue values, but nothing actionable shows up during the timeframes when the messages appear:

index="_internal" source="*metrics.log" group="queue" earliest=-4h | timechart max(current_size) span=30m by name

I also checked for forwarders flooding my indexer, and nothing stood out.

According to SPL-37407, this is a known issue in 4.2.1, "most often tcpout-queue", but there's no real information on how to get it addressed. In fact, that's the only place the tcpout-queue is mentioned. So I have some questions:

  • Is there a search to query the status of the tcpout-queue on indexers?
  • Would adjusting maxQueueSize in outputs.conf on the search heads give me a bigger queue to work with before the errors start?
  • Any tips on how to troubleshoot indexer congestion issues? There's not a lot of information out there about how to handle them.

Thanks!

tom


yannK
Splunk Employee

Good advice: install the SOS app on the indexer and check the indexing performance.
If the queues are full, the cause can be:

  • slow disks (index queue), or congested rotation of buckets from homePath -> coldPath -> frozen
  • heavy parsing (parsing/aggregation queues) or non-optimized events
  • heavy load, the usual suspect
  • oversized metadata files -> upgrade to 4.3 ASAP

And remember that at some point you will need more than one indexer to scale with your volume.
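
To tie a full queue back to one of the causes above, it can help to chart each queue as a percentage of its capacity instead of its raw size. The search below is only a sketch: it assumes the queue lines in your metrics.log carry the current_size_kb and max_size_kb fields (the search in the question uses current_size, so check which fields your version actually logs), and the fill_pct field name and the 5-minute span are arbitrary choices.

```
index="_internal" source="*metrics.log" group="queue" earliest=-4h
| eval fill_pct = round(100 * current_size_kb / max_size_kb, 1)
| timechart span=5m max(fill_pct) by name
```

A second sketch counts how often each queue reported itself as blocked, assuming your version writes blocked=true on those metrics.log lines:

```
index="_internal" source="*metrics.log" group="queue" blocked="true" earliest=-4h
| timechart span=5m count by name
```

Queues back up from the blocked stage toward the input, so the furthest-downstream queue that is full is usually the one to investigate: the index queue suggests disk or bucket rotation, while the parsing/aggregation queues suggest parsing load.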


vr2312
Builder

@yannK, is it also possible for the congestion to occur because a lot of searches are targeting the indexer? We have premium apps (ITSI/ES) enabled in our environment. Could that be the case too?


splunk68
Path Finder

You could try the "Splunk on Splunk" app: http://apps.splunk.com/app/748

It will give you a good overview of what's happening on your indexer.
