
New Splunk TcpOutput persistent queue

hrawat
Splunk Employee

https://docs.splunk.com/Documentation/Splunk/9.4.0/ReleaseNotes/MeetSplunk#What.27s_New_in_9.4


Why a New Splunk TcpOutput Persistent Queue?

  • Connectivity is unavailable for an extended, scheduled period, but data transmission must resume once the connection is back up. Given enough storage, the tcpout output queue can persist all events to disk, instead of buying an expensive (and unsupported) third-party subscription to persist data to SQS/S3.
  • There are two tcpout output destinations and one is down for an extended period. If the down destination has a large enough PQ to persist data, the second destination is not blocked. The second destination blocks only once the PQ of the down destination is full.
  • No need to pay for third-party SQS and S3 puts.
  • A third-party/external S3 persistent queue introduces permanent additional latency (due to the detour through the external SQS/S3 queue). There is also a chance of losing events (events ending up in the SQS/S3 DLQ).
  • Third-party/external SQS/S3 persistent queuing requires batching events to reduce the cost of SQS/S3 puts, which adds further latency.
  • Uploading all data to SQS/S3 and then downloading it again consumes unwanted additional network bandwidth.
  • Third parties impose upload payload size limits.
  • Monitored corporate laptops are off the network, not connected to the internet, or not connected to VPN for an extended period of time. The laptops might later be switched off, but events should be persisted and forwarded whenever the laptop reconnects to the network.
  • Sensitive data should stay, and be persisted, within the network.
  • On-demand persistent queuing on the forwarding tier when the indexer cluster is down.
  • On-demand persistent queuing on the forwarding tier when indexer cluster indexing is slow due to high system load.
  • On-demand persistent queuing on the forwarding tier while the indexer cluster is in a rolling restart.
  • On-demand persistent queuing on the forwarding tier during an indexer cluster upgrade.
  • No need to use the decade-old S2S protocol version, as suggested by some third-party vendors (you all know enableOldS2SProtocol=true in outputs.conf).
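
To make the two-destination case above concrete, here is a toy Python model. It is only a sketch of the behavior described in that bullet, not how Splunk implements tcpout. The `Destination` class, the `forward` function, and counting PQ capacity in events rather than bytes are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Destination:
    """Hypothetical stand-in for one tcpout group (not a Splunk API)."""
    name: str
    up: bool
    pq_capacity: int  # events the on-disk PQ can hold (simplified unit)
    pq_used: int = 0
    sent: int = 0

    def can_accept(self) -> bool:
        # An up destination always accepts; a down one needs free PQ space.
        return self.up or self.pq_used < self.pq_capacity

    def accept(self) -> None:
        if self.up:
            self.sent += 1      # delivered over the network
        else:
            self.pq_used += 1   # persisted to disk for later delivery


def forward(events: int, destinations: list[Destination]) -> int:
    """Clone each event to every destination; stop (block) as soon as
    any destination can no longer accept. Returns events forwarded."""
    forwarded = 0
    for _ in range(events):
        if not all(d.can_accept() for d in destinations):
            break  # the whole output blocks, including healthy destinations
        for d in destinations:
            d.accept()
        forwarded += 1
    return forwarded


healthy = Destination("splunk-group1", up=True, pq_capacity=0)
down = Destination("splunk-group2", up=False, pq_capacity=3)
forwarded = forward(10, [healthy, down])
print(forwarded, healthy.sent, down.pq_used)
```

In this model the healthy destination keeps receiving events until the PQ of the down destination fills up, which is why the PQ of each group should be sized for the longest outage you expect.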




How to enable?

Just set persistentQueueSize for each tcpout group in outputs.conf:

[tcpout:splunk-group1]
persistentQueueSize=1TB
[tcpout:splunk-group2]
persistentQueueSize=2TB

Note: Sizing guide. Run the following SPL:

index=_internal source=*metrics.log* group=tcpin_connections hostname=<all IHF> host=<all idx> | stats sum(kb) as required_pq_in_kb by hostname

Run the above search over the number of days/hours/minutes of data that the PQ needs to hold. required_pq_in_kb is the approximate overall PQ size needed on a given IHF. required_pq_in_kb/parallelIngestionPipelines is the approximate PQ value for each pipeline.

persistentQueueSize = required_pq_in_kb/parallelIngestionPipelines
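
The division above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, rounding up is my own choice to avoid undersizing, and the `KB` unit suffix is an assumption based on the `TB` suffix used in the example stanzas, so check outputs.conf.spec for the units your version accepts.

```python
import math


def pq_size_per_pipeline_kb(required_pq_in_kb: float,
                            parallel_ingestion_pipelines: int) -> int:
    """Split the overall PQ estimate for one IHF across its pipelines.

    required_pq_in_kb is the value returned by the sizing search;
    parallel_ingestion_pipelines is the parallelIngestionPipelines
    setting on that forwarder.
    """
    if parallel_ingestion_pipelines < 1:
        raise ValueError("parallelIngestionPipelines must be at least 1")
    # Round up so the per-pipeline value never undershoots the estimate.
    return math.ceil(required_pq_in_kb / parallel_ingestion_pipelines)


def format_setting(size_kb: int) -> str:
    """Render the outputs.conf line (the KB suffix is an assumption)."""
    return f"persistentQueueSize = {size_kb}KB"


# Example with made-up numbers: the search returned 1,000,000 KB
# for an IHF that runs 2 ingestion pipelines.
per_pipeline = pq_size_per_pipeline_kb(1_000_000, 2)
print(format_setting(per_pipeline))  # persistentQueueSize = 500000KB
```

The numbers in the example are invented for illustration. Use the required_pq_in_kb value that the search returns for each of your own forwarders.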

 
Follow best practices when configuring the tcpout PQ.
