What is an HTTP Event Collector persistent queue?

MedralaG · ‎06-20-2019

In my lab setup, I have a Heavy Forwarder hosted in AWS and an indexer at home that the HF forwards data to.
Every now and then, forwarding gets interrupted because my old router suddenly starts treating the traffic as a SIP attack and drops it. During that time the queue on the HF fills up and it freaks out.

I wanted to make sure that even when this happens I don't lose the data that the HF receives via its HTTP Event Collector, so I've created a 1 GB persistent queue on the HEC input. The connection went down again, but once I got it fixed I did not get the data that I know was generated during that time. Nothing in my indexer. While it was still down I had a look at the SPLUNK_HOME/var/run/splunk/httpin directory, and there was one file, but it contained just a short meaningless string.

When going through the internal logs I did notice this around the time the connection was lost:

 TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds

I'm sure I haven't filled up the persistent queue, so if all ports get stopped when the standard queue gets full, what is the point of the persistent queue?
Or am I doing something wrong here?

hrawat · ‎05-22-2024

A tcpout persistent queue will solve the issue.

If parsingQueue is full because the tcpout queue was full (due to connection issues), splunktcpin shuts its input port, because the splunktcpin queue is also full.

HEC clients will start receiving "server is busy" responses because parsingQueue is full.

A tcpout persistent queue supports all types of inputs and prevents back-pressure on parsingQueue.

https://community.splunk.com/t5/Knowledge-Management/Splunk-Persistent-Queue/m-p/688223#M10063

MedralaG · ‎07-23-2019

@snigdhasaxena
I did see those docs, but they don't really explain why I'm seeing this behaviour.
I have the [queue] maxSize = 500KB by default and the persistent queue set to 1GB, which is larger than the size of the queue. I don't see the queueSize setting in the documentation for inputs.conf. I'm assuming it might be a deprecated setting, but I will put it in my conf anyway and test to see if it makes a difference.
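
For readers following along, the two input-side settings mentioned above could be sketched roughly as follows. This is only an illustration: the stanza name `my_hec_input` is a hypothetical placeholder, the values are simply the ones quoted in this thread, and it assumes that `queueSize` is accepted on this input type, which should be checked against the inputs.conf spec for your Splunk version.

```ini
# inputs.conf on the heavy forwarder (sketch; stanza name is hypothetical)
[http://my_hec_input]
# In-memory queue for this input (assumed to be accepted here)
queueSize = 500KB
# On-disk persistent queue, intended to be used once the in-memory queue is full
persistentQueueSize = 1GB
```

Note that both settings sit on the input side of the pipeline, which is the part this thread is trying to understand when the output toward the indexer is blocked.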

gjanders · ‎07-23-2019

Where are you using the persistent queue setting? It is not supported on splunktcp queues...

MedralaG · ‎07-24-2019

inputs.conf
[http://.....]

snigdhasaxena · ‎07-23-2019

Hi @MedralaG ,

Refer below:

http://dev.splunk.com/view/event-collector/SP-CAAAE6Q
http://docs.splunk.com/Documentation/Splunk/latest/Data/Usepersistentqueues

skalliger · ‎07-23-2019

Please look for other errors around that time.

A blocked queue is just a symptom, not the cause of the issue. When setting a persistent queue in inputs.conf (remember, it's per input), also make sure to increase all queue sizes accordingly (general setting in server.conf). I would not suggest playing around with persistent queues and queue sizes if you don't have much experience with them. If you have a license and thus are entitled to open cases, I'd suggest doing so. Splunk support may then actually have a look at your environment and suggest settings for your queues (or for other problems you might be facing).
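
As a rough sketch of what the general queue setting in server.conf can look like: the 500KB value is the default quoted elsewhere in this thread, while the per-queue override and its 10MB value are arbitrary assumptions for illustration, not recommendations.

```ini
# server.conf (sketch; sizes are illustrative assumptions, not recommendations)
[queue]
# Default maximum size applied to the in-memory pipeline queues
maxSize = 500KB

[queue=parsingQueue]
# Optional per-queue override; 10MB is an arbitrary example value
maxSize = 10MB
```

Larger in-memory queues only buy more time before they block; they do not persist anything to disk.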

Skalli

scmkent · ‎07-22-2019

Take this with a grain of salt, because I'm just here looking for details about the persistent queue myself, but it sounds like you think it is a queue where the heavy forwarder holds onto data it can't send to the indexer. I believe the point of the persistent queue is to hold streaming data (udp/tcp/hec) that the heavy forwarder isn't able to process immediately due to its queues filling up. So it's useful for when your heavy forwarder receives too much data to immediately process, because it caches that data instead of dropping it. I don't think it will work as a cache for data that the heavy forwarder is attempting to send to an indexer though. I believe the default behavior of universal/heavy forwarders is to cache data that cannot be transmitted to an indexer, but your router may be dropping the information without the heavy forwarder knowing it is being dropped.

I think what you want to do is use indexer acknowledgement. Edit your outputs.conf and set useACK=true so that the heavy forwarder would resend data to the indexer when it doesn't receive acknowledgement that it was received. Then I believe it would cache outbound data at the heavy forwarder until you fixed your router.
[tcpout:]
server=, , ...
useACK=true
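
A slightly fuller version of the stanza above might look like the sketch below. The group name `home_indexer`, the host name, the port, and the `maxQueueSize` value are all hypothetical placeholders chosen for illustration; substitute your own.

```ini
# outputs.conf on the heavy forwarder (sketch; names and address are hypothetical)
[tcpout]
defaultGroup = home_indexer

[tcpout:home_indexer]
# Placeholder indexer address and receiving port
server = indexer.example.com:9997
# Resend data that the indexer has not acknowledged
useACK = true
# Size of the in-memory output queue; 64MB is an assumed example value
maxQueueSize = 64MB
```

Keep in mind that this output queue lives in memory, so it can only absorb an outage up to its configured size.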

Here's the splunk doc for indexer ack:
https://docs.splunk.com/Documentation/Splunk/7.3.0/Forwarding/Protectagainstlossofin-flightdata

skalliger · ‎07-23-2019

Persistent queues and useACK are two different kinds of configuration that have nothing to do with each other. Persistent queues are configured in inputs.conf per input; useACK, however, applies to all outgoing data.

MedralaG · ‎07-23-2019

So yes, from what I can read in the docs, persistent queues fill up when the processing pipeline gets filled up. That happens if data is being streamed in faster than the pipeline can process it, but also when the output is failing (for example, when connectivity to the indexer is lost), which is my scenario. I'm not 100% sure at this point. I will try to find some more info on this.

splunker102 · ‎07-18-2022

@MedralaG ,
Greetings!

I am facing similar issues with the HEC persistent queue. I want to store data in the persistent queue when there is an internet outage and data cannot be forwarded. Did you happen to find the root cause? Many thanks in advance!
