In my lab setup, I have a heavy forwarder (HF) hosted in AWS that forwards data to an indexer at home.
Every now and then, forwarding gets interrupted because my old router suddenly starts treating the traffic as a SIP attack and drops it. During that time the queue on the HF fills up and it freaks out.
I wanted to make sure I don't lose the data that the HF receives via its HTTP Event Collector (HEC) while this happens, so I created a 1 GB persistent queue on the HEC input. The connection went down again, but once I got it fixed, the data that I know was generated during that time never arrived. Nothing in my indexer. While it was still down, I had a look at the SPLUNK_HOME/var/run/splunk/httpin directory, and there was one file, but it contained just a short, meaningless string.
When going through the internal logs I did notice this around the time the connection was lost:
 TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds
I'm sure I haven't filled up the persistent queue, so if all ports get stopped when the standard queue gets full, what is the point of the persistent queue?
Or am I doing something wrong here?
 
A tcpout persistent queue will solve the issue.
If the parsingQueue is full because the tcpout queue is full (due to connection issues), splunktcpin shuts down its input port, because the splunktcpin queue is also full.
HEC clients will start receiving "server is busy" responses because the parsingQueue is full.
A tcpout persistent queue supports all types of inputs and prevents back-pressure on the parsingQueue.
https://community.splunk.com/t5/Knowledge-Management/Splunk-Persistent-Queue/m-p/688223#M10063
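The 300 seconds in the TcpInputProc message quoted in the question appears to correspond to the stopAcceptorAfterQBlock setting for splunktcp inputs, which defaults to 300. A minimal sketch, assuming the HF has a splunktcp input and using an arbitrary illustrative value; raising it only delays the port shutdown and does nothing to drain the blocked queues:

```
# inputs.conf on the heavy forwarder (sketch; the value is illustrative)
[splunktcp]
# Seconds the input queue may stay blocked before the splunktcp port is closed.
# The default of 300 matches the "blocked for more than 300 seconds" message.
stopAcceptorAfterQBlock = 600
```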
 
@snigdhasaxena 
I did see those docs, but they don't really explain why I'm seeing this behaviour.
I have [queue] maxSize = 500KB by default and the persistent queue set to 1GB, which is larger than the size of the in-memory queue. I don't see the queueSize setting in the documentation for inputs.conf. I'm assuming it might be a deprecated setting, but I will put it in my conf anyway and test to see if it makes a difference.
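For reference, the combination described here would look roughly like the following. This is a sketch only: the stanza name my_hec_input is hypothetical, and it assumes the HEC input honors queueSize and persistentQueueSize the same way network inputs do. The 500KB value mirrors the default queue size mentioned above, and 1GB is the persistent queue size from the original question.

```
# inputs.conf (sketch; "my_hec_input" is a hypothetical stanza name)
[http://my_hec_input]
# In-memory input queue; set here to the 500KB default for illustration.
queueSize = 500KB
# On-disk queue intended to take over once the in-memory queue is full.
persistentQueueSize = 1GB
```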
 
		Where are you using the persistent queue setting? It is not supported on splunktcp queues...
 
inputs.conf
[http://.....]
 
Please look for other errors around that time.
A blocked queue is just a symptom, not the cause of the issue. When setting a persistent queue in inputs.conf (remember, it's per input), also make sure to increase all queue sizes accordingly (general setting in server.conf). I would not suggest playing around with persistent queues and queue sizes if you don't have much experience with them. If you have a license and are thus entitled to open cases, I'd suggest doing so. Splunk support may then actually have a look at your environment and suggest settings for your queues (or identify other problems you might be facing).
Skalli
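As a rough illustration of the server.conf side, queue sizes can be set globally in the [queue] stanza or per queue with [queue=<queueName>]. The sizes below are arbitrary placeholders, not recommendations, and larger in-memory queues increase memory use on the forwarder:

```
# server.conf (sketch; sizes are placeholders, not tuned values)
[queue]
# Default maximum size for every in-memory queue (500KB if unset).
maxSize = 10MB

[queue=parsingQueue]
# Override for a single named queue.
maxSize = 20MB
```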
Take this with a grain of salt, because I'm just here looking for details about the persistent queue myself, but it sounds like you think it is a queue where the heavy forwarder holds onto data it can't send to the indexer. I believe the point of the persistent queue is to hold streaming data (udp/tcp/hec) that the heavy forwarder isn't able to process immediately due to its queues filling up. So it's useful for when your heavy forwarder receives too much data to immediately process, because it caches that data instead of dropping it. I don't think it will work as a cache for data that the heavy forwarder is attempting to send to an indexer though. I believe the default behavior of universal/heavy forwarders is to cache data that cannot be transmitted to an indexer, but your router may be dropping the information without the heavy forwarder knowing it is being dropped.
I think what you want to do is use indexer acknowledgement. Edit your outputs.conf and set useACK=true so that the heavy forwarder resends data to the indexer when it doesn't receive an acknowledgement that the data was received. Then I believe it would cache outbound data at the heavy forwarder until you fix your router.
[tcpout:]
server=, , ...
useACK=true
Here's the splunk doc for indexer ack:
https://docs.splunk.com/Documentation/Splunk/7.3.0/Forwarding/Protectagainstlossofin-flightdata
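A filled-in version of that stanza might look like the following. The group name, host, and port are hypothetical. maxQueueSize is shown only because, with useACK enabled, the forwarder keeps unacknowledged data in an in-memory wait queue sized relative to it; being in memory, that buffer does not survive a forwarder restart.

```
# outputs.conf (sketch; group name and server address are hypothetical)
[tcpout:home_indexer]
server = indexer.example.com:9997
# Resend data until the indexer acknowledges it.
useACK = true
# In-memory output queue size; the value is illustrative.
maxQueueSize = 10MB
```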
 
Persistent queues and useACK are two different kinds of configuration that have nothing to do with each other. Persistent queues are configured per input in inputs.conf, whereas useACK applies to all outgoing data.
 
So yes, from what I can read in the docs, persistent queues fill up when the processing pipeline gets filled up. That happens when data is streamed in faster than the pipeline can process it, but also when the output is failing (for example, when connectivity to the indexer is lost), which is my scenario. I'm not 100% sure at this point. I will try to find some more info on this.
@MedralaG,
Greetings!
I am facing similar issues with the HEC persistent queue. I want to store data in the persistent queue when there is an internet outage and data cannot be forwarded. Did you happen to find the root cause? Many thanks in advance!
