We have built a custom application which collects data from other devices and builds a string in a Splunk-friendly format.
We are considering using the Splunk Universal Forwarder to deliver the data to our Splunk Enterprise instance.
My question:
If the Splunk Universal Forwarder for some reason can't reach the indexers (e.g. a closed firewall port or lost network connection), for how long (or how much data) will data be kept in the output queue?
I have found this setting in $SPLUNK_HOME/etc/system/default/server.conf:
[queue]
maxSize = 500KB
# look back time in minutes
cntr_1_lookback_time = 60s
cntr_2_lookback_time = 600s
cntr_3_lookback_time = 900s
# sampling interval is the same for all the counters of a particular queue
# and defaults to 1 sec
sampling_interval = 1s
However, testing showed that more than 1 MB of data was kept in the queue when the link was restored.
Can anybody point me in the right direction to find more information on this?
Any help would be appreciated.
If a universal forwarder loses contact with its indexer(s), it will buffer events in memory until it can reach an indexer again. If the memory buffer fills up, the forwarder will write events to disk. That is why you saw more data buffered than was allowed for in memory.
Hi Rich
Thanks for your answer, that's good information, but I would like to read some more detailed information on this.
The reason is that too many disk writes could be a big issue, because this system runs from a flash drive.
Therefore I would like to disable that feature.
I haven't been able to find the documentation on this, so if you could point me in the right direction it would be much appreciated.
Thanks in advance.
//Jesper S
I suggest you make your in-memory queue as large as possible to avoid writing to disk. Then consider using the dropEventsOnQueueFull attribute. You can read about it in the Admin manual (http://docs.splunk.com/Documentation/Splunk/6.3.3/Admin/Outputsconf).
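A sketch of what that could look like in outputs.conf (the values here are purely illustrative, not recommendations; size the queue to your available memory and see the outputs.conf spec for details):

```ini
# $SPLUNK_HOME/etc/system/local/outputs.conf
[tcpout]
defaultGroup = backend
# Enlarge the in-memory output queue so it fills more slowly
# when the indexers are unreachable (10MB is an illustrative value).
maxQueueSize = 10MB
# Wait 60 seconds after the queue fills, then drop new events
# rather than blocking (illustrative value; see the docs below).
dropEventsOnQueueFull = 60
```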
Thanks again.
Please see my outputs.conf (used for testing):
[tcpout]
defaultGroup = backend
maxQueueSize = 512KB
dropEventsOnQueueFull = 0
[tcpout:backend]
server = splunk.internal:9997
[tcpout-server://splunk.internal:9997]
Did some further testing with these settings, but the results didn't change.
Also, I don't understand the part of the documentation marked in bold:
dropEventsOnQueueFull = <integer>
* If set to a positive number, wait <integer> seconds before throwing out
  all new events until the output queue has space.
* **Setting this to -1 or 0 will cause the output queue to block when it gets
  full, causing further blocking up the processing chain.**
* If any target group's queue is blocked, no more data will reach any
  other target group.
* Using auto load-balancing is the best way to minimize this condition,
  because, in that case, multiple receivers must be down (or jammed up)
  before queue blocking can occur.
* Defaults to -1 (do not drop events).
* DO NOT SET THIS VALUE TO A POSITIVE INTEGER IF YOU ARE MONITORING FILES!
A blocked processing chain means the forwarder will not read its inputs until the output queue has space.
Did you restart the forwarder after changing outputs.conf?
So what does
Defaults to -1 (do not drop events)
mean?
Yes, I did restart the forwarder after the change, before testing.
The default value of -1 means to write events to disk rather than discard them. It's the behavior you are experiencing.
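If avoiding writes to the flash drive is the priority, the implication is that a positive dropEventsOnQueueFull value would sacrifice events rather than spill them to disk. A sketch of your test outputs.conf adjusted accordingly (the 30-second value is illustrative, and this trades data loss for fewer disk writes):

```ini
# $SPLUNK_HOME/etc/system/local/outputs.conf
[tcpout]
defaultGroup = backend
maxQueueSize = 512KB
# With the default of -1 (or 0), a full queue blocks the processing
# chain and events are buffered to disk. A positive value instead
# waits that many seconds and then drops new events.
dropEventsOnQueueFull = 30

[tcpout:backend]
server = splunk.internal:9997
```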
BTW, if you find the documentation to be confusing or lacking in any way, submit a comment at the bottom of the on-line page. The Splunk documentation people are very responsive.
Thanks for your help