How to improve universal forwarder performance

stamstam · ‎01-06-2019

Hi all,
We forward about 300 GB per day from a single forwarder instance to an indexer cluster.
The forwarder is on a strong machine (24 cores, 130 GB RAM, SSD), and we already configured limits.conf and server.conf for "unlimited" thruput (800MB/s) and parallelPipelines=2. We also increased the size of the parsing queue and structuredParsingQueue.
Normally we do OK, but sometimes, due to some kind of problem, data piles up on the machine and then the forwarder can't seem to close the gap.
The problem is that the universal forwarder lists all of the files first, and only then forwards them to the indexers. What I want is for it to be lazy: to list files and forward them at the same time.
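
For readers following along, the settings described above might look roughly like the fragment below. This is a sketch, not the poster's actual files: the queue size is a placeholder, and in server.conf the pipeline setting is spelled parallelIngestionPipelines.

```ini
# limits.conf
[thruput]
# 800 MB/s expressed in KB/s; a value of 0 removes the limit entirely
maxKBps = 819200

# server.conf
[general]
parallelIngestionPipelines = 2

[queue=parsingQueue]
# placeholder size, not the value used in this thread
maxSize = 10MB
```
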

If there is any config field I haven't changed yet that would help, it would be great to know.
Also, if there is anyone from Splunk reading this, you should know that this feature will drastically improve the universal forwarder and will change our life forever.
Thanks in advance,

MuS · ‎01-06-2019

Hi stamstam,

Some time ago a customer had a similar problem when they used a universal forwarder on a 2-CPU server to read over 10 million directories. Long story short: because it took the UF too long to list all the directories, the actual log files never got read, because by then they had hit the ignoreOlderThan option that was set.

The main reason for this is that the universal forwarder uses a so-called breadth-first search to discover files and directories (technical details can be found here: https://en.wikipedia.org/wiki/Breadth-first_search). The Wikipedia page has an easy-to-understand image of the process.

The cause of this behaviour was the use of a single wildcard monitor stanza on the top-level directory.
The solution was to use very specific monitor stanzas instead. Performance increased immediately, and the customer never had a similar issue again.
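
As a rough illustration of that change (the paths are hypothetical, not the customer's): a single recursive stanza on the top-level directory makes the forwarder walk everything beneath it, while specific stanzas limit the walk to the directories that actually hold logs.

```ini
# inputs.conf

# Before: one broad stanza; "..." recurses through every level under /data
# [monitor:///data/.../*.log]
# ignoreOlderThan = 7d

# After: one stanza per known log location
[monitor:///data/app1/logs/*.log]
ignoreOlderThan = 7d

[monitor:///data/app2/logs/*.log]
ignoreOlderThan = 7d
```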

Hope this helps ...

cheers, MuS

stamstam · ‎01-06-2019

We already increased the limit on open file descriptors. We also use a batch input, and we can't have a more specific root directory.
We have many, many sub-directories and more are added constantly. We can't write an input stanza for each one.
Will it actually increase our performance? In the end it will list the same folders and files.

MuS · ‎01-09-2019

Okay, this sounds like a conceptual/design problem here rather than a real Splunk problem.

There are options that could help you increase the performance of the universal forwarder, such as splitting the input stanzas so that each one is more specific about its path, and adding a specific sourcetype and index name to each stanza. Using just one single input stanza for such an amount of data brings in a lot of other concerns: is all the data going into the same index, and if not, who is doing the parsing and rewriting the index name? This can be handled by the UF in a very efficient way.
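
A minimal sketch of what that could look like in inputs.conf. The paths, index names, and the second sourcetype are made up for illustration, and the indexes would have to exist on the indexers.

```ini
# inputs.conf
[monitor:///data/web/access/*.log]
index = web
sourcetype = access_combined

[monitor:///data/billing/app/*.log]
index = billing
sourcetype = billing:app
```

With the index and sourcetype assigned per stanza on the forwarder, nothing downstream has to rewrite them.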

Also, by using a batch stanza you rely on the IOPS performance of the disk, because each file is deleted after it has been read, which adds wait time to the overall processing.
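
For context, a batch input of the kind discussed here is normally written with move_policy = sinkhole, which is what makes the forwarder delete each file once it has been read. The path below is hypothetical.

```ini
# inputs.conf
[batch:///data/incoming]
# sinkhole: the file is deleted after it has been read
move_policy = sinkhole
```

A monitor stanza on the same path would leave the files in place, at the cost of having to clean them up some other way.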

The current concept makes it really hard for the universal forwarder to perform at its best, and adding more and more data will not make it work better.

Try to question if this is the best way for the universal forwarder to read the amount of data you produce and then change the way the data gets ingested.

cheers, MuS

richgalloway · ‎01-07-2019

Making your monitor paths as specific as possible by pushing wildcards to the end will help the forwarder spend less time looking for files and more time forwarding them.
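
To illustrate with hypothetical paths: the inputs.conf documentation describes the forwarder as monitoring the path up to the first wildcard and treating the rest as a filter, so an early wildcard means a much larger tree to walk.

```ini
# inputs.conf

# Wildcard early: the walk starts at /data
# [monitor:///data/*/logs/app.log]

# Wildcard at the end: the walk starts at /data/app1/logs
[monitor:///data/app1/logs/*.log]
```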

---
If this reply helps you, Karma would be appreciated.

ddrillic · ‎01-06-2019

Right, we had similar cases when running the forwarder on HDFS with a huge number of files. ulimit -n should be very high. Btw, how large is your parsing queue? See the thread "Universal Forwarder ParsingQueue KB Size".
