<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Forwarding and Indexing Large Files in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362021#M65993</link>
    <description>&lt;P&gt;Anyone have any experience with forwarding and indexing files larger than 200 MB every minute?&lt;BR /&gt;
I'm curious whether any forwarding or indexing processes need tuning to keep up with the new files our app creates each minute.&lt;/P&gt;</description>
    <pubDate>Wed, 15 Nov 2017 00:26:55 GMT</pubDate>
    <dc:creator>rbarajas</dc:creator>
    <dc:date>2017-11-15T00:26:55Z</dc:date>
    <item>
      <title>Forwarding and Indexing Large Files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362021#M65993</link>
      <description>&lt;P&gt;Anyone have any experience with forwarding and indexing files larger than 200 MB every minute?&lt;BR /&gt;
I'm curious whether any forwarding or indexing processes need tuning to keep up with the new files our app creates each minute.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2017 00:26:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362021#M65993</guid>
      <dc:creator>rbarajas</dc:creator>
      <dc:date>2017-11-15T00:26:55Z</dc:date>
    </item>
    <item>
      <title>Re: Forwarding and Indexing Large Files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362022#M65994</link>
      <description>&lt;P&gt;Hi rbarajas,&lt;BR /&gt;
I have experienced this situation, and the only way to avoid problems is a really good infrastructure: fast storage (at least 1200 IOPS), plenty of CPUs on the indexers, and a good network. Indexing large files can build up long indexing queues that slow down the whole system.&lt;BR /&gt;
About the network: Splunk limits the forwarder's bandwidth use by default. If you don't have a tight network constraint you could raise that limit and send data faster, but I usually don't do this!&lt;BR /&gt;
A second consequence is that you may not get truly near-real-time monitoring, because there can be a delay in indexing: this matters for real-time searches and accelerations.&lt;/P&gt;
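&lt;P&gt;For example, the universal forwarder caps its outbound throughput at 256 KBps by default. A sketch of raising or removing that cap in limits.conf (the stanza and setting are standard; the value is only an example):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# limits.conf on the universal forwarder
[thruput]
# 0 removes the cap; a fixed value such as 10240 (10 MB/s) raises it
# while still protecting the network
maxKBps = 0
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;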
Bye.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2017 09:55:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362022#M65994</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2017-11-15T09:55:39Z</dc:date>
    </item>
    <item>
      <title>Re: Forwarding and Indexing Large Files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362023#M65995</link>
      <description>&lt;P&gt;I'm blessed with a good network and robust indexers, so the universal forwarder on the endpoint is usually my bottleneck. Watching the local splunkd.log file closely shows you where the forwarder is struggling.&lt;/P&gt;

&lt;P&gt;This post ended up being a major help during a recent large log-source onboarding: &lt;A href="https://answers.splunk.com/answers/38218/universal-forwarder-parsingqueue-kb-size.html"&gt;https://answers.splunk.com/answers/38218/universal-forwarder-parsingqueue-kb-size.html&lt;/A&gt;&lt;/P&gt;
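
&lt;P&gt;Roughly what that post describes, as a sketch: grow the forwarder's in-memory queues so bursts don't stall the pipeline. The stanzas and settings are standard, but the sizes here are illustrative, not recommendations:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# server.conf on the universal forwarder
[queue=parsingQueue]
maxSize = 6MB

# outputs.conf -- give the output queue room to absorb bursts
[tcpout]
maxQueueSize = 64MB
&lt;/CODE&gt;&lt;/PRE&gt;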

&lt;P&gt;I also learned during that onboarding to uncompress large log files before handing them off to the UF. The UF can handle some on-the-fly decompression and parsing, but past a certain point it falls behind. We were dealing with compressed text files 1-3 million lines long, though, which is not a typical log situation.&lt;/P&gt;
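
&lt;P&gt;If you go that route, one hypothetical setup is to unpack archives into a spool directory outside Splunk and point a batch input at it; "sinkhole" deletes each file once it has been indexed. The paths, index, and sourcetype below are made up for illustration:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# decompress ahead of the UF, e.g. from cron:
#   gunzip -c /data/incoming/app.log.gz &gt; /data/spool/app-$(date +%s).log

# inputs.conf on the UF
[batch:///data/spool]
move_policy = sinkhole
index = app_logs
sourcetype = app:large
&lt;/CODE&gt;&lt;/PRE&gt;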

&lt;P&gt;Best,&lt;/P&gt;

&lt;P&gt;David&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2017 13:46:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Forwarding-and-Indexing-Large-Files/m-p/362023#M65995</guid>
      <dc:creator>djl</dc:creator>
      <dc:date>2017-11-15T13:46:30Z</dc:date>
    </item>
  </channel>
</rss>

