Getting Data In

Lots of log files, how can I reduce forwarder memory usage?

andyk
Path Finder

The forwarder is using 4.3 GB of memory. I think that is insane.
OS: Windows 2008 R2
Splunk 4.2.3

The folder I am monitoring contains 11156 files in 699 folders. The total amount of log file data is 7.5 GB.

The forwarder is configured as a "full forwarder" since I need to send data to 2 different indexers.

[monitor://E:\Data\pnlog]
host = 10.41.10.13
index = main
whitelist = \.log$
disabled = 0
followTail = 1
_TCP_ROUTING = pnlogGroup

// Andreas

hexx
Splunk Employee

One of the first things I would suggest is to use ignoreOlderThan in inputs.conf to keep splunkd from iterating through files whose modification time is older than a given time window:

ignoreOlderThan = <time window>
  * Causes the monitored input to stop checking files for updates if their modtime has passed this threshold.
  This improves the speed of file tracking operations when monitoring directory hierarchies with large numbers of historical files (for example, when active log files are colocated with old files that are no longer being written to).
  * As a result, do not select a cutoff that could ever occur for a file you wish to index.
  Take downtime into account!  
  Suggested value: 14d , which means 2 weeks
  * A file whose modtime falls outside this time window when seen for the first time will not be indexed at all.
  * Value must be: <number><unit> (e.g., 7d is one week).  Valid units are d (days), m (minutes), and s (seconds).
  * Default: disabled.
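
Applied to the stanza from the question, this could look like the sketch below. The 14d value is only the suggested value quoted in the spec text above, not something tuned to this environment; the window should be longer than any downtime you expect.

```ini
[monitor://E:\Data\pnlog]
host = 10.41.10.13
index = main
whitelist = \.log$
disabled = 0
followTail = 1
_TCP_ROUTING = pnlogGroup
# Assumption: 14d is the spec's suggested value, not a measured one.
# Files whose modtime is older than this window are no longer checked for updates.
ignoreOlderThan = 14d
```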

Beyond that, you would need to use external means to further restrict the number of files that are exposed to splunkd so that it doesn't have to create and maintain a large number of objects in memory. Note that using a whitelist or blacklist in inputs.conf to exclude some files from indexing still exposes them to splunkd for evaluation, which contributes to its resource consumption.

If your directory structure and log file distribution allow it, try to define one file monitor stanza per directory that contains logs to follow (up to 50 or so is reasonable) and set recursive = false in inputs.conf to scope the tailing processor to those directories only.
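
As a rough sketch, per-directory stanzas might look like the following. The subdirectory names app1 and app2 are hypothetical placeholders, since the question does not list the actual folders; the remaining settings are copied from the stanza in the question.

```ini
# Hypothetical subdirectories: replace app1/app2 with the folders that actually receive live logs.
[monitor://E:\Data\pnlog\app1]
host = 10.41.10.13
index = main
whitelist = \.log$
disabled = 0
# Do not descend into subdirectories of this folder.
recursive = false
_TCP_ROUTING = pnlogGroup

[monitor://E:\Data\pnlog\app2]
host = 10.41.10.13
index = main
whitelist = \.log$
disabled = 0
recursive = false
_TCP_ROUTING = pnlogGroup
```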

This advice is of course irrelevant if you actually have live logs you want to index in all 699 directories.

UPDATE: After discussing this with one of our developers, it turns out that ignoreOlderThan doesn't prevent splunkd from creating an in-memory object for every file the tailing processor sees at startup; it just sets some aside, never to be queried from disk again. As a result, ignoreOlderThan will have a positive effect on splunkd CPU usage but most likely not on memory usage.

rickybails
Observer

It might be over 8 years later and we're now on Splunk 7.2, but we've just had a big issue with the splunkd process consuming a lot of memory (1 GB). I just want to let people know that setting ignoreOlderThan DID have a huge impact on memory (it dropped to a quarter of what it was).

In our case Splunk was monitoring a directory that contained over 200k files. A cleanup job was the answer, but before that could be implemented we set ignoreOlderThan = 2d to bring the server memory down from a critical level. CPU usage also came down.

andyk
Path Finder

I have noticed a strange thing that I have described here:
http://splunk-base.splunk.com/answers/32999/clean-the-crc-database

hexx
Splunk Employee

I believe that SPL-42854 would only be relevant if the queues on the forwarder are full. Also, it can only account for up to 400 MB or so of excess memory usage.

@andyk: there's an important question I forgot to ask: what metric are you looking at when you report that 4.3 GB memory usage figure? Is this RSS/working set size or virtual size?

Ayn
Legend

I have no experience with your situation, but I just wanted to mention that the latest Splunk release (4.2.4) addresses an issue with very high memory usage:

Faulty TCP connection causes very high memory usage on the Windows universal forwarder. (SPL-42854) 

I've no idea whether this fixes your issue or not, but it might be worth a try.

andyk
Path Finder

The CPU usage is about the same. It has to be this input that causes the high memory usage. It is the only input that differs from other forwarders that have normal memory usage, 200-300 MB.

hexx
Splunk Employee

That's disappointing. Do you know if the CPU usage of splunkd has gone down since you introduced ignoreOlderThan? Also, are there other inputs defined on this forwarder that could be responsible for the memory usage?

andyk
Path Finder

I added ignoreOlderThan = 3d and restarted the forwarder.
Sadly, it didn't help at all. The forwarder is now right back at 4.3 GB memory usage. I did a search and there are only 813 files in 73 directories that are younger than 3 days.

andyk
Path Finder

Thank you for this very valuable information. I will try to implement your solutions right now! Too bad Splunk support couldn't give me this information when I contacted them...
