Universal forwarder high CPU and memory usage

Explorer

I've just installed Splunk Universal Forwarder 4.2.1 on a Linux server. I've pointed it at the whole of /var/log, which amounts to 3220 files, 24 directories and 467MiB of data.

Both CPU and memory usage of the forwarder seem to be way too high. One splunkd process almost continuously uses 100% of one CPU, and that same process is using 525MiB(!) of memory.

I don't see anything pertinent in the splunkd logs. strace of the splunkd process shows it calling futex() and epoll_wait() a lot and not much else...

John.


Re: Universal forwarder high CPU and memory usage

Path Finder

Hi John,

Does the data appear in search via the web GUI?

Which Linux distribution are you using?

The Universal Forwarder seems to grab all available logs and send them to the indexer. If you want to avoid processing old entries, just let it pick up what is new and comes in after starting:

[monitor:///var/log]
followTail = 1
# Determines whether to start monitoring at the beginning of a file or at the end (and then index all events that come in after that)
# If set to 1, monitoring begins at the end of the file (like tail -f).
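
If your version supports it, ignoreOlderThan may be a cleaner way to skip old files entirely - I'm not sure it exists in 4.2.1, so check the inputs.conf spec for your release:

[monitor:///var/log]
# Don't even open files whose modification time is older than 7 days
ignoreOlderThan = 7d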

Re: Universal forwarder high CPU and memory usage

Explorer

Data is entering the index, yes.

CentOS 5.6 x86_64.

I can certainly see if followTail helps, thanks.


Re: Universal forwarder high CPU and memory usage

Explorer

Hmm, followTail doesn't appear to have helped - it's still using an obscene amount of CPU and memory for a "small" log monitoring app. 😞


Re: Universal forwarder high CPU and memory usage

Path Finder

Ok, now I'm hooked ;-). I hope it's not a big thing...

I'd like to see the output of:

$SPLUNK_HOME/bin/splunk diag

You might put it on http://www.ge.tt/


Re: Universal forwarder high CPU and memory usage

Explorer

OK, here's the diag file - I delayed a little as I was unsure what data would be included in it. http://www.mediafire.com/?2wkm2tvzwmkvn47


Re: Universal forwarder high CPU and memory usage

Path Finder

What I can see at a quick glance: etc/apps/learned/local/props.conf (7MB) and sourcetypes.conf (2.7MB) are full of self-learned sourcetypes. I don't know why - maybe during testing (permissions on var/log/nagios/spool/checkresults). I can imagine this causes the high CPU. I have no idea why automatic sourcetype learning is enabled on a Universal Forwarder; it doesn't make sense to me.

Quick fix: delete etc/apps/learned/local/* and add LEARN_SOURCETYPE = false to etc/system/local/props.conf.
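
Spelled out as a stanza (a sketch - I'm assuming the global [default] stanza is the right place to turn learning off; check the props.conf spec for your version):

[default]
# Stop splunkd from auto-learning sourcetypes and writing them
# into etc/apps/learned/local/
LEARN_SOURCETYPE = false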


Re: Universal forwarder high CPU and memory usage

Explorer

Well, I managed to fix the problem by adding a blacklist for /var/log/nagios/archives (397MiB in 3085 files, one new file per day) and /var/log/nagios/spool (constant creation/change of files).
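
Something like this in inputs.conf (a sketch - the exact regex here is approximate):

[monitor:///var/log]
# Skip the huge Nagios archive directory and the constantly-churning spool
blacklist = /var/log/nagios/(archives|spool)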

Somehow seems unsatisfactory that the former needs to be excluded...


Re: Universal forwarder high CPU and memory usage

Explorer

I have several Splunk Universal Forwarders with similarly extreme CPU utilization, which in the end crashes the database cluster... learned/props.conf and sourcetypes.conf only have a few lines each. Below is the inputs.conf of one UF that occasionally shows high CPU utilization.

[fschange:/var/lib/mysql-cluster/config.ini]
pollPeriod = 3600
fullEvent = true

[fschange:/var/lib/mysql-cluster/config.ini.bak]
pollPeriod = 3600
fullEvent = true

[fschange:/etc/my.cnf]
pollPeriod = 3600
fullEvent = true

Is it possible to see what Splunk is actually doing when it is sitting at ~95% CPU?
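
One way to get a rough picture (a sketch, assuming a standard Linux install with splunkd's own metrics log under $SPLUNK_HOME/var/log/splunk):

# Per-thread CPU for the splunkd process - thread activity hints
# at which part of the pipeline is busy
top -H -p $(pgrep -o splunkd)

# splunkd's own accounting: per-processor CPU seconds, logged periodically
grep "group=pipeline" $SPLUNK_HOME/var/log/splunk/metrics.log | tail -20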


Re: Universal forwarder high CPU and memory usage

Explorer

I would love to know the total number of files you have in the directories listed in your inputs.conf file.

I have several AIX Universal Forwarder instances that are very similar data-wise and physically identical. Two of them show very high CPU usage by splunkd (averaging 12 to 14%).

The only difference I can find between them is the number of files in the monitored directories. The total number of files on the high-CPU machines is between 8 and 9 thousand; 6,500 of those are in one directory alone. All the other instances have fewer than 2 thousand files to track, with very low CPU utilization (averaging less than 1%).

I am wondering: what is the maximum number of files one can track before CPU utilization jumps above an average of 8%?
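
A quick way to compare instances (a sketch - /var/log stands in for whatever paths your inputs.conf monitors, and the output format of "splunk list monitor" may differ by version):

# Count files under a monitored path
find /var/log -type f | wc -l

# Ask splunkd itself which files it is currently tracking
$SPLUNK_HOME/bin/splunk list monitor | wc -l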
