I am having a problem with Splunk 4.3.3: for one reason or another, its memory usage rises in a constant, linear fashion.
The Splunk instance runs on RHEL 5 (2.6.18-238.el5) with 4 GB of RAM and 2 CPU cores; it is also worth noting that it is a VM. There are no searches running and the CPU is idle. I do not have an excessive volume of logs being ingested: 21 folders are monitored for logs, with the followTail option enabled to prevent re-indexing, and 5 Windows machines send logs via TCP using the Universal Forwarder. The forwarded logs are separated into 5 different indexes and are searchable and handled fine.
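For reference, the relevant inputs look roughly like the sketch below. The path, port, and index name are placeholders rather than my actual configuration, and the SSL receiver stanza is an assumption on my part based on the TcpInputFd errors further down.

inputs.conf (sketch):

[monitor:///var/log/app1]
followTail = 1
index = main

[splunktcp-ssl:9997]
# the 5 Windows Universal Forwarders send to this SSL-enabled receiving port
# (port number is a placeholder)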
I have monitored the system via top, and it shows no memory allocated to cache or buffers; for one reason or another, usage just continues to rise. Over the span of a few hours the OOM killer is invoked and starts killing processes to try to free memory, but eventually all memory, including swap, is consumed and the system crashes. I have tried disabling ALL indexes and file monitoring (except main). Originally I was running 4.3.2 and upgraded to 4.3.3 to see if it would resolve the memory leak.
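To quantify the growth rate, a simple loop like the one below can log splunkd's resident size over time (the output path is arbitrary; this is a sketch, not something already in place):

# Sample splunkd memory every 60 seconds so the per-process growth is visible
while true; do
    { date; ps -C splunkd -o pid,rss,vsz,args; } >> /tmp/splunkd_mem.log
    sleep 60
done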
splunkd.log:
08-25-2012 08:10:34.142 +0000 ERROR TcpInputFd - SSL Error for fd from HOST:X.X.X.X, IP:X.X.X.X, PORT:57483
08-25-2012 08:26:45.668 +0000 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:32
08-25-2012 08:27:29.414 +0000 INFO PipelineComponent - MetricsManager:probeandreport() took longer than seems reasonable (4009562 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:28:25.020 +0000 ERROR TcpInputFd - SSL Error = error:00000000:lib(0):func(0):reason(0)
08-25-2012 08:28:36.526 +0000 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
08-25-2012 08:28:44.876 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (17956 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:29:16.211 +0000 ERROR TcpInputFd - SSL Error for fd from HOST:X.X.X.X, IP:X.X.X.X, PORT:57484
08-25-2012 08:29:33.534 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (14323 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:30:36.464 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (20315 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:31:26.786 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (17646 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:32:16.086 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (12923 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:33:36.634 +0000 INFO PipelineComponent - HTTPAuthManager:timeoutCallback() took longer than seems reasonable (22673 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:34:05.872 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (13138 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:34:28.833 +0000 INFO PipelineComponent - IndexProcessor:ipCallback() took longer than seems reasonable (10159 milliseconds) in callbackRunnerThread. Might indicate hardware or splunk limitations.
08-25-2012 08:46:57.738 +0000 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:32
08-25-2012 08:47:34.291 +0000 ERROR TcpInputFd - SSL Error = error:00000000:lib(0):func(0):reason(0)
08-25-2012 08:47:40.562 +0000 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
08-25-2012 08:48:25.859 +0000 ERROR TcpInputFd - SSL Error for fd from HOST:X.X.X.X, IP:X.X.X.X, PORT:57485
08-25-2012 08:54:42.218 +0000 FATAL ProcessRunner - Unexpected EOF from process runner child!
08-25-2012 08:56:36.406 +0000 ERROR ProcessRunner - helper process seems to have died (child killed by signal 9: Killed)!
rsyslog messages:
Aug 25 08:31:41 server kernel: Node 0 HighMem: empty
Aug 25 08:31:41 server kernel: 1594 pagecache pages
Aug 25 08:31:41 server kernel: Swap cache: add 8929081, delete 8928541, find 1593018/3379335, race 7+4812
Aug 25 08:31:41 server kernel: Free swap = 4941840kB
Aug 25 08:31:41 server kernel: Total swap = 5245212kB
Aug 25 08:31:41 server kernel: Free swap: 4941840kB
Aug 25 08:31:41 server kernel: 1310720 pages of RAM
Aug 25 08:31:41 server kernel: 299836 reserved pages
Aug 25 08:31:41 server kernel: 5504 pages shared
Aug 25 08:31:41 server kernel: 543 pages swap cached
Aug 25 08:31:41 server kernel: Out of memory: Killed process 21471, UID 1502, (scanner).
Aug 25 08:31:41 server kernel: splunkd invoked oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
The only other activity on the box is log rotation by auditd. Could any guest OS settings cause Splunk to leak memory like this? Any ideas on how to resolve it would be greatly appreciated.
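In case it helps, these are the guest-level settings I can check; the "splunk" user name and the VMware balloon module are assumptions on my part:

# Overcommit policy and swappiness affect how the guest behaves under memory pressure
sysctl vm.overcommit_memory vm.overcommit_ratio vm.swappiness
# Resource limits for the account running splunkd (assumes the user is "splunk")
su - splunk -c 'ulimit -a'
# If the hypervisor is VMware, the balloon driver (vmmemctl) can reclaim guest memory
lsmod | egrep -i 'balloon|vmmemctl'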