Our lightweight forwarder has crashed several times in the last 5 days. The contents of the crash log are included below.
There was a similar discussion, but the problem had to do with the archiver.
Question: can the HTTP thread limit be configured or increased, and if so, in which config file?
[build 143156] 2014-05-16 11:23:02
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 6652 running under UID 501.
Crashing thread: HTTPDispatch
Registers:
RIP: [0x0000003EDBC328A5] gsignal + 53 (/lib64/libc.so.6)
RDI: [0x00000000000019FC]
RSI: [0x0000000000001A00]
RBP: [0x00007F3C3E471538]
RSP: [0x00007F3CDAFFE3C8]
RAX: [0x0000000000000000]
RBX: [0x0000000001960238]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x0000000000000001]
R9: [0x616E7520796C6972]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x00007F3C3E6EBE00]
R13: [0x00007F3CDAFFE570]
R14: [0x00007F3C3E40C268]
R15: [0x00007F3C3E40C2F8]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace:
[0x0000003EDBC328A5] gsignal + 53 (/lib64/libc.so.6)
[0x0000003EDBC34085] abort + 373 (/lib64/libc.so.6)
[0x00000000012E6508] _ZN9__gnu_cxx27__verbose_terminate_handlerEv + 200 (splunkd)
[0x00000000012E61D6] _ZN10__cxxabiv111__terminateEPFvvE + 6 (splunkd)
[0x00000000012E6203] ? (splunkd)
[0x00000000012E6303] ? (splunkd)
[0x0000000000D92E84] ? (splunkd)
[0x0000000000D94EC9] _ZN6ThreadC2EPKcz + 521 (splunkd)
[0x0000000000A574C8] _ZN24HTTPRequestHandlerThreadC1EP10HTTPServerSt4pairIP11TcpPolledFd16FdConnectionDataEP6FdDataP16HTTPServerConfig + 72 (splunkd)
[0x0000000000A577B9] _ZN10HTTPServer22executeThreadedRequestEP6FdData + 281 (splunkd)
[0x0000000000A5B79B] _ZN10HTTPServer19rawTcpDataAvailableEP11TcpPolledFd18PollableDescriptorPKcm + 2091 (splunkd)
[0x0000000000D8FE0A] _ZN12SSLRawDataFd11handleEventEv + 378 (splunkd)
[0x0000000000D90119] _ZN11SSLPolledFd11when_eventsE18PollableDescriptor + 25 (splunkd)
[0x0000000000D2AEF5] _ZN8PolledFd8do_eventEv + 69 (splunkd)
[0x0000000000D2B9DA] _ZN9EventLoop3runEv + 490 (splunkd)
[0x0000000000656BC0] _ZN18HTTPDispatchThread4mainEv + 2656 (splunkd)
[0x0000000000D941E2] _ZN6Thread8callMainEPv + 66 (splunkd)
[0x0000003EDC007851] ? (/lib64/libpthread.so.0)
[0x0000003EDBCE767D] clone + 109 (/lib64/libc.so.6)
Linux / TDCVLOG02 / 2.6.32-279.el6.x86_64 / #1 SMP Fri Jun 22 12:19:21 UTC 2012 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2014-04-01 16:51:07.822 -0700 splunkd started (build 143156)
terminate called after throwing an instance of 'ThreadException'
what(): HTTPDispatch: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 1009 threads active
2014-05-15 14:52:42.118 -0700 splunkd started (build 143156)
terminate called after throwing an instance of 'ThreadException'
what(): HTTPDispatch: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 1009 threads active
2014-05-16 11:17:17.852 -0700 splunkd started (build 143156)
terminate called after throwing an instance of 'ThreadException'
what(): HTTPDispatch: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 1007 threads active
/etc/redhat-release: CentOS release 6.3 (Final)
glibc version: 2.12
glibc release: stable
Threads running: 1007
argv: [splunkd -p 8089 restart]
terminating...
According to your dump, your forwarder had 1007 active threads when it crashed. That is a lot for a forwarder, enough that I suspect something is wrong. I would recommend opening a support case to look more deeply into the issue.
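While waiting on support, it may help to watch how close the running splunkd gets to its thread ceiling. On Linux, `pthread_create` returning "Resource temporarily unavailable" (EAGAIN) usually means a resource limit was hit, and threads count against the per-user "Max processes" limit (RLIMIT_NPROC). That limit applies to all of the user's processes and threads together, so it is an upper bound rather than an exact match for one process. The sketch below is only an illustration, not a Splunk tool: the helper names are made up, and it assumes the standard `/proc/<pid>/status` and `/proc/<pid>/limits` files are readable for the splunkd PID you pass in.

```python
import re
import sys
from pathlib import Path
from typing import Optional


def parse_thread_count(status_text: str) -> Optional[int]:
    """Return the 'Threads:' value from /proc/<pid>/status text, or None."""
    match = re.search(r"^Threads:\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_soft_nproc(limits_text: str) -> Optional[int]:
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None when the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max processes\s+(\S+)\s+(\S+)", limits_text, re.MULTILINE)
    if not match or not match.group(1).isdigit():
        return None
    return int(match.group(1))


def report(pid: int) -> str:
    """Build a one-line summary for the given PID (hypothetical helper)."""
    proc = Path("/proc") / str(pid)
    threads = parse_thread_count((proc / "status").read_text())
    soft = parse_soft_nproc((proc / "limits").read_text())
    return f"pid={pid} threads={threads} soft_max_processes={soft}"


if __name__ == "__main__":
    # Usage: python thread_check.py <splunkd pid>
    print(report(int(sys.argv[1])))
```

If the soft limit turns out to be close to the roughly 1007 threads reported in the dump, the crash is consistent with hitting that limit. That would still leave open the question of why a forwarder is creating that many HTTP threads, which is where a support case helps.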