Monitoring Splunk

My Splunk instance crashed. The crash log is below. Need help.

ankurborah
Path Finder

[build 5a7a840afcb3] 2019-02-08 05:52:35
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 38428 running under UID 501.
Crashing thread: IndexerService
Registers:
RIP: [0x00007FE4F199D635] gsignal + 53 (libc.so.6 + 0x32635)
RDI: [0x000000000000961C]
RSI: [0x0000000000009644]
RBP: [0x00007FE4F1CF9D98]
RSP: [0x00007FE4E89FDA38]
RAX: [0x0000000000000000]
RBX: [0x00007FE4DFE2E000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x000000000000000A]
R9: [0x00007FE4E89FE700]
R10: [0x0000000000000008]
R11: [0x0000000000000202]
R12: [0x00007FE4DFE3D000]
R13: [0x00007FE4E89FDBF0]
R14: [0x00007FE4E89FDBE0]
R15: [0x00007FE4E89FDD20]
EFL: [0x0000000000000202]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]

OS: Linux
Arch: x86-64

Backtrace (PIC build):
Linux / AUSYDSPLUNK6 / 2.6.32-431.29.2.el6.x86_64 / #1 SMP Tue Sep 9 21:36:05 UTC 2014 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2018-12-05 08:48:24.365 +1000 Interrupt signal received
Dying on signal #15 (si_code=0), sent by PID 22929 (UID 501)
2018-12-05 08:48:42.042 +1000 splunkd started (build 00895e76d346)
2018-12-05 10:47:05.323 +1000 Interrupt signal received
Dying on signal #15 (si_code=0), sent by PID 22929 (UID 501)
2018-12-05 10:47:30.058 +1000 splunkd started (build 00895e76d346)
2018-12-13 15:17:05.613 +1000 Interrupt signal received
Dying on signal #15 (si_code=0), sent by PID 62597 (UID 0)
2018-12-13 15:18:09.431 +1000 splunkd started (build 5a7a840afcb3)
terminate called after throwing an instance of 'ThreadException'
what(): IndexerService: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 32492 threads active

/etc/redhat-release: CentOS release 6.5 (Final)
glibc version: 2.12
glibc release: stable
Last errno: 12
Threads running: 32491
Runtime: 4890866.005744s
argv: [splunkd -p 8089 start]
Regex JIT disabled due to SELinux

using CLOCK_MONOTONIC
Thread: "IndexerService", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7fe4e9c47010:
00000000 00 e7 9f e8 e4 7f 00 00 |........|
00000008

First 512 bytes of Timeout object @0x7fe4e9c47140:
00000000 b8 91 79 f5 e4 7f 00 00 00 00 00 00 00 00 00 00 |..y.............|
00000010 f0 db 9f e8 e4 7f 00 00 00 00 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 dd 81 57 00 00 00 00 00 |..........W.....|
00000030 2d 06 ae 04 00 00 00 00 e2 03 00 00 00 00 00 00 |-...............|
00000040 dd 81 57 00 00 00 00 00 65 56 b7 04 00 00 00 00 |..W.....eV......|
00000050 de 03 00 00 00 00 00 00 e8 03 00 00 00 00 00 00 |................|
00000060 f0 91 79 f5 e4 7f 00 00 b8 71 c4 e9 e4 7f 00 00 |..y......q......|
00000070 0e 00 00 00 00 00 00 00 49 6e 64 65 78 65 72 53 |........IndexerS|
00000080 65 72 76 69 63 65 00 00 bf 81 66 ce 6d ee 70 3f |ervice....f.m.p?|
00000090 dd 81 57 00 00 00 00 00 3c 32 d8 04 00 00 00 00 |..W.....<2......|
000000a0 01 00 00 00 00 00 00 00 00 d6 11 7e 03 00 00 00 |...........~....|
000000b0 d2 d3 8f e6 3f 53 d2 3d 00 00 00 00 00 00 00 00 |....?S.=........|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000e0 31 00 00 00 00 00 00 00 c0 e3 c1 e9 e4 7f 00 00 |1...............|
000000f0 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 00 00 00 00 00 00 00 50 dd 9f e8 e4 7f 00 00 |........P.......|
00000110 00 e7 9f e8 e4 7f 00 00 00 00 00 00 00 00 00 00 |................|
00000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000140 00 00 00 00 00 00 00 00 80 72 c4 e9 e4 7f 00 00 |.........r......|
00000150 01 00 00 00 00 00 00 00 00 e7 9f e8 e4 7f 00 00 |................|
00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000190 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001c0 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001f0 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000200
_when = 5734877.078513709, _initialInterval = 0.994

x86 CPUID registers:
0: 0000000F 756E6547 6C65746E 49656E69
1: 000306F2 09080800 FEDA3203 1F8BFBFF
2: 76036301 00F0B5FF 00000000 00C10000
3: 00000000 00000000 00000000 00000000
4: 1C000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000000 00000000 00000000 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 00000000 00000000 00000000 00000000
B: 00000000 00000000 000000FD 00000009
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
E: 00000000 00000000 00000000 00000000
F: 00000000 00000000 00000000 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000021 2C100800
80000002: 65746E49 2952286C 6F655820 2952286E


ddrillic
Ultra Champion

Runtime: 4890866.005744s, and 4890866 / 60 / 60 / 24 ≈ 56.6 days of uptime -- pretty good. Support is the way to go ;-)
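For anyone checking the arithmetic, a quick sketch converting the crash log's `Runtime:` value into days:

```python
# Convert the "Runtime:" line from the crash log (seconds) into days.
runtime_seconds = 4890866.005744

SECONDS_PER_DAY = 60 * 60 * 24  # 86400

days = runtime_seconds / SECONDS_PER_DAY
print(f"{days:.1f} days")  # → 56.6 days
```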


chrisyounger
SplunkTrust

Hi @ankurborah

This looks to me like a Splunk indexer that crashed because it ran out of threads: it had 32,492 active threads, which is far more than normal. You really should raise a support ticket with Splunk, because anything that crashes splunkd is certainly a bug.

If you want to try to resolve it yourself, I would start by thinking about why your environment could have so many threads open:

  • Is a huge number of connections being held open by slow endpoints?
  • Have you reached the scaling limits of this server?
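To watch for this before it crashes again, you can track the thread count of the splunkd process against the kernel's ceiling. A minimal sketch, Linux-only and assuming the usual `/proc` layout (the helper names here are my own, not a Splunk API):

```python
import os

def thread_count(pid: int) -> int:
    """Number of threads in a process, parsed from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("Threads:"):
                return int(line.split()[1])
    raise RuntimeError(f"no Threads: line found for PID {pid}")

def kernel_threads_max() -> int:
    """System-wide thread ceiling from /proc/sys/kernel/threads-max."""
    with open("/proc/sys/kernel/threads-max") as f:
        return int(f.read())

if __name__ == "__main__":
    pid = os.getpid()  # substitute splunkd's PID here
    print(f"PID {pid}: {thread_count(pid)} threads "
          f"(kernel limit: {kernel_threads_max()})")
```

Alerting when splunkd climbs into the tens of thousands of threads would give you warning well before `pthread_create` starts returning "Resource temporarily unavailable". Also check the per-user limit (`ulimit -u`), since that can be hit long before the kernel-wide ceiling.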

Also consider upgrading to the latest Splunk version.

All the best
