Monitoring Splunk

How to troubleshoot why the search server process went down

New Member

_ZN35DistributedBundleReplicationManager18triggerReplicationERKSt3mapI14SchemeHostPort3StrSt4lessIS1_ESaISt4pairIKS1_S2_EEE23BundleReplicationReason8Interval + 89 (splunkd + 0x1D43279)
[0x0000560BEABC7BC6] _ZN22DistributedPeerManager18triggerReplicationE23BundleReplicationReason8Interval + 614 (splunkd + 0x1D54BC6)
[0x0000560BE976D971] _ZN22BundleReplicatorThread4mainEv + 1041 (splunkd + 0x8FA971)
[0x0000560BEA29529F] _ZN6Thread8callMainEPv + 111 (splunkd + 0x142229F)
[0x00007F063BB77E25] ? (libpthread.so.0 + 0x7E25)
[0x00007F063B8A534D] clone + 109 (libc.so.6 + 0xF834D)
Linux / SPLK
SR01 / 3.10.0-693.el7.x86_64 / #1 SMP Tue Aug 22 21:09:27 UTC 2017 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2019-06-24 20:45:45.258 +0900 splunkd started (build 8f009a3f5353)
2019-06-25 12:10:02.614 +0900 Interrupt signal received
2019-06-25 12:10:20.229 +0900 splunkd started (build 8f009a3f5353)
2019-06-25 18:29:36.012 +0900 Interrupt signal received
2019-06-25 18:29:54.622 +0900 splunkd started (build 8f009a3f5353)
2019-09-09 14:28:39.245 +0900 Interrupt signal received
2019-09-09 14:28:57.862 +0900 splunkd started (build 8f009a3f5353)
2019-09-09 16:45:40.689 +0900 Interrupt signal received
2019-09-09 16:45:58.322 +0900 splunkd started (build 8f009a3f5353)
terminate called after throwing an instance of 'ThreadException'
what(): BundleReplicatorThread: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 32459 threads active
/etc/redhat-release: CentOS Linux release 7.4.1708 (Core)
glibc version: 2.17
glibc release: stable
Last errno: 12
Threads running: 32459
Runtime: 23463.506503s
argv: [splunkd -p 8089 restart]
Regex JIT enabled
using CLOCK_MONOTONIC
Thread: "BundleReplicatorThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f060ba4a6d0:
00000000 00 f7 1f 0b 06 7f 00 00 |........|
00000008
x86 CPUID registers:
0: 00000016 756E6547 6C65746E 49656E69
1: 00050654 3E200800 7FFEFBFF BFEBFBFF
2: 76036301 00F0B5FF 00000000 00C30000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000040 00000040 00000003 00002020
6: 00000077 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300404 00000000 00000000 00000603
B: 00000000 00000000 000000AD 0000003E
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
E: 00000000 00000000 00000000 00000000
F: 00000000 00000000 00000000 00000000
10: 00000000 00000000 00000000 00000000
11: 00000000 00000000 00000000 00000000
12: 00000000 00000000 00000000 00000000
13: 00000000 00000000 00000000 00000000
14: 00000000 00000000 00000000 00000000
15: 00000002 000000A8 00000000 00000000
16: 00000834 00000E74 00000064 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000121 2C100800
80000002: 65746E49 2952286C 6F655820 2952286E
80000003: 6C6F4720 31362064 43203033 40205550
80000004: 312E3220 7A484730 00000000 00000000
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 0000302E 00000000 00000000 00000000
terminating...

When the server went down, the above was found in the internal logs.
Can you tell me why the server went down?


SplunkTrust

The stderr output already points at the cause: splunkd aborted because BundleReplicatorThread could not spawn a new thread (pthread_create failed with "Resource temporarily unavailable" while 32459 threads were active). That usually means a thread leak inside splunkd or an OS per-user/system thread limit being hit. Open a support request so Splunk Support can identify the leaking component from the full crash log.
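In the meantime, you can check whether the host's limits are consistent with ~32k threads. A minimal sketch using standard Linux interfaces (nothing Splunk-specific; run as the user that starts splunkd):

```shell
# Per-user limit on processes/threads for the current user
ulimit -u

# System-wide ceiling on the total number of threads
cat /proc/sys/kernel/threads-max

# Threads currently used by a process; $$ (this shell) is just a
# placeholder -- substitute splunkd's PID, e.g. pid=$(pgrep -o splunkd)
pid=$$
ls "/proc/$pid/task" | wc -l
```

If the per-user limit (or threads-max) is near 32459, the crash is the limit being reached; if the limits are far higher, splunkd itself is leaking threads and the thread count will climb steadily over time.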

---
If this reply helps you, an upvote would be appreciated.