Splunk Enterprise

How can I stop indexer from crashing frequently?

MohammedTaher
Engager

Hello Splunkers

I'm facing an issue with my indexer: it crashes every 1-2 hours, and sometimes it crashes as soon as 10 minutes after a restart.

Indexer specs : 
CentOS Linux 7
24 CPU RAM 
1T SSD 

Splunk Version : Splunk 8.2.1 (build ddff1c41e5cf)

 

Crash logs : 

Received fatal signal 6 (Aborted) on PID 19932.
Cause:
Signal sent by PID 19932 running under UID 1000.
Crashing thread: tailreader0
Registers:
RIP: [0x00002B87E7282277] gsignal + 55 (libc.so.6 + 0x36277)
RDI: [0x0000000000004DDC]
RSI: [0x0000000000004EF5]
RBP: [0x00002B87E73D6580]
RSP: [0x00002B880E3FF608]
RAX: [0x0000000000000000]
RBX: [0x00002B87E5F9C000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x0000000000000090]
R9: [0x00002B87E7800080]
R10: [0x0000000000000008]
R11: [0x0000000000000202]
R12: [0x000055EFC1BE60C8]
R13: [0x000055EFC1BE6098]
R14: [0x00002B87E5EAD8C8]
R15: [0x00002B880E88E930]
EFL: [0x0000000000000202]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]

OS: Linux
Arch: x86-64

Backtrace (PIC build):
[0x00002B87E7282277] gsignal + 55 (libc.so.6 + 0x36277)
[0x00002B87E7283968] abort + 328 (libc.so.6 + 0x37968)
[0x00002B87E727B096] ? (libc.so.6 + 0x2F096)
[0x00002B87E727B142] ? (libc.so.6 + 0x2F142)
[0x000055EFBF459410] ? (splunkd + 0x131A410)
[0x000055EFBFB3B102] _ZN3WTF23quickCheckForRolledFileERK8Pathname + 210 (splunkd + 0x19FC102)
[0x000055EFBFB3B947] _ZN3WTF13loadFishStateEP11PipelineSetb + 855 (splunkd + 0x19FC947)
[0x000055EFBFB300E8] _ZN10TailReader8readFileER15WatchedTailFile + 200 (splunkd + 0x19F10E8)
[0x000055EFBFB303A0] _ZN10TailReader4readEP15WatchedTailFileP11TailWatcher + 208 (splunkd + 0x19F13A0)
[0x000055EFBFB30D32] _ZN10TailReader10handleFileEP15WatchedTailFileP11TailWatcher + 514 (splunkd + 0x19F1D32)
[0x000055EFBF91F57A] _ZN12ReaderThread4mainEv + 746 (splunkd + 0x17E057A)
[0x000055EFC07F4C47] _ZN6Thread8callMainEPv + 135 (splunkd + 0x26B5C47)
[0x00002B87E7037E25] ? (libpthread.so.0 + 0x7E25)
[0x00002B87E734ABAD] clone + 109 (libc.so.6 + 0xFEBAD)
Linux / SRV-HO-SPLUNKIDX / 3.10.0-862.11.6.el7.x86_64 / #1 SMP Tue Aug 14 21:49:04 UTC 2018 / x86_64
Libc abort message: splunkd: /opt/splunk/src/pipeline/input/WatchedTailFile.cpp:249: void WTF::assertAndDump(bool, const Str&) const: Assertion `0 && "See splunkd.log for crash reason."' failed.

/etc/redhat-release: CentOS Linux release 7.5.1804 (Core)
glibc version: 2.17
glibc release: stable
Last errno: 2
Threads running: 96
Runtime: 8926.636142s
argv: [splunkd -p 8089 start]
Regex JIT enabled

RE2 regex engine enabled

using CLOCK_MONOTONIC
Thread: "tailreader0", did_join=0, ready_to_run=Y, main_thread=N, token=47863354623744
MutexByte: MutexByte-waiting={none}
ReaderThread: mode=0, queueSize=14, shutdown=N, reconfigure=N, mode=0
Reading File-WatchedTailFile-WatchedFileState: path="/opt/splunk/var/log/introspection/resource_usage.log", flags=0x10000EB, alive
First 144 bytes of PathnameStat @0x2b880e890828:
00000000 00 fd 00 00 00 00 00 00 2d 96 0e 08 00 00 00 00 |........-.......|
00000010 01 00 00 00 00 00 00 00 80 81 00 00 e8 03 00 00 |................|
00000020 e8 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 66 48 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |fH..............|
00000040 28 00 00 00 00 00 00 00 5a 49 52 64 00 00 00 00 |(.......ZIRd....|
00000050 9e 6a 2f 0b 00 00 00 00 5a 49 52 64 00 00 00 00 |.j/.....ZIRd....|
00000060 44 ae 3e 0b 00 00 00 00 5a 49 52 64 00 00 00 00 |D.>.....ZIRd....|
00000070 44 ae 3e 0b 00 00 00 00 00 00 00 00 00 00 00 00 |D.>.............|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000090
FilesystemChangeWatcher: _timeoutActive=N, _throttled=N, _waitingForNotifyCount=18
EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=Y
USING INOTIFY: wds=6, score(0xFD00)=999, hasScaledTImeouts=Y
Timeout: _when = 511211.936945614, _initialInterval = 3.000
file-in: _initialized=Y, _lastCharWasNewline=Y, _lastReadHadNulls=N, _wasCrcConflict=N, _warned=N
_nullsWarned=N, _wasTooNew=N, _exists=Y, _noDebug=N
_hadExplicitSource=N, _crossedInitCrcLenBoundary=N, _classifiedAtLeastOnce=Y, _fileReplaced=Y, _readPathAfterRealEOF=Y
_onlyNotifiedOnce=N, _isArchive=N, _isCached=343536, _unowned=N, _deleteOnEOF=N
_overrideDeleteOnEOF=N, _doNotDeleteChildren=N, _readFromEnd=N, _readIrregardless=N
_fileCheckMethod=0, _crcSalt=<null>, _origPath=<null>
_bytesRead=25000259, _storingBytesRead=0, _initCrc=0x56aefe7f2a71345b, _seekCrc=0xa8957fe5632ae3b
_filenameCrc=0x55d3f47641cff9b5, _fallbackCrc=0x0, _lastEOFTime=1683114330.495657534948, _modTime=1683114330.495656545355
_eofInterval=3.000, _ignoreThresh=0.000, _initCrcBytes=256, _initCrcForBatch=0x0
_pendingMetadata=<null>
_prevFd=331, _pdModels=[1 PD: [PD: flags=0x1540030, [_path] = "/opt/splunk/var/log/introspection/resource_usage.log", [_MetaData:Index] = "_introspection", [MetaData:Source] = "source::/opt/splunk/var/log/introspection/resource_usage.log", [MetaData:Host] = "host::SRV-HO-SPLUNKIDX", [MetaData:Sourcetype] = "sourcetype::splunk_resource_usage", [_hpn] = "_hpn", [_charSet] = "UTF-8", [_conf] = "source::/opt/splunk/var/log/introspection/resource_usage.log|host::SRV-HO-SPLUNKIDX|splunk_resource_usage|4982", [_channel] = "4982"]]
_rescheduleDelay=1.000, _rescheduleFresh=Y, _name=/opt/splunk/var/log/introspection/resource_usage.log, _statusName=
_st=[dev=64768, ino=135173677, mode=100600, size=18534, mtime=1683114330, owner=1000, group=1000]
_toStringPrefix=state=0x0x2b880e890780, _backoff=0
_stdataInputHeaderProcessing=[]

_detectTrailingNulls=N, _detectReadingFromOffSet=Y, _readAndSkipHeader=N, _uniqueId=4982
_rawPath=$SPLUNK_HOME/var/log/introspection

 

x86 CPUID registers:
0: 00000014 756E6547 6C65746E 49656E69
1: 000406F1 1E010800 FFFA3203 0F8BFBFF
2: 76036301 00F0B5FF 00000000 00C30000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000004 00000000 00000000 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000000 0000009D 0000001E
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
E: 00000000 00000000 00000000 00000000
F: 00000000 00000000 00000000 00000000
10: 00000000 00000000 00000000 00000000
11: 00000000 00000000 00000000 00000000
12: 00000000 00000000 00000000 00000000
13: 00000000 00000000 00000000 00000000
14: 00000000 00000000 00000000 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000121 2C100800
80000002: 65746E49 2952286C 6F655820 2952286E
80000003: 55504320 2D354520 30323632 20347620
80000004: 2E322040 48473031 0000007A 00000000
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 0000302B 00000000 00000000 00000000
terminating...
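
Since the specs above do not name the CPU model, the brand string can be recovered from the CPUID dump itself. The following is a minimal Python sketch, assuming the four words on each line are printed in EAX, EBX, ECX, EDX order and that leaves 80000002 through 80000004 hold the processor brand string as little-endian ASCII (the usual x86 convention). The function name `decode_brand` is only illustrative.

```python
import struct

# Leaves 0x80000002..0x80000004 copied from the crash log above.
# Assumption: each line lists EAX, EBX, ECX, EDX in that order.
BRAND_LEAVES = [
    "65746E49 2952286C 6F655820 2952286E",
    "55504320 2D354520 30323632 20347620",
    "2E322040 48473031 0000007A 00000000",
]

def decode_brand(leaves):
    """Concatenate the registers as little-endian bytes and decode as ASCII."""
    raw = b"".join(
        struct.pack("<I", int(word, 16))
        for leaf in leaves
        for word in leaf.split()
    )
    # The string is NUL-padded to 48 bytes; drop the padding.
    return raw.rstrip(b"\x00").decode("ascii").strip()

print(decode_brand(BRAND_LEAVES))
# prints: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
```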

Correlating the crash with splunkd.log:


05-03-2023 14:46:00.547 +0300 ERROR WatchedFile [20213 tailreader0] - About to assert due to: should have gotten back a record from fishbucket: state=0x0x2b880e890780 wtf=0x0x2b880e88e800 off=25000259 initcrc=0x56aefe7f2a71345b scrc=0xa8957fe5632ae3b fallbackcrc=0x0 last_eof_time=1683114330 reschedule_fresh=Y is_cached=343536 fd_valid=true exists=true last_char_newline=true on_block_boundary=false only_notified_once=false was_replaced=true eof_seconds=3 delay_done
key_until_close=false unowned=false always_read=false was_too_new=false name="/opt/splunk/var/log/introspection/resource_usage.log"
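
To make the two logs easier to compare, here is a minimal Python sketch that splits the `key=value` fields out of the ERROR line. It is only an illustration: the line is abridged to a few fields, the wrapped pieces are joined by hand, and `parse_fields` is a hypothetical helper, not a Splunk tool. The one comparison it makes uses values already present above: the recorded offset (`off=25000259`) against the file size in the crash log's `_st=[...]` line (`size=18534`).

```python
import re

# ERROR line from splunkd.log, abridged to the fields used below.
ASSERT_LINE = (
    'About to assert due to: should have gotten back a record from fishbucket: '
    'off=25000259 initcrc=0x56aefe7f2a71345b scrc=0xa8957fe5632ae3b '
    'was_replaced=true '
    'name="/opt/splunk/var/log/introspection/resource_usage.log"'
)

def parse_fields(line):
    """Return key=value pairs as a dict; quoted values lose their quotes."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}

fields = parse_fields(ASSERT_LINE)
offset = int(fields["off"])
size_on_disk = 18534  # size= from the _st=[...] line of the crash log

print(fields["name"])
print("was_replaced:", fields["was_replaced"])
print("offset larger than current file size:", offset > size_on_disk)
```

With these values the saved read offset is larger than the file currently on disk, and both logs flag the file as replaced (`_fileReplaced=Y`, `was_replaced=true`). Whether that is what triggers the assertion is not something the logs confirm.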
