splunkd crashes daily after update from 6.2.3 to 6.2.4 (/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion _valid failed)

oHable
Explorer

Splunk-Version: 6.2.4
Splunk-Build: 271043
OS: Red Hat Enterprise Linux Server release 5.11 (Tikanga)

Since upgrading from 6.2.3 to 6.2.4, my Splunk installation crashes once or twice a day.
The information in the crash log is always the same:


[build 271043] 2015-08-06 15:25:33
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 577 running under UID 0.
Crashing thread: MainTailingThread
Registers:
RIP: [0x0000003C3982FFC5] gsignal + 53 (/lib64/libc.so.6)
RDI: [0x0000000000000241]
RSI: [0x00000000000002B3]
RBP: [0x00002B2B8B006940]
RSP: [0x00002B2B8B0054F8]
RAX: [0x0000000000000000]
RBX: [0x00002B2B8B0055A0]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x0000000000000080]
R9: [0x0101010101010101]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x00007FFF13CDB939]
R13: [0x00000000015B46B5]
R14: [0x0000000000000078]
R15: [0x0000000001612A40]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace:
[0x0000003C3982FFC5] gsignal + 53 (/lib64/libc.so.6)
[0x0000003C39831A70] abort + 272 (/lib64/libc.so.6)
[0x0000003C39829466] __assert_fail + 246 (/lib64/libc.so.6)
[0x000000000099145A] ? (splunkd)
[0x000000000098D582] _ZNK11TailWatcher12setupConfigsER15WatchedTailFile + 1474 (splunkd)
[0x000000000098D692] _ZNK11TailWatcher19initializeFileStateER15WatchedTailFileRK8Pathname + 66 (splunkd)
[0x00000000009904B2] _ZN11TailWatcher11fileChangedEP16WatchedFileStateRK7Timeval + 242 (splunkd)
[0x0000000000EC2602] _ZN30FilesystemChangeInternalWorker15callFileChangedER7TimevalP16WatchedFileState + 114 (splunkd)
[0x0000000000EC3F90] _ZN30FilesystemChangeInternalWorker12when_expiredERy + 464 (splunkd)
[0x0000000000F53B2D] _ZN11TimeoutHeap18runExpiredTimeoutsER7Timeval + 301 (splunkd)
[0x0000000000EBD818] _ZN9EventLoop3runEv + 744 (splunkd)
[0x000000000098E9ED] _ZN11TailWatcher3runEv + 141 (splunkd)
[0x000000000099428A] _ZN13TailingThread4mainEv + 154 (splunkd)
[0x0000000000F5165E] _ZN6Thread8callMainEPv + 62 (splunkd)
[0x0000003C3A40683D] ? (/lib64/libpthread.so.0)
[0x0000003C398D4FCD] clone + 109 (/lib64/libc.so.6)
Linux / <hostname> / 2.6.18-406.el5 / #1 SMP Fri May 1 10:37:57 EDT 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2015-08-05 08:30:15.473 +0200 splunkd started (build 271043)
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion _valid failed.
2015-08-06 09:03:22.210 +0200 splunkd started (build 271043)
2015-08-06 09:10:29.839 +0200 Interrupt signal received
2015-08-06 09:10:59.105 +0200 splunkd started (build 271043)
Conf mutator lockfile has disappeared; error condition possible.
2015-08-06 09:25:23.426 +0200 splunkd started (build 271043)
2015-08-06 10:15:03.331 +0200 Interrupt signal received
2015-08-06 10:17:19.343 +0200 splunkd started (build 271043)
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion _valid failed.
/etc/redhat-release: Red Hat Enterprise Linux Server release 5.11 (Tikanga)
glibc version: 2.5
glibc release: stable
Last errno: 2
Threads running: 50
argv: [splunkd -p 8089 start]
Thread: "MainTailingThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x2b2b7e450150:
00000000 40 69 00 8b 2b 2b 00 00 |@i..++..|
00000008
First 512 bytes of Timeout object @0x2b2b8b005cf8:
00000000 10 f2 6d 01 00 00 00 00 00 00 00 00 00 00 00 00 |..m.............|
00000010 a8 5a 00 8b 2b 2b 00 00 00 00 00 00 00 00 00 00 |.Z..++..........|
00000020 00 00 00 00 00 00 00 00 4d 60 c3 55 00 00 00 00 |........M`.U....|
00000030 53 3b 04 00 00 00 00 00 00 00 00 00 00 00 00 00 |S;..............|
00000040 f0 5c 00 8b 2b 2b 00 00 90 5e 00 8b 2b 2b 00 00 |.\..++...^..++..|
00000050 01 00 00 00 01 00 00 00 80 74 8d 8b 2b 2b 00 00 |.........t..++..|
00000060 80 b5 45 7f 2b 2b 00 00 c0 b6 45 7f 2b 2b 00 00 |..E.++....E.++..|
00000070 40 45 4c 8d 2b 2b 00 00 70 5d 00 8b 2b 2b 00 00 |@EL.++..p]..++..|
00000080 70 5d 00 8b 2b 2b 00 00 80 5d 00 8b 2b 2b 00 00 |p]..++...]..++..|
00000090 80 5d 00 8b 2b 2b 00 00 90 5d 00 8b 2b 2b 00 00 |.]..++...]..++..|
000000a0 90 5d 00 8b 2b 2b 00 00 a0 5d 00 8b 2b 2b 00 00 |.]..++...]..++..|
000000b0 a0 5d 00 8b 2b 2b 00 00 00 00 00 00 00 00 00 00 |.]..++..........|
000000c0 00 80 8a 8b 2b 2b 00 00 14 10 00 00 00 00 00 00 |....++..........|
000000d0 9f 0b 60 71 00 00 00 00 78 49 61 01 00 00 00 00 |..`q....xIa.....|
000000e0 58 10 45 7f 2b 2b 00 00 00 00 00 00 00 00 00 00 |X.E.++..........|
000000f0 00 00 00 00 00 00 00 00 d0 49 88 8b 2b 2b 00 00 |.........I..++..|
00000100 f0 4a 88 8b 2b 2b 00 00 80 4e 88 8b 2b 2b 00 00 |.J..++...N..++..|
00000110 15 00 00 00 00 00 00 00 00 22 45 7f 2b 2b 00 00 |........."E.++..|
00000120 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |d...............|
00000130 00 00 00 00 00 00 00 00 40 d9 5b 7f 2b 2b 00 00 |........@.[.++..|
00000140 f0 64 5a 7f 2b 2b 00 00 c0 da 5b 7f 2b 2b 00 00 |.dZ.++....[.++..|
00000150 03 00 00 00 00 00 00 00 80 20 46 7f 2b 2b 00 00 |......... F.++..|
00000160 c0 00 82 87 2b 2b 00 00 00 00 00 00 00 00 00 00 |....++..........|
00000170 38 c8 33 7e 2b 2b 00 00 00 22 45 7f 2b 2b 00 00 |8.3~++..."E.++..|
00000180 80 70 46 7f 2b 2b 00 00 f8 70 46 7f 2b 2b 00 00 |.pF.++...pF.++..|
00000190 00 71 46 7f 2b 2b 00 00 5d 01 00 00 80 00 00 00 |.qF.++..].......|
000001a0 80 5c 8d 8b 2b 2b 00 00 00 00 00 00 00 00 00 00 |.\..++..........|
000001b0 f8 5c 00 8b 2b 2b 00 00 00 00 00 00 00 00 00 00 |.\..++..........|
000001c0 3f 00 00 00 00 00 00 00 c0 40 46 7f 2b 2b 00 00 |?........@F.++..|
000001d0 10 00 00 00 aa aa aa aa 00 00 00 00 00 00 00 00 |................|
000001e0 00 00 00 00 2b 2b 00 00 d0 5b 00 8b 2b 2b 00 00 |....++...[..++..|
000001f0 40 69 00 8b 2b 2b 00 00 00 00 00 00 00 00 00 00 |@i..++..........|
00000200
FilesystemChangeWatcher: _timeoutActive=Y, _throttled=N, _waitingForNotifyCount=1
EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=Y
WatchedTailFile-WatchedFileState: path="/tmp/sh-thd-1438868516", flags=0x24023
First 144 bytes of PathnameStat @0x2b2b8d4c4588:
00000000 65 61 72 63 68 2c 20 69 6e 66 6f 3d 67 72 61 6e |earch, info=gran|
00000010 74 65 64 20 52 45 53 54 3a 20 2f 73 65 61 72 63 |ted REST: /searc|
00000020 68 2f 6a 6f 62 73 2f 64 75 6d 6d 79 5f 5f 64 75 |h/jobs/dummy__du|
00000030 6d 6d 79 5f 5f 75 62 65 72 41 67 65 6e 74 5f 5f |mmy__uberAgent__|
00000040 52 4d 44 35 36 38 63 37 38 35 35 32 38 37 32 33 |RMD568c785528723|
00000050 38 38 33 63 5f 31 34 33 38 38 36 36 38 39 39 2e |883c_1438866899.|
00000060 31 33 33 39 2f 72 65 73 75 6c 74 73 5f 70 72 65 |1339/results_pre|
00000070 76 69 65 77 5d 5b 6e 2f 61 5d 0a 30 38 2d 30 36 |view][n/a].08-06|
00000080 2d 32 30 31 35 20 31 35 3a 31 35 3a 30 30 2e 34 |-2015 15:15:00.4|
00000090
FilesystemChangeWatcher: _timeoutActive=Y, _throttled=N, _waitingForNotifyCount=1
EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=Y
Timeout: _when = 2319407908278775866.5641075399597371435, _initialMsec = 7594323794225811270
file-in: _initialized=Y, _lastCharWasNewline=N, _lastReadHadNulls=N, _wasCrcConflict=N, _warned=N
_nullsWarned=N, _wasTooNew=N, _exists=N, _noDebug=N
_hadExplicitSource=N, _crossedInitCrcLenBoundary=N, _classifiedAtLeastOnce=N, _fileReplaced=N, _readPathAfterRealEOF=N
_onlyNotifiedOnce=Y, _isArchive=N, _isCached=111213, _unowned=N, _deleteOnEOF=N
_overrideDeleteOnEOF=N, _doNotDeleteChildren=N, _readFromEnd=N, _readIrregardless=N
_fileCheckMethod=0, _crcSalt=, _origPath=
_bytesRead=0, _storingBytesRead=0, _initCrc=0x0, _seekCrc=0x0
_filenameCrc=0x911e7b52647fe4ca, _fallbackCrc=0x0, _lastEOFTime=, _modTime=
_eofSeconds=3, _ignoreThresh=, _initCrcBytes=256, _initCrcForBatch=0x0
_pendingMetadata=
_prevFd=-1, _pdModels=[0 PDs]
_rescheduleDelay=1000, _rescheduleTarget=, _name=/tmp/sh-thd-1438868516, _statusName=
_st=[dev=2682230188413166899, ino=2679696914161545266, mode=6211242063, size=2679696914161545266, mtime=2679696914161545266, owner=774910258, group=841889841]
_toStringPrefix=state=0x0x2b2b8d4c4500, _backoff=0
_stdataInputHeaderProcessing=[]
_detectTrailingNulls=N, _detectReadingFromOffSet=N, _readAndSkipHeader=N, _uniqueId=2952
_rawPath=
x86 CPUID registers:
0: 0000000B 756E6547 6C65746E 49656E69
1: 00020651 00010800 02982203 0FABFBFF
2: 76036301 00F0B2FF 00000000 00CA0000
3: 00000000 00000000 00000000 00000000
4: 00000121 01C0003F 0000003F 00000000
5: 00000000 00000000 00000000 00000000
6: 00000077 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000001 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000001 00000100 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 20202020 6E492020 286C6574 58202952
80000003: 286E6F65 43202952 45205550 36322D35
80000004: 76203739 20402032 30372E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003024 00000000 00000000 00000000

terminating...

Any ideas, hints, or tips?

Sincerely, Oliver


MuS
SplunkTrust

oHable
Explorer

Hi,

Thanks for that information 🙂

Sincerely, Oliver
