Hey all,
We have seen splunkd crash like this on two servers over the past few days. Is there anything in the crash log that would help identify a root cause? I have browsed through the log, but nothing jumps out at me.
See the log below.
Thx,
JB
[splunk@rtlvpxawsb splunk]$ cat crash-2016-01-31-00\:10\:01.log
[build 271043] 2016-01-31 00:10:01
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 10897 running under UID 49436.
Crashing thread: MainTailingThread
Registers:
RIP: [0x0000003E7CA32625] gsignal + 53 (/lib64/libc.so.6)
RDI: [0x0000000000002A91]
RSI: [0x0000000000002AB0]
RBP: [0x0000000001612A40]
RSP: [0x00007FD2AA5F9278]
RAX: [0x0000000000000000]
RBX: [0x00007FD2B3ECB000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0xFEFEFEFEFEFEFEFF]
R9: [0x00007FD2B3F61F60]
R10: [0x0000000000000008]
R11: [0x0000000000000202]
R12: [0x00000000015B46B5]
R13: [0x0000000001614660]
R14: [0x00007FD2AA5F9B30]
R15: [0x00007FD2A98A42C0]
EFL: [0x0000000000000202]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace:
[0x0000003E7CA32625] gsignal + 53 (/lib64/libc.so.6)
[0x0000003E7CA33E05] abort + 373 (/lib64/libc.so.6)
[0x0000003E7CA2B74E] ? (/lib64/libc.so.6)
[0x0000003E7CA2B810] __assert_perror_fail + 0 (/lib64/libc.so.6)
[0x000000000099145A] ? (splunkd)
[0x000000000098D582] _ZNK11TailWatcher12setupConfigsER15WatchedTailFile + 1474 (splunkd)
[0x000000000098D692] _ZNK11TailWatcher19initializeFileStateER15WatchedTailFileRK8Pathname + 66 (splunkd)
[0x00000000009904B2] _ZN11TailWatcher11fileChangedEP16WatchedFileStateRK7Timeval + 242 (splunkd)
[0x0000000000EC2602] _ZN30FilesystemChangeInternalWorker15callFileChangedER7TimevalP16WatchedFileState + 114 (splunkd)
[0x0000000000EC3F90] _ZN30FilesystemChangeInternalWorker12when_expiredERy + 464 (splunkd)
[0x0000000000F53B2D] _ZN11TimeoutHeap18runExpiredTimeoutsER7Timeval + 301 (splunkd)
[0x0000000000EBD818] _ZN9EventLoop3runEv + 744 (splunkd)
[0x000000000098E9ED] _ZN11TailWatcher3runEv + 141 (splunkd)
[0x000000000099428A] _ZN13TailingThread4mainEv + 154 (splunkd)
[0x0000000000F5165E] _ZN6Thread8callMainEPv + 62 (splunkd)
[0x0000003E7CE079D1] ? (/lib64/libpthread.so.0)
[0x0000003E7CAE88FD] clone + 109 (/lib64/libc.so.6)
Linux / rtlvpxawsb.labcorp.com / 2.6.32-504.16.2.el6.x86_64 / #1 SMP Tue Mar 10 17:01:00 EDT 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2015-10-14 16:39:28.077 -0400 splunkd started (build 271043)
Conf mutator lockfile has disappeared; error condition possible.
2015-10-15 15:07:01.431 -0400 splunkd started (build 271043)
Conf mutator lockfile has disappeared; error condition possible.
2015-10-15 16:25:07.121 -0400 splunkd started (build 271043)
Conf mutator lockfile has disappeared; error condition possible.
2015-10-29 14:25:14.432 -0400 splunkd started (build 271043)
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion `_valid' failed.
2015-12-15 08:16:03.012 -0500 splunkd started (build 271043)
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion `_valid' failed.
/etc/redhat-release: Red Hat Enterprise Linux Server release 6.6 (Santiago)
glibc version: 2.12
glibc release: stable
Last errno: 2
Threads running: 30
argv: [splunkd -p 8089 start]
Thread: "MainTailingThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7fd2b1c6d150:
00000000 00 a7 5f aa d2 7f 00 00 |.._.....|
00000008
First 512 bytes of Timeout object @0x7fd2aa5f9a88:
00000000 10 f2 6d 01 00 00 00 00 00 00 00 00 00 00 00 00 |..m.............|
00000010 38 98 5f aa d2 7f 00 00 00 00 00 00 00 00 00 00 |8._.............|
00000020 00 00 00 00 00 00 00 00 29 97 ad 56 00 00 00 00 |........)..V....|
00000030 3e 94 0d 00 00 00 00 00 00 00 00 00 00 00 00 00 |>...............|
00000040 80 9a 5f aa d2 7f 00 00 20 9c 5f aa d2 7f 00 00 |.._..... ._.....|
00000050 01 00 00 00 01 00 00 00 c0 a7 14 a9 d2 7f 00 00 |................|
00000060 80 d6 1c b0 d2 7f 00 00 c0 28 10 a9 d2 7f 00 00 |.........(......|
00000070 00 43 8a a9 d2 7f 00 00 00 9b 5f aa d2 7f 00 00 |.C........_.....|
00000080 00 9b 5f aa d2 7f 00 00 10 9b 5f aa d2 7f 00 00 |.._......._.....|
00000090 10 9b 5f aa d2 7f 00 00 20 9b 5f aa d2 7f 00 00 |.._..... ._.....|
000000a0 20 9b 5f aa d2 7f 00 00 c0 41 8a a9 d2 7f 00 00 | ._......A......|
000000b0 40 3f 8a a9 d2 7f 00 00 00 00 00 00 00 00 00 00 |@?..............|
000000c0 00 e0 0d ab d2 7f 00 00 14 10 00 00 00 00 00 00 |................|
000000d0 51 f3 4b 55 00 00 00 00 78 49 61 01 00 00 00 00 |Q.KU....xIa.....|
000000e0 d8 f6 16 b0 d2 7f 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 00 00 00 00 00 00 00 00 80 4b 1f ab d2 7f 00 00 |.........K......|
00000100 10 4c 1f ab d2 7f 00 00 c0 4d 1f ab d2 7f 00 00 |.L.......M......|
00000110 0c 00 00 00 00 00 00 00 00 e4 1e b0 d2 7f 00 00 |................|
00000120 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |d...............|
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000140 b8 9b 5f aa d2 7f 00 00 b8 9b 5f aa d2 7f 00 00 |.._......._.....|
00000150 00 00 00 00 00 00 00 00 80 63 1a b0 d2 7f 00 00 |.........c......|
00000160 e0 8d 82 af d2 7f 00 00 00 00 00 00 00 00 00 00 |................|
00000170 18 2e 45 b2 d2 7f 00 00 00 e4 1e b0 d2 7f 00 00 |..E.............|
00000180 00 48 0c ab d2 7f 00 00 30 48 0c ab d2 7f 00 00 |.H......0H......|
00000190 40 48 0c ab d2 7f 00 00 26 00 00 00 10 00 00 00 |@H......&.......|
000001a0 80 a7 14 a9 d2 7f 00 00 00 00 00 00 00 00 00 00 |................|
000001b0 88 9a 5f aa d2 7f 00 00 00 00 00 00 00 00 00 00 |.._.............|
000001c0 25 00 00 00 d2 7f 00 00 40 6b 2e b0 d2 7f 00 00 |%.......@k......|
000001d0 10 00 00 00 aa aa aa aa 00 00 00 00 00 00 00 00 |................|
000001e0 00 00 00 00 d2 7f 00 00 60 99 5f aa d2 7f 00 00 |........`._.....|
000001f0 00 a7 5f aa d2 7f 00 00 00 00 00 00 00 00 00 00 |.._.............|
00000200
FilesystemChangeWatcher: _timeoutActive=Y, _throttled=N, _waitingForNotifyCount=1
EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=N
WatchedTailFile-WatchedFileState: path="/etc/httpd/logs/stage-phoenix.labcorp.com-ssl-request.log.5", flags=0x24023
First 144 bytes of PathnameStat @0x7fd2a98a4348:
00000000 30 2c 20 73 6f 75 72 63 65 50 6f 72 74 3d 38 30 |0, sourcePort=80|
00000010 38 39 2c 20 64 65 73 74 49 70 3d 31 30 2e 31 31 |89, destIp=10.11|
00000020 31 2e 31 2e 31 39 35 2c 20 64 65 73 74 50 6f 72 |1.1.195, destPor|
00000030 74 3d 39 39 39 37 2c 20 5f 74 63 70 5f 42 70 73 |t=9997, _tcp_Bps|
00000040 3d 32 34 31 39 2e 34 37 2c 20 5f 74 63 70 5f 4b |=2419.47, _tcp_K|
00000050 42 70 73 3d 32 2e 33 36 2c 20 5f 74 63 70 5f 61 |Bps=2.36, _tcp_a|
00000060 76 67 5f 74 68 72 75 70 75 74 3d 32 2e 33 36 2c |vg_thruput=2.36,|
00000070 20 5f 74 63 70 5f 4b 70 72 6f 63 65 73 73 65 64 | _tcp_Kprocessed|
00000080 3d 37 31 2c 20 5f 74 63 70 5f 65 70 73 3d 31 2e |=71, _tcp_eps=1.|
00000090
FilesystemChangeWatcher: _timeoutActive=Y, _throttled=N, _waitingForNotifyCount=1
EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=N
Timeout: _when = 2321382613982983482.5641075399597568045, _initialMsec = 8247328199548096326
file-in: _initialized=Y, _lastCharWasNewline=N, _lastReadHadNulls=N, _wasCrcConflict=N, _warned=N
_nullsWarned=N, _wasTooNew=N, _exists=N, _noDebug=N
_hadExplicitSource=N, _crossedInitCrcLenBoundary=N, _classifiedAtLeastOnce=N, _fileReplaced=N, _readPathAfterRealEOF=N
_onlyNotifiedOnce=Y, _isArchive=N, _isCached=111213, _unowned=N, _deleteOnEOF=N
_overrideDeleteOnEOF=N, _doNotDeleteChildren=N, _readFromEnd=N, _readIrregardless=N
_fileCheckMethod=0, _crcSalt=<null>, _origPath=<null>
_bytesRead=0, _storingBytesRead=0, _initCrc=0x0, _seekCrc=0x0
_filenameCrc=0x16ab246dab3357c1, _fallbackCrc=0x0, _lastEOFTime=<zero>, _modTime=<zero>
_eofSeconds=3, _ignoreThresh=<zero>, _initCrcBytes=256, _initCrcForBatch=0x0
_pendingMetadata=<null>
_prevFd=-1, _pdModels=[0 PDs]
_rescheduleDelay=1000, _rescheduleTarget=<zero>, _name=/etc/httpd/logs/stage-phoenix.labcorp.com-ssl-request.log.5, _statusName=
_st=[dev=64773, ino=36, mode=100644, size=7204328, mtime=1453796447, owner=0, group=3000]
_toStringPrefix=state=0x0x7fd2a98a42c0, _backoff=0
_stdataInputHeaderProcessing=[]
_detectTrailingNulls=N, _detectReadingFromOffSet=N, _readAndSkipHeader=N, _uniqueId=439908
_rawPath=
x86 CPUID registers:
0: 0000000D 756E6547 6C65746E 49656E69
1: 000206D7 02010800 9E982203 0FABFBFF
2: 76035A01 00F0B2FF 00000000 00CA0000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000077 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000001 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000000 000000FD 00000002
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 20202020 49202020 6C65746E 20295228
80000003: 6E6F6558 20295228 20555043 342D3545
80000004: 20303436 20402030 30342E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003028 00000000 00000000 00000000
terminating...
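
In case it helps anyone comparing crash logs across the two servers, here is a minimal Python sketch that pulls out the crashing thread, any assertion messages, and the watched file path. The regular expressions and the function name `summarize_crash_log` are assumptions based only on the single log above; they are not part of Splunk. Note that the log itself warns the stderr lines "could be old", so an assertion found this way may not belong to this particular crash.

```python
import re

# Patterns are assumptions derived from the one crash log in this post.
THREAD_RE = re.compile(r"^Crashing thread: (?P<thread>\S+)$")
ASSERT_RE = re.compile(
    r"^splunkd: (?P<file>\S+):(?P<line>\d+): (?P<func>.+): "
    r"Assertion `(?P<expr>[^']+)' failed\.$"
)
PATH_RE = re.compile(r'WatchedFileState: path="(?P<path>[^"]*)"')


def summarize_crash_log(text):
    """Return the crashing thread, unique assertions, and watched paths."""
    summary = {"thread": None, "assertions": [], "paths": []}
    for raw in text.splitlines():
        line = raw.strip()
        m = THREAD_RE.match(line)
        if m:
            summary["thread"] = m.group("thread")
            continue
        m = ASSERT_RE.match(line)
        if m:
            entry = (m.group("file"), int(m.group("line")), m.group("expr"))
            # The same assertion can repeat in stderr; keep one copy.
            if entry not in summary["assertions"]:
                summary["assertions"].append(entry)
            continue
        m = PATH_RE.search(line)
        if m and m.group("path") not in summary["paths"]:
            summary["paths"].append(m.group("path"))
    return summary


# A few lines copied from the crash log above.
sample = """\
Crashing thread: MainTailingThread
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion `_valid' failed.
splunkd: /home/build/build-src/6.2.4/src/pipeline/input/Tailing.h:120: bool StatWrap::isDir() const: Assertion `_valid' failed.
WatchedTailFile-WatchedFileState: path="/etc/httpd/logs/stage-phoenix.labcorp.com-ssl-request.log.5", flags=0x24023
"""

if __name__ == "__main__":
    print(summarize_crash_log(sample))
```

Running the same function over the crash logs from both servers would show whether they share the same crashing thread, assertion, and watched path.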