Why does the Splunk Universal Forwarder 6.3.0 keep crashing on a Linux x86_64 server?

Explorer

The Splunk Universal Forwarder agent keeps crashing. The agent version is 6.3.0 and the server is Linux x86_64.

Updated crash log:

[splunk@ftdcslsapp638 splunk]$ cat crash-2016-10-12-11:52:08.log
[build aa7d4b1ccb80] 2016-10-12 11:52:08
Received fatal signal 6 (Aborted).
 Cause:
   Signal sent by PID 5484 running under UID 5043.
 Crashing thread: tailreader0
 Registers:
    RIP:  [0x0000003B25A325E5] gsignal + 53 (/lib64/libc.so.6)
    RDI:  [0x000000000000156C]
    RSI:  [0x000000000000159E]
    RBP:  [0x000000000187DC28]
    RSP:  [0x00007F72847FE868]
    RAX:  [0x0000000000000000]
    RBX:  [0x00007F726E3FF000]
    RCX:  [0xFFFFFFFFFFFFFFFF]
    RDX:  [0x0000000000000006]
    R8:  [0xFEFEFEFEFEFEFEFF]
    R9:  [0x00007F7290F49F60]
    R10:  [0x0000000000000008]
    R11:  [0x0000000000000206]
    R12:  [0x0000000001813019]
    R13:  [0x000000000187EFE0]
    R14:  [0x00007F7288BD79B8]
    R15:  [0x00007F72847FEDE8]
    EFL:  [0x0000000000000206]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0x0000000000000033]
    OLDMASK:  [0x0000000000000000]

OS: Linux
Arch: x86-64

 Backtrace:
  [0x0000003B25A325E5] gsignal + 53 (/lib64/libc.so.6)
  [0x0000003B25A33DC5] abort + 373 (/lib64/libc.so.6)
  [0x0000003B25A2B70E] ? (/lib64/libc.so.6)
  [0x0000003B25A2B7D0] __assert_perror_fail + 0 (/lib64/libc.so.6)
  [0x0000000000A503EA] ? (splunkd)
  [0x0000000000A4E6C3] _ZNK11TailWatcher12setupConfigsER15WatchedTailFile + 1507 (splunkd)
  [0x0000000000A4E7D2] _ZNK11TailWatcher19initializeFileStateER15WatchedTailFileRK8Pathname + 66 (splunkd)
  [0x0000000000A679F5] _ZN10TailReader10handleFileEP15WatchedTailFileP11TailWatcher + 69 (splunkd)
  [0x0000000000A6A2DA] _ZN12ReaderThread4mainEv + 378 (splunkd)
  [0x000000000109F0EE] _ZN6Thread8callMainEPv + 62 (splunkd)
  [0x0000003B25E07AA1] ? (/lib64/libpthread.so.0)
  [0x0000003B25AE8AAD] clone + 109 (/lib64/libc.so.6)
 Linux / ftdcslsapp638.ftiz.cummins.com / 2.6.32-642.1.1.el6.x86_64 / #1 SMP Fri May 6 14:54:05 EDT 2016 / x86_64
 Last few lines of stderr (may contain info on assertion failure, but also could be old):
    2016-10-11 16:01:49.002 -0500 splunkd started (build aa7d4b1ccb80)
    splunkd: /home/build/build-src/ember/src/pipeline/input/Tailing.h:178: bool StatWrap::isDir() const: Assertion `_valid' failed.
    2016-10-12 10:18:26.775 -0500 splunkd started (build aa7d4b1ccb80)
    2016-10-12 10:20:58.906 -0500 Interrupt signal received
    2016-10-12 10:21:16.753 -0500 splunkd started (build aa7d4b1ccb80)
    2016-10-12 10:25:49.955 -0500 Interrupt signal received
    2016-10-12 10:26:03.211 -0500 splunkd started (build aa7d4b1ccb80)
    2016-10-12 10:37:14.366 -0500 Interrupt signal received
    2016-10-12 10:37:31.114 -0500 splunkd started (build aa7d4b1ccb80)
    splunkd: /home/build/build-src/ember/src/pipeline/input/Tailing.h:178: bool StatWrap::isDir() const: Assertion `_valid' failed.

 /etc/redhat-release: Red Hat Enterprise Linux Server release 6.8 (Santiago)
 glibc version: 2.12
 glibc release: stable
Last errno: 2
Threads running: 35
Runtime: 4477.407956s
argv: [splunkd -p 8089 restart]
Thread: "tailreader0", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f72888ab610:
00000000  00 f7 7f 84 72 7f 00 00                           |....r...|
00000008
ReaderThread: mode=0, queueSize=89704, shutdown=N, reconfigure=N, mode=0
Reading File-WatchedTailFile-WatchedFileState: path="/opt/splunkforwarder/var/run/nmon/var/csv_repository/ftdcslsapp638_12_OCT_2016_101833_CPUnn_51942_20161012103734.nmon.csv", flags=0x24213
First 144 bytes of PathnameStat @0x7f727ab95608:
00000000  00 00 00 00 00 00 00 00  80 55 b9 7a 72 7f 00 00  |.........U.zr...|
00000010  00 00 00 00 00 00 00 00  a0 ba 1f 85 72 7f 00 00  |............r...|
00000020  00 b3 8a 88 72 7f 00 00  40 a2 f1 80 72 7f 00 00  |....r...@...r...|
00000030  00 00 00 00 00 00 00 00  30 a0 97 01 00 00 00 00  |........0.......|
00000040  00 57 b9 7a 72 7f 00 00  28 b8 1f 85 72 7f 00 00  |.W.zr...(...r...|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  77 5b fe 57 00 00 00 00  0f aa 09 00 00 00 00 00  |w[.W............|
00000070  00 00 00 00 00 00 00 00  50 bd 96 01 00 00 00 00  |........P.......|
00000080  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000090
FilesystemChangeWatcher: _timeoutActive=N, _throttled=N, _waitingForNotifyCount=89705
  EMPTY Q: waitingForTimeout=N, noAction=N, stat=Y, immediateStat=Y, readdir=Y, notify=Y
  Timeout: _when = 1476287351.633359, _initialMsec = 0
file-in: _initialized=Y, _lastCharWasNewline=N, _lastReadHadNulls=N, _wasCrcConflict=N, _warned=N
         _nullsWarned=N, _wasTooNew=N, _exists=N, _noDebug=N
         _hadExplicitSource=N, _crossedInitCrcLenBoundary=N, _classifiedAtLeastOnce=N, _fileReplaced=N, _readPathAfterRealEOF=N
         _onlyNotifiedOnce=Y, _isArchive=N, _isCached=111213, _unowned=N, _deleteOnEOF=N
         _overrideDeleteOnEOF=N, _doNotDeleteChildren=N, _readFromEnd=N, _readIrregardless=N
         _fileCheckMethod=0, _crcSalt=<null>, _origPath=<null>
         _bytesRead=0, _storingBytesRead=0, _initCrc=0x0, _seekCrc=0x0
         _filenameCrc=0xc260a7a429dae89a, _fallbackCrc=0x0, _lastEOFTime=<zero>, _modTime=<zero>
         _eofSeconds=3, _ignoreThresh=<zero>, _initCrcBytes=256, _initCrcForBatch=0x0
         _pendingMetadata=<null>
         _prevFd=-1, _pdModels=[0 PDs]
         _rescheduleDelay=1000, _rescheduleTarget=<zero>, _name=/opt/splunkforwarder/var/run/nmon/var/csv_repository/ftdcslsapp638_12_OCT_2016_101833_CPUnn_51942_20161012103734.nmon.csv, _statusName=
         _st=[dev=633625, ino=0, mode=0, size=0, mtime=0, owner=0, group=1]
         _toStringPrefix=state=0x0x7f727ab95580, _backoff=0
         _stdataInputHeaderProcessing=[]

         _detectTrailingNulls=N, _detectReadingFromOffSet=N, _readAndSkipHeader=N, _uniqueId=0
  _rawPath=

x86 CPUID registers:

         0: 0000000A 756E6547 6C65746E 49656E69
         1: 000006F1 06010800 80002201 0FABFBFF
         2: 76036301 00F0B2FF 00000000 00CA0000
         3: 00000000 00000000 00000000 00000000
         4: 00000000 00000000 00000000 00000000
         5: 00000000 00000000 00000000 00000000
         6: 00000077 00000002 00000009 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000000 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07300401 0000007F 00000000 00000000
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000001 20100800
  80000002: 20202020 746E4920 52286C65 65582029
  80000003: 52286E6F 50432029 35452055 3536342D
  80000004: 76204C37 20402032 30342E32 007A4847
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 01006040 00000000
  80000007: 00000000 00000000 00000000 00000100
  80000008: 00003028 00000000 00000000 00000000
terminating...

Explorer

Check for warnings in the splunkd logs; there might be a problem with the ulimits on your Linux server. If there is a warning about ulimit values, change them to the recommended values or to unlimited.
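
A rough sketch of how that check could look from the shell. It assumes the forwarder lives under /opt/splunkforwarder (the root seen in the crash log paths) and that the log is at var/log/splunk/splunkd.log beneath it; `ulimit_lines` is a hypothetical helper name, not a Splunk command. Run it as the user that starts splunkd so the limit shown is the one the forwarder actually gets.

```shell
#!/bin/sh
# Assumed install root, based on the paths in the crash log; override if yours differs.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunkforwarder}"

# Hypothetical helper: print every line mentioning "ulimit" in the given log file.
# Prints nothing if the file is missing or has no matching lines.
ulimit_lines() {
    grep -i 'ulimit' "$1" 2>/dev/null
}

# Open-file limit that applies to the current shell (and to processes it starts).
echo "open files limit: $(ulimit -n)"

# Most recent ulimit-related messages written by splunkd, if any.
ulimit_lines "$SPLUNK_HOME/var/log/splunk/splunkd.log" | tail -n 20
```

If the open-files limit looks low compared with the number of files being monitored, that would be the value to raise before restarting the forwarder.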